One note on the Environment Variables and Configuration discussion.
My understanding is that passed ENV variables are added to the configuration in the "GlobalConfiguration.loadConfig()" method (or similar). To all the code inside Flink it then looks as if the data had been in the configuration from the start; the scripts that compute the values can simply pass them to the process without ever needing to write a file. For example, the "GlobalConfiguration.loadConfig()" method would take any ENV variable prefixed with "flink" and add it as a config key: "flink_taskmanager_memory_size=2g" would become "taskmanager.memory.size: 2g". (A rough sketch of this mapping is appended after the quoted thread below.)

On Tue, Aug 27, 2019 at 4:05 PM Xintong Song <[hidden email]> wrote:

> Thanks for the comments, Till.
>
> I've also seen your comments on the wiki page, but let's keep the
> discussion here.
>
> - Regarding 'TaskExecutorSpecifics', what do you think about naming it
> 'TaskExecutorResourceSpecifics'?
> - Regarding passing memory configurations into task executors, I'm in
> favor of doing it via environment variables rather than configuration
> files, for the following two reasons.
>   - With environment variables it is easier to keep the memory options,
>   once calculated, from being changed.
>   - I'm not sure whether we should write the configuration in startup
>   scripts. Writing changes into the configuration files when running the
>   startup scripts does not sound right to me. Or we could make a copy of
>   the configuration files per Flink cluster, make the task executor load
>   from the copy, and clean up the copy after the cluster is shut down,
>   which is complicated. (I think this is also what Stephan means in his
>   comment on the wiki page?)
> - Regarding reserving memory, I think this change should be included in
> this FLIP. A big part of the motivation of this FLIP is to unify the
> memory configuration for streaming / batch and make it easy to configure
> RocksDB memory. If we don't support memory reservation, then streaming
> jobs cannot use managed memory (neither on-heap nor off-heap), which makes
> this FLIP incomplete.
> - Regarding network memory, I think you are right. We probably don't need
> to change the network stack from using direct memory to using unsafe
> native memory. The network memory size is deterministic, cannot be
> reserved the way managed memory can, and cannot be overused. It also works
> if we simply keep using direct memory for the network stack and include it
> in the JVM max direct memory size.
>
> Thank you~
>
> Xintong Song
>
> On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann <[hidden email]> wrote:
>
> > Hi Xintong,
> >
> > thanks for addressing the comments and adding a more detailed
> > implementation plan. I have a couple of comments concerning the
> > implementation plan:
> >
> > - The name `TaskExecutorSpecifics` is not really descriptive. Choosing a
> > different name could help here.
> > - I'm not sure whether I would pass the memory configuration to the
> > TaskExecutor via environment variables. I think it would be better to
> > write it into the configuration one uses to start the TM process.
> > - If possible, I would exclude the memory reservation from this FLIP and
> > add this as part of a dedicated FLIP.
> > - If possible, then I would exclude changes to the network stack from
> > this FLIP. Maybe we can simply say that the direct memory needed by the
> > network stack is the framework direct memory requirement. Changing how
> > the memory is allocated can happen in a second step. This would keep the
> > scope of this FLIP smaller.
> > > > Cheers, > > Till > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> > > wrote: > > > > > Hi everyone, > > > > > > I just updated the FLIP document on wiki [1], with the following > changes. > > > > > > - Removed open question regarding MemorySegment allocation. As > > > discussed, we exclude this topic from the scope of this FLIP. > > > - Updated content about JVM direct memory parameter according to > > recent > > > discussions, and moved the other options to "Rejected Alternatives" > > for > > > the > > > moment. > > > - Added implementation steps. > > > > > > > > > Thank you~ > > > > > > Xintong Song > > > > > > > > > [1] > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote: > > > > > > > @Xintong: Concerning "wait for memory users before task dispose and > > > memory > > > > release": I agree, that's how it should be. Let's try it out. > > > > > > > > @Xintong @Jingsong: Concerning " JVM does not wait for GC when > > allocating > > > > direct memory buffer": There seems to be pretty elaborate logic to > free > > > > buffers when allocating new ones. See > > > > > > > > > > > > > > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > > > > > > > > @Till: Maybe. If we assume that the JVM default works (like going > with > > > > option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I > > think > > > it > > > > should be okay to set "-XX:MaxDirectMemorySize" to > > > > "off_heap_managed_memory + direct_memory" even if we use RocksDB. > That > > > is a > > > > big if, though, I honestly have no idea :D Would be good to > understand > > > > this, though, because this would affect option (2) and option (1.2). > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> > > > > wrote: > > > > > > > > > Thanks for the inputs, Jingsong. > > > > > > > > > > Let me try to summarize your points. Please correct me if I'm > wrong. > > > > > > > > > > - Memory consumers should always avoid returning memory segments > > to > > > > > memory manager while there are still un-cleaned structures / > > threads > > > > > that > > > > > may use the memory. Otherwise, it would cause serious problems > by > > > > having > > > > > multiple consumers trying to use the same memory segment. > > > > > - JVM does not wait for GC when allocating direct memory buffer. > > > > > Therefore even we set proper max direct memory size limit, we > may > > > > still > > > > > encounter direct memory oom if the GC cleaning memory slower > than > > > the > > > > > direct memory allocation. > > > > > > > > > > Am I understanding this correctly? > > > > > > > > > > Thank you~ > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee < > [hidden email] > > > > > .invalid> > > > > > wrote: > > > > > > > > > > > Hi stephan: > > > > > > > > > > > > About option 2: > > > > > > > > > > > > if additional threads not cleanly shut down before we can exit > the > > > > task: > > > > > > In the current case of memory reuse, it has freed up the memory > it > > > > > > uses. If this memory is used by other tasks and asynchronous > > threads > > > > > > of exited task may still be writing, there will be concurrent > > > security > > > > > > problems, and even lead to errors in user computing results. 
> > > > > > > > > > > > So I think this is a serious and intolerable bug, No matter what > > the > > > > > > option is, it should be avoided. > > > > > > > > > > > > About direct memory cleaned by GC: > > > > > > I don't think it is a good idea, I've encountered so many > > situations > > > > > > that it's too late for GC to cause DirectMemory OOM. Release and > > > > > > allocate DirectMemory depend on the type of user job, which is > > > > > > often beyond our control. > > > > > > > > > > > > Best, > > > > > > Jingsong Lee > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------ > > > > > > From:Stephan Ewen <[hidden email]> > > > > > > Send Time:2019年8月19日(星期一) 15:56 > > > > > > To:dev <[hidden email]> > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified Memory Configuration for > > > > > > TaskExecutors > > > > > > > > > > > > My main concern with option 2 (manually release memory) is that > > > > segfaults > > > > > > in the JVM send off all sorts of alarms on user ends. So we need > to > > > > > > guarantee that this never happens. > > > > > > > > > > > > The trickyness is in tasks that uses data structures / algorithms > > > with > > > > > > additional threads, like hash table spill/read and sorting > threads. > > > We > > > > > need > > > > > > to ensure that these cleanly shut down before we can exit the > task. > > > > > > I am not sure that we have that guaranteed already, that's why > > option > > > > 1.1 > > > > > > seemed simpler to me. > > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM Xintong Song < > > [hidden email]> > > > > > > wrote: > > > > > > > > > > > > > Thanks for the comments, Stephan. Summarized in this way really > > > makes > > > > > > > things easier to understand. > > > > > > > > > > > > > > I'm in favor of option 2, at least for the moment. I think it > is > > > not > > > > > that > > > > > > > difficult to keep it segfault safe for memory manager, as long > as > > > we > > > > > > always > > > > > > > de-allocate the memory segment when it is released from the > > memory > > > > > > > consumers. Only if the memory consumer continue using the > buffer > > of > > > > > > memory > > > > > > > segment after releasing it, in which case we do want the job to > > > fail > > > > so > > > > > > we > > > > > > > detect the memory leak early. > > > > > > > > > > > > > > For option 1.2, I don't think this is a good idea. Not only > > because > > > > the > > > > > > > assumption (regular GC is enough to clean direct buffers) may > not > > > > > always > > > > > > be > > > > > > > true, but also it makes harder for finding problems in cases of > > > > memory > > > > > > > overuse. E.g., user configured some direct memory for the user > > > > > libraries. > > > > > > > If the library actually use more direct memory then configured, > > > which > > > > > > > cannot be cleaned by GC because they are still in use, may lead > > to > > > > > > overuse > > > > > > > of the total container memory. In that case, if it didn't touch > > the > > > > JVM > > > > > > > default max direct memory limit, we cannot get a direct memory > > OOM > > > > and > > > > > it > > > > > > > will become super hard to understand which part of the > > > configuration > > > > > need > > > > > > > to be updated. > > > > > > > > > > > > > > For option 1.1, it has the similar problem as 1.2, if the > > exceeded > > > > > direct > > > > > > > memory does not reach the max direct memory limit specified by > > the > > > > > > > dedicated parameter. 
I think it is slightly better than 1.2, > only > > > > > because > > > > > > > we can tune the parameter. > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email] > > > > > > wrote: > > > > > > > > > > > > > > > About the "-XX:MaxDirectMemorySize" discussion, maybe let me > > > > > summarize > > > > > > > it a > > > > > > > > bit differently: > > > > > > > > > > > > > > > > We have the following two options: > > > > > > > > > > > > > > > > (1) We let MemorySegments be de-allocated by the GC. That > makes > > > it > > > > > > > segfault > > > > > > > > safe. But then we need a way to trigger GC in case > > de-allocation > > > > and > > > > > > > > re-allocation of a bunch of segments happens quickly, which > is > > > > often > > > > > > the > > > > > > > > case during batch scheduling or task restart. > > > > > > > > - The "-XX:MaxDirectMemorySize" (option 1.1) is one way to > do > > > > this > > > > > > > > - Another way could be to have a dedicated bookkeeping in > the > > > > > > > > MemoryManager (option 1.2), so that this is a number > > independent > > > of > > > > > the > > > > > > > > "-XX:MaxDirectMemorySize" parameter. > > > > > > > > > > > > > > > > (2) We manually allocate and de-allocate the memory for the > > > > > > > MemorySegments > > > > > > > > (option 2). That way we need not worry about triggering GC by > > > some > > > > > > > > threshold or bookkeeping, but it is harder to prevent > > segfaults. > > > We > > > > > > need > > > > > > > to > > > > > > > > be very careful about when we release the memory segments > (only > > > in > > > > > the > > > > > > > > cleanup phase of the main thread). > > > > > > > > > > > > > > > > If we go with option 1.1, we probably need to set > > > > > > > > "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + > > > > > direct_memory" > > > > > > > and > > > > > > > > have "direct_memory" as a separate reserved memory pool. > > Because > > > if > > > > > we > > > > > > > just > > > > > > > > set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + > > > > > > > jvm_overhead", > > > > > > > > then there will be times when that entire memory is allocated > > by > > > > > direct > > > > > > > > buffers and we have nothing left for the JVM overhead. So we > > > either > > > > > > need > > > > > > > a > > > > > > > > way to compensate for that (again some safety margin cutoff > > > value) > > > > or > > > > > > we > > > > > > > > will exceed container memory. > > > > > > > > > > > > > > > > If we go with option 1.2, we need to be aware that it takes > > > > elaborate > > > > > > > logic > > > > > > > > to push recycling of direct buffers without always > triggering a > > > > full > > > > > > GC. > > > > > > > > > > > > > > > > > > > > > > > > My first guess is that the options will be easiest to do in > the > > > > > > following > > > > > > > > order: > > > > > > > > > > > > > > > > - Option 1.1 with a dedicated direct_memory parameter, as > > > > discussed > > > > > > > > above. We would need to find a way to set the direct_memory > > > > parameter > > > > > > by > > > > > > > > default. We could start with 64 MB and see how it goes in > > > practice. > > > > > One > > > > > > > > danger I see is that setting this loo low can cause a bunch > of > > > > > > additional > > > > > > > > GCs compared to before (we need to watch this carefully). > > > > > > > > > > > > > > > > - Option 2. 
It is actually quite simple to implement, we > > could > > > > try > > > > > > how > > > > > > > > segfault safe we are at the moment. > > > > > > > > > > > > > > > > - Option 1.2: We would not touch the > > "-XX:MaxDirectMemorySize" > > > > > > > parameter > > > > > > > > at all and assume that all the direct memory allocations that > > the > > > > JVM > > > > > > and > > > > > > > > Netty do are infrequent enough to be cleaned up fast enough > > > through > > > > > > > regular > > > > > > > > GC. I am not sure if that is a valid assumption, though. > > > > > > > > > > > > > > > > Best, > > > > > > > > Stephan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong Song < > > > > [hidden email]> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Thanks for sharing your opinion Till. > > > > > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was wondering whether > > we > > > > can > > > > > > > avoid > > > > > > > > > using Unsafe.allocate() for off-heap managed memory and > > network > > > > > > memory > > > > > > > > with > > > > > > > > > alternative 3. But after giving it a second thought, I > think > > > even > > > > > for > > > > > > > > > alternative 3 using direct memory for off-heap managed > memory > > > > could > > > > > > > cause > > > > > > > > > problems. > > > > > > > > > > > > > > > > > > Hi Yang, > > > > > > > > > > > > > > > > > > Regarding your concern, I think what proposed in this FLIP > it > > > to > > > > > have > > > > > > > > both > > > > > > > > > off-heap managed memory and network memory allocated > through > > > > > > > > > Unsafe.allocate(), which means they are practically native > > > memory > > > > > and > > > > > > > not > > > > > > > > > limited by JVM max direct memory. The only parts of memory > > > > limited > > > > > by > > > > > > > JVM > > > > > > > > > max direct memory are task off-heap memory and JVM > overhead, > > > > which > > > > > > are > > > > > > > > > exactly alternative 2 suggests to set the JVM max direct > > memory > > > > to. > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann < > > > > > [hidden email]> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I understand the > two > > > > > > > alternatives > > > > > > > > > > now. > > > > > > > > > > > > > > > > > > > > I would be in favour of option 2 because it makes things > > > > > explicit. > > > > > > If > > > > > > > > we > > > > > > > > > > don't limit the direct memory, I fear that we might end > up > > > in a > > > > > > > similar > > > > > > > > > > situation as we are currently in: The user might see that > > her > > > > > > process > > > > > > > > > gets > > > > > > > > > > killed by the OS and does not know why this is the case. > > > > > > > Consequently, > > > > > > > > > she > > > > > > > > > > tries to decrease the process memory size (similar to > > > > increasing > > > > > > the > > > > > > > > > cutoff > > > > > > > > > > ratio) in order to accommodate for the extra direct > memory. > > > > Even > > > > > > > worse, > > > > > > > > > she > > > > > > > > > > tries to decrease memory budgets which are not fully used > > and > > > > > hence > > > > > > > > won't > > > > > > > > > > change the overall memory consumption. 
> > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM Xintong Song < > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Let me explain this with a concrete example Till. > > > > > > > > > > > > > > > > > > > > > > Let's say we have the following scenario. > > > > > > > > > > > > > > > > > > > > > > Total Process Memory: 1GB > > > > > > > > > > > JVM Direct Memory (Task Off-Heap Memory + JVM > Overhead): > > > > 200MB > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap > > > > Managed > > > > > > > Memory > > > > > > > > > and > > > > > > > > > > > Network Memory): 800MB > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For alternative 2, we set -XX:MaxDirectMemorySize to > > 200MB. > > > > > > > > > > > For alternative 3, we set -XX:MaxDirectMemorySize to a > > very > > > > > large > > > > > > > > > value, > > > > > > > > > > > let's say 1TB. > > > > > > > > > > > > > > > > > > > > > > If the actual direct memory usage of Task Off-Heap > Memory > > > and > > > > > JVM > > > > > > > > > > Overhead > > > > > > > > > > > do not exceed 200MB, then alternative 2 and > alternative 3 > > > > > should > > > > > > > have > > > > > > > > > the > > > > > > > > > > > same utility. Setting larger -XX:MaxDirectMemorySize > will > > > not > > > > > > > reduce > > > > > > > > > the > > > > > > > > > > > sizes of the other memory pools. > > > > > > > > > > > > > > > > > > > > > > If the actual direct memory usage of Task Off-Heap > Memory > > > and > > > > > JVM > > > > > > > > > > > Overhead potentially exceed 200MB, then > > > > > > > > > > > > > > > > > > > > > > - Alternative 2 suffers from frequent OOM. To avoid > > > that, > > > > > the > > > > > > > only > > > > > > > > > > thing > > > > > > > > > > > user can do is to modify the configuration and > > increase > > > > JVM > > > > > > > Direct > > > > > > > > > > > Memory > > > > > > > > > > > (Task Off-Heap Memory + JVM Overhead). Let's say > that > > > user > > > > > > > > increases > > > > > > > > > > JVM > > > > > > > > > > > Direct Memory to 250MB, this will reduce the total > > size > > > of > > > > > > other > > > > > > > > > > memory > > > > > > > > > > > pools to 750MB, given the total process memory > remains > > > > 1GB. > > > > > > > > > > > - For alternative 3, there is no chance of direct > OOM. > > > > There > > > > > > are > > > > > > > > > > chances > > > > > > > > > > > of exceeding the total process memory limit, but > given > > > > that > > > > > > the > > > > > > > > > > process > > > > > > > > > > > may > > > > > > > > > > > not use up all the reserved native memory (Off-Heap > > > > Managed > > > > > > > > Memory, > > > > > > > > > > > Network > > > > > > > > > > > Memory, JVM Metaspace), if the actual direct memory > > > usage > > > > is > > > > > > > > > slightly > > > > > > > > > > > above > > > > > > > > > > > yet very close to 200MB, user probably do not need > to > > > > change > > > > > > the > > > > > > > > > > > configurations. > > > > > > > > > > > > > > > > > > > > > > Therefore, I think from the user's perspective, a > > feasible > > > > > > > > > configuration > > > > > > > > > > > for alternative 2 may lead to lower resource > utilization > > > > > compared > > > > > > > to > > > > > > > > > > > alternative 3. 
> > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann < > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > I guess you have to help me understand the difference > > > > between > > > > > > > > > > > alternative 2 > > > > > > > > > > > > and 3 wrt to memory under utilization Xintong. > > > > > > > > > > > > > > > > > > > > > > > > - Alternative 2: set XX:MaxDirectMemorySize to Task > > > > Off-Heap > > > > > > > Memory > > > > > > > > > and > > > > > > > > > > > JVM > > > > > > > > > > > > Overhead. Then there is the risk that this size is > too > > > low > > > > > > > > resulting > > > > > > > > > > in a > > > > > > > > > > > > lot of garbage collection and potentially an OOM. > > > > > > > > > > > > - Alternative 3: set XX:MaxDirectMemorySize to > > something > > > > > larger > > > > > > > > than > > > > > > > > > > > > alternative 2. This would of course reduce the sizes > of > > > the > > > > > > other > > > > > > > > > > memory > > > > > > > > > > > > types. > > > > > > > > > > > > > > > > > > > > > > > > How would alternative 2 now result in an under > > > utilization > > > > of > > > > > > > > memory > > > > > > > > > > > > compared to alternative 3? If alternative 3 strictly > > > sets a > > > > > > > higher > > > > > > > > > max > > > > > > > > > > > > direct memory size and we use only little, then I > would > > > > > expect > > > > > > > that > > > > > > > > > > > > alternative 3 results in memory under utilization. > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM Yang Wang < > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong,till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > My point is setting a very large max direct memory > > size > > > > > when > > > > > > we > > > > > > > > do > > > > > > > > > > not > > > > > > > > > > > > > differentiate direct and native memory. If the > direct > > > > > > > > > > memory,including > > > > > > > > > > > > user > > > > > > > > > > > > > direct memory and framework direct memory,could be > > > > > calculated > > > > > > > > > > > > > correctly,then > > > > > > > > > > > > > i am in favor of setting direct memory with fixed > > > value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > I agree with xintong. For Yarn and k8s,we need to > > check > > > > the > > > > > > > > memory > > > > > > > > > > > > > configurations in client to avoid submitting > > > successfully > > > > > and > > > > > > > > > failing > > > > > > > > > > > in > > > > > > > > > > > > > the flink master. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > > > > > > > > > > > > > > Yang > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]>于2019年8月13日 > > > > 周二22:07写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > About MemorySegment, I think you are right that > we > > > > should > > > > > > not > > > > > > > > > > include > > > > > > > > > > > > > this > > > > > > > > > > > > > > issue in the scope of this FLIP. This FLIP should > > > > > > concentrate > > > > > > > > on > > > > > > > > > > how > > > > > > > > > > > to > > > > > > > > > > > > > > configure memory pools for TaskExecutors, with > > > minimum > > > > > > > > > involvement > > > > > > > > > > on > > > > > > > > > > > > how > > > > > > > > > > > > > > memory consumers use it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > About direct memory, I think alternative 3 may > not > > > > having > > > > > > the > > > > > > > > > same > > > > > > > > > > > over > > > > > > > > > > > > > > reservation issue that alternative 2 does, but at > > the > > > > > cost > > > > > > of > > > > > > > > > risk > > > > > > > > > > of > > > > > > > > > > > > > over > > > > > > > > > > > > > > using memory at the container level, which is not > > > good. > > > > > My > > > > > > > > point > > > > > > > > > is > > > > > > > > > > > > that > > > > > > > > > > > > > > both "Task Off-Heap Memory" and "JVM Overhead" > are > > > not > > > > > easy > > > > > > > to > > > > > > > > > > > config. > > > > > > > > > > > > > For > > > > > > > > > > > > > > alternative 2, users might configure them higher > > than > > > > > what > > > > > > > > > actually > > > > > > > > > > > > > needed, > > > > > > > > > > > > > > just to avoid getting a direct OOM. For > alternative > > > 3, > > > > > > users > > > > > > > do > > > > > > > > > not > > > > > > > > > > > get > > > > > > > > > > > > > > direct OOM, so they may not config the two > options > > > > > > > aggressively > > > > > > > > > > high. > > > > > > > > > > > > But > > > > > > > > > > > > > > the consequences are risks of overall container > > > memory > > > > > > usage > > > > > > > > > > exceeds > > > > > > > > > > > > the > > > > > > > > > > > > > > budget. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann < > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP Xintong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > All in all I think it already looks quite good. > > > > > > Concerning > > > > > > > > the > > > > > > > > > > > first > > > > > > > > > > > > > open > > > > > > > > > > > > > > > question about allocating memory segments, I > was > > > > > > wondering > > > > > > > > > > whether > > > > > > > > > > > > this > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > strictly necessary to do in the context of this > > > FLIP > > > > or > > > > > > > > whether > > > > > > > > > > > this > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > be done as a follow up? 
Without knowing all > > > details, > > > > I > > > > > > > would > > > > > > > > be > > > > > > > > > > > > > concerned > > > > > > > > > > > > > > > that we would widen the scope of this FLIP too > > much > > > > > > because > > > > > > > > we > > > > > > > > > > > would > > > > > > > > > > > > > have > > > > > > > > > > > > > > > to touch all the existing call sites of the > > > > > MemoryManager > > > > > > > > where > > > > > > > > > > we > > > > > > > > > > > > > > allocate > > > > > > > > > > > > > > > memory segments (this should mainly be batch > > > > > operators). > > > > > > > The > > > > > > > > > > > addition > > > > > > > > > > > > > of > > > > > > > > > > > > > > > the memory reservation call to the > MemoryManager > > > > should > > > > > > not > > > > > > > > be > > > > > > > > > > > > affected > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > this and I would hope that this is the only > point > > > of > > > > > > > > > interaction > > > > > > > > > > a > > > > > > > > > > > > > > > streaming job would have with the > MemoryManager. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Concerning the second open question about > setting > > > or > > > > > not > > > > > > > > > setting > > > > > > > > > > a > > > > > > > > > > > > max > > > > > > > > > > > > > > > direct memory limit, I would also be interested > > why > > > > > Yang > > > > > > > Wang > > > > > > > > > > > thinks > > > > > > > > > > > > > > > leaving it open would be best. My concern about > > > this > > > > > > would > > > > > > > be > > > > > > > > > > that > > > > > > > > > > > we > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > be in a similar situation as we are now with > the > > > > > > > > > > > RocksDBStateBackend. > > > > > > > > > > > > > If > > > > > > > > > > > > > > > the different memory pools are not clearly > > > separated > > > > > and > > > > > > > can > > > > > > > > > > spill > > > > > > > > > > > > over > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > a different pool, then it is quite hard to > > > understand > > > > > > what > > > > > > > > > > exactly > > > > > > > > > > > > > > causes a > > > > > > > > > > > > > > > process to get killed for using too much > memory. > > > This > > > > > > could > > > > > > > > > then > > > > > > > > > > > > easily > > > > > > > > > > > > > > > lead to a similar situation what we have with > the > > > > > > > > cutoff-ratio. > > > > > > > > > > So > > > > > > > > > > > > why > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > setting a sane default value for max direct > > memory > > > > and > > > > > > > giving > > > > > > > > > the > > > > > > > > > > > > user > > > > > > > > > > > > > an > > > > > > > > > > > > > > > option to increase it if he runs into an OOM. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > @Xintong, how would alternative 2 lead to lower > > > > memory > > > > > > > > > > utilization > > > > > > > > > > > > than > > > > > > > > > > > > > > > alternative 3 where we set the direct memory > to a > > > > > higher > > > > > > > > value? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM Xintong Song < > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, Yang. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > > > > > > > > > > > > > > > > I think setting a very large max direct > memory > > > size > > > > > > > > > definitely > > > > > > > > > > > has > > > > > > > > > > > > > some > > > > > > > > > > > > > > > > good sides. E.g., we do not worry about > direct > > > OOM, > > > > > and > > > > > > > we > > > > > > > > > > don't > > > > > > > > > > > > even > > > > > > > > > > > > > > > need > > > > > > > > > > > > > > > > to allocate managed / network memory with > > > > > > > > Unsafe.allocate() . > > > > > > > > > > > > > > > > However, there are also some down sides of > > doing > > > > > this. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - One thing I can think of is that if a > task > > > > > > executor > > > > > > > > > > > container > > > > > > > > > > > > is > > > > > > > > > > > > > > > > killed due to overusing memory, it could > be > > > hard > > > > > for > > > > > > > use > > > > > > > > > to > > > > > > > > > > > know > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > part > > > > > > > > > > > > > > > > of the memory is overused. > > > > > > > > > > > > > > > > - Another down side is that the JVM never > > > > trigger > > > > > GC > > > > > > > due > > > > > > > > > to > > > > > > > > > > > > > reaching > > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > direct memory limit, because the limit is > > too > > > > high > > > > > > to > > > > > > > be > > > > > > > > > > > > reached. > > > > > > > > > > > > > > That > > > > > > > > > > > > > > > > means we kind of relay on heap memory to > > > trigger > > > > > GC > > > > > > > and > > > > > > > > > > > release > > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > memory. That could be a problem in cases > > where > > > > we > > > > > > have > > > > > > > > > more > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > usage but not enough heap activity to > > trigger > > > > the > > > > > > GC. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe you can share your reasons for > preferring > > > > > > setting a > > > > > > > > > very > > > > > > > > > > > > large > > > > > > > > > > > > > > > value, > > > > > > > > > > > > > > > > if there are anything else I overlooked. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > > > > > > > > > > > > > > > > If there is any conflict between multiple > > > > > configuration > > > > > > > > that > > > > > > > > > > user > > > > > > > > > > > > > > > > explicitly specified, I think we should throw > > an > > > > > error. > > > > > > > > > > > > > > > > I think doing checking on the client side is > a > > > good > > > > > > idea, > > > > > > > > so > > > > > > > > > > that > > > > > > > > > > > > on > > > > > > > > > > > > > > > Yarn / > > > > > > > > > > > > > > > > K8s we can discover the problem before > > submitting > > > > the > > > > > > > Flink > > > > > > > > > > > > cluster, > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > is always a good thing. 
> > > > > > > > > > > > > > > > But we can not only rely on the client side > > > > checking, > > > > > > > > because > > > > > > > > > > for > > > > > > > > > > > > > > > > standalone cluster TaskManagers on different > > > > machines > > > > > > may > > > > > > > > > have > > > > > > > > > > > > > > different > > > > > > > > > > > > > > > > configurations and the client does see that. > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 PM Yang Wang < > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed proposal. After > all > > > the > > > > > > memory > > > > > > > > > > > > > configuration > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > introduced, it will be more powerful to > > control > > > > the > > > > > > > flink > > > > > > > > > > > memory > > > > > > > > > > > > > > > usage. I > > > > > > > > > > > > > > > > > just have few questions about it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We do not differentiate user direct memory > > and > > > > > native > > > > > > > > > memory. > > > > > > > > > > > > They > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > > > included in task off-heap memory. Right? > So i > > > > don’t > > > > > > > think > > > > > > > > > we > > > > > > > > > > > > could > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > set > > > > > > > > > > > > > > > > > the -XX:MaxDirectMemorySize properly. I > > prefer > > > > > > leaving > > > > > > > > it a > > > > > > > > > > > very > > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > > value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the sum of and fine-grained > memory(network > > > > > memory, > > > > > > > > > managed > > > > > > > > > > > > > memory, > > > > > > > > > > > > > > > > etc.) > > > > > > > > > > > > > > > > > is larger than total process memory, how do > > we > > > > deal > > > > > > > with > > > > > > > > > this > > > > > > > > > > > > > > > situation? > > > > > > > > > > > > > > > > Do > > > > > > > > > > > > > > > > > we need to check the memory configuration > in > > > > > client? 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]> > > > > 于2019年8月7日周三 > > > > > > > > > 下午10:14写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion > thread > > on > > > > > > > "FLIP-49: > > > > > > > > > > > Unified > > > > > > > > > > > > > > > Memory > > > > > > > > > > > > > > > > > > Configuration for TaskExecutors"[1], > where > > we > > > > > > > describe > > > > > > > > > how > > > > > > > > > > to > > > > > > > > > > > > > > improve > > > > > > > > > > > > > > > > > > TaskExecutor memory configurations. The > > FLIP > > > > > > document > > > > > > > > is > > > > > > > > > > > mostly > > > > > > > > > > > > > > based > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > > early design "Memory Management and > > > > Configuration > > > > > > > > > > > Reloaded"[2] > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > Stephan, > > > > > > > > > > > > > > > > > > with updates from follow-up discussions > > both > > > > > online > > > > > > > and > > > > > > > > > > > > offline. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses several shortcomings > of > > > > > current > > > > > > > > > (Flink > > > > > > > > > > > 1.9) > > > > > > > > > > > > > > > > > > TaskExecutor memory configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Different configuration for > Streaming > > > and > > > > > > Batch. > > > > > > > > > > > > > > > > > > - Complex and difficult configuration > of > > > > > RocksDB > > > > > > > in > > > > > > > > > > > > Streaming. > > > > > > > > > > > > > > > > > > - Complicated, uncertain and hard to > > > > > understand. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the problems can be > > > > > summarized > > > > > > > as > > > > > > > > > > > follows. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Extend memory manager to also > account > > > for > > > > > > memory > > > > > > > > > usage > > > > > > > > > > > by > > > > > > > > > > > > > > state > > > > > > > > > > > > > > > > > > backends. > > > > > > > > > > > > > > > > > > - Modify how TaskExecutor memory is > > > > > partitioned > > > > > > > > > > accounted > > > > > > > > > > > > > > > individual > > > > > > > > > > > > > > > > > > memory reservations and pools. > > > > > > > > > > > > > > > > > > - Simplify memory configuration > options > > > and > > > > > > > > > calculations > > > > > > > > > > > > > logics. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki > > > > > document > > > > > > > [1]. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > (Please note that the early design doc > [2] > > is > > > > out > > > > > > of > > > > > > > > > sync, > > > > > > > > > > > and > > > > > > > > > > > > it > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > appreciated to have the discussion in > this > > > > > mailing > > > > > > > list > > > > > > > > > > > > thread.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedbacks. 
> > > > > > > > > Thank you~
> > > > > > > > > Xintong Song
> > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > > > > > > > > [2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the sum of and fine-grained > memory(network > > > > > memory, > > > > > > > > > managed > > > > > > > > > > > > > memory, > > > > > > > > > > > > > > > > etc.) > > > > > > > > > > > > > > > > > is larger than total process memory, how do > > we > > > > deal > > > > > > > with > > > > > > > > > this > > > > > > > > > > > > > > > situation? > > > > > > > > > > > > > > > > Do > > > > > > > > > > > > > > > > > we need to check the memory configuration > in > > > > > client? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]> > > > > 于2019年8月7日周三 > > > > > > > > > 下午10:14写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion > thread > > on > > > > > > > "FLIP-49: > > > > > > > > > > > Unified > > > > > > > > > > > > > > > Memory > > > > > > > > > > > > > > > > > > Configuration for TaskExecutors"[1], > where > > we > > > > > > > describe > > > > > > > > > how > > > > > > > > > > to > > > > > > > > > > > > > > improve > > > > > > > > > > > > > > > > > > TaskExecutor memory configurations. The > > FLIP > > > > > > document > > > > > > > > is > > > > > > > > > > > mostly > > > > > > > > > > > > > > based > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > > early design "Memory Management and > > > > Configuration > > > > > > > > > > > Reloaded"[2] > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > Stephan, > > > > > > > > > > > > > > > > > > with updates from follow-up discussions > > both > > > > > online > > > > > > > and > > > > > > > > > > > > offline. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses several shortcomings > of > > > > > current > > > > > > > > > (Flink > > > > > > > > > > > 1.9) > > > > > > > > > > > > > > > > > > TaskExecutor memory configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Different configuration for > Streaming > > > and > > > > > > Batch. > > > > > > > > > > > > > > > > > > - Complex and difficult configuration > of > > > > > RocksDB > > > > > > > in > > > > > > > > > > > > Streaming. > > > > > > > > > > > > > > > > > > - Complicated, uncertain and hard to > > > > > understand. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the problems can be > > > > > summarized > > > > > > > as > > > > > > > > > > > follows. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Extend memory manager to also > account > > > for > > > > > > memory > > > > > > > > > usage > > > > > > > > > > > by > > > > > > > > > > > > > > state > > > > > > > > > > > > > > > > > > backends. > > > > > > > > > > > > > > > > > > - Modify how TaskExecutor memory is > > > > > partitioned > > > > > > > > > > accounted > > > > > > > > > > > > > > > individual > > > > > > > > > > > > > > > > > > memory reservations and pools. > > > > > > > > > > > > > > > > > > - Simplify memory configuration > options > > > and > > > > > > > > > calculations > > > > > > > > > > > > > logics. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki > > > > > document > > > > > > > [1]. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > (Please note that the early design doc > [2] > > is > > > > out > > > > > > of > > > > > > > > > sync, > > > > > > > > > > > and > > > > > > > > > > > > it > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > appreciated to have the discussion in > this > > > > > mailing > > > > > > > list > > > > > > > > > > > > thread.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedbacks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [2] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > |
Just add my 2 cents.
Using environment variables to override the configuration for different TaskManagers is better: we do not need to generate a dedicated flink-conf.yaml for every TaskManager, a common flink-conf.yaml plus different environment variables is enough. Reducing the number of distributed cached files could also make launching a TaskManager faster.

Stephan gives a good suggestion that we could move this logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from it: different users would not have to export FLINK_CONF_DIR just to update a few config options.

Best,
Yang
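To make the environment-variable idea concrete, below is a minimal sketch of what such an overlay could look like. It is only an illustration under assumptions: the "flink_" prefix, the underscore-to-dot key mapping, the helper name, and the example variable are hypothetical and not the actual GlobalConfiguration.loadConfig() behavior.

    import java.util.Map;

    import org.apache.flink.configuration.Configuration;

    /**
     * Sketch: overlay environment variables prefixed with "flink_" onto a Flink
     * Configuration, so per-TaskManager values can be passed to the process
     * without shipping a dedicated flink-conf.yaml. The prefix and the key
     * mapping are assumptions for illustration only.
     */
    public class EnvConfigOverlaySketch {

        static Configuration overlay(Configuration base, Map<String, String> env) {
            for (Map.Entry<String, String> e : env.entrySet()) {
                if (e.getKey().startsWith("flink_")) {
                    // e.g. a hypothetical "flink_taskmanager_numberOfTaskSlots=4"
                    // would become the config key "taskmanager.numberOfTaskSlots".
                    String key = e.getKey().substring("flink_".length()).replace('_', '.');
                    base.setString(key, e.getValue());
                }
            }
            return base;
        }

        public static void main(String[] args) {
            // Start from the common configuration (empty here for simplicity) and
            // apply whatever flink_* variables the startup scripts exported.
            Configuration config = overlay(new Configuration(), System.getenv());
            System.out.println(config);
        }
    }

With this kind of overlay, a startup script could export a single flink_* variable for the value it computed, instead of writing a per-TaskManager copy of the configuration file.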
To > avoid > > > > that, > > > > > > the > > > > > > > > only > > > > > > > > > > > thing > > > > > > > > > > > > user can do is to modify the configuration and > > > increase > > > > > JVM > > > > > > > > Direct > > > > > > > > > > > > Memory > > > > > > > > > > > > (Task Off-Heap Memory + JVM Overhead). Let's say > > that > > > > user > > > > > > > > > increases > > > > > > > > > > > JVM > > > > > > > > > > > > Direct Memory to 250MB, this will reduce the total > > > size > > > > of > > > > > > > other > > > > > > > > > > > memory > > > > > > > > > > > > pools to 750MB, given the total process memory > > remains > > > > > 1GB. > > > > > > > > > > > > - For alternative 3, there is no chance of direct > > OOM. > > > > > There > > > > > > > are > > > > > > > > > > > chances > > > > > > > > > > > > of exceeding the total process memory limit, but > > given > > > > > that > > > > > > > the > > > > > > > > > > > process > > > > > > > > > > > > may > > > > > > > > > > > > not use up all the reserved native memory > (Off-Heap > > > > > Managed > > > > > > > > > Memory, > > > > > > > > > > > > Network > > > > > > > > > > > > Memory, JVM Metaspace), if the actual direct > memory > > > > usage > > > > > is > > > > > > > > > > slightly > > > > > > > > > > > > above > > > > > > > > > > > > yet very close to 200MB, user probably do not need > > to > > > > > change > > > > > > > the > > > > > > > > > > > > configurations. > > > > > > > > > > > > > > > > > > > > > > > > Therefore, I think from the user's perspective, a > > > feasible > > > > > > > > > > configuration > > > > > > > > > > > > for alternative 2 may lead to lower resource > > utilization > > > > > > compared > > > > > > > > to > > > > > > > > > > > > alternative 3. > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann < > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > I guess you have to help me understand the > difference > > > > > between > > > > > > > > > > > > alternative 2 > > > > > > > > > > > > > and 3 wrt to memory under utilization Xintong. > > > > > > > > > > > > > > > > > > > > > > > > > > - Alternative 2: set XX:MaxDirectMemorySize to Task > > > > > Off-Heap > > > > > > > > Memory > > > > > > > > > > and > > > > > > > > > > > > JVM > > > > > > > > > > > > > Overhead. Then there is the risk that this size is > > too > > > > low > > > > > > > > > resulting > > > > > > > > > > > in a > > > > > > > > > > > > > lot of garbage collection and potentially an OOM. > > > > > > > > > > > > > - Alternative 3: set XX:MaxDirectMemorySize to > > > something > > > > > > larger > > > > > > > > > than > > > > > > > > > > > > > alternative 2. This would of course reduce the > sizes > > of > > > > the > > > > > > > other > > > > > > > > > > > memory > > > > > > > > > > > > > types. > > > > > > > > > > > > > > > > > > > > > > > > > > How would alternative 2 now result in an under > > > > utilization > > > > > of > > > > > > > > > memory > > > > > > > > > > > > > compared to alternative 3? 
If alternative 3 > strictly > > > > sets a > > > > > > > > higher > > > > > > > > > > max > > > > > > > > > > > > > direct memory size and we use only little, then I > > would > > > > > > expect > > > > > > > > that > > > > > > > > > > > > > alternative 3 results in memory under utilization. > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM Yang Wang < > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong,till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > > > My point is setting a very large max direct > memory > > > size > > > > > > when > > > > > > > we > > > > > > > > > do > > > > > > > > > > > not > > > > > > > > > > > > > > differentiate direct and native memory. If the > > direct > > > > > > > > > > > memory,including > > > > > > > > > > > > > user > > > > > > > > > > > > > > direct memory and framework direct memory,could > be > > > > > > calculated > > > > > > > > > > > > > > correctly,then > > > > > > > > > > > > > > i am in favor of setting direct memory with fixed > > > > value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > I agree with xintong. For Yarn and k8s,we need to > > > check > > > > > the > > > > > > > > > memory > > > > > > > > > > > > > > configurations in client to avoid submitting > > > > successfully > > > > > > and > > > > > > > > > > failing > > > > > > > > > > > > in > > > > > > > > > > > > > > the flink master. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yang > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]>于2019年8月13日 > > > > > 周二22:07写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > About MemorySegment, I think you are right that > > we > > > > > should > > > > > > > not > > > > > > > > > > > include > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > issue in the scope of this FLIP. This FLIP > should > > > > > > > concentrate > > > > > > > > > on > > > > > > > > > > > how > > > > > > > > > > > > to > > > > > > > > > > > > > > > configure memory pools for TaskExecutors, with > > > > minimum > > > > > > > > > > involvement > > > > > > > > > > > on > > > > > > > > > > > > > how > > > > > > > > > > > > > > > memory consumers use it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > About direct memory, I think alternative 3 may > > not > > > > > having > > > > > > > the > > > > > > > > > > same > > > > > > > > > > > > over > > > > > > > > > > > > > > > reservation issue that alternative 2 does, but > at > > > the > > > > > > cost > > > > > > > of > > > > > > > > > > risk > > > > > > > > > > > of > > > > > > > > > > > > > > over > > > > > > > > > > > > > > > using memory at the container level, which is > not > > > > good. 
> > > > > > My > > > > > > > > > point > > > > > > > > > > is > > > > > > > > > > > > > that > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and "JVM Overhead" > > are > > > > not > > > > > > easy > > > > > > > > to > > > > > > > > > > > > config. > > > > > > > > > > > > > > For > > > > > > > > > > > > > > > alternative 2, users might configure them > higher > > > than > > > > > > what > > > > > > > > > > actually > > > > > > > > > > > > > > needed, > > > > > > > > > > > > > > > just to avoid getting a direct OOM. For > > alternative > > > > 3, > > > > > > > users > > > > > > > > do > > > > > > > > > > not > > > > > > > > > > > > get > > > > > > > > > > > > > > > direct OOM, so they may not config the two > > options > > > > > > > > aggressively > > > > > > > > > > > high. > > > > > > > > > > > > > But > > > > > > > > > > > > > > > the consequences are risks of overall container > > > > memory > > > > > > > usage > > > > > > > > > > > exceeds > > > > > > > > > > > > > the > > > > > > > > > > > > > > > budget. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann < > > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP Xintong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > All in all I think it already looks quite > good. > > > > > > > Concerning > > > > > > > > > the > > > > > > > > > > > > first > > > > > > > > > > > > > > open > > > > > > > > > > > > > > > > question about allocating memory segments, I > > was > > > > > > > wondering > > > > > > > > > > > whether > > > > > > > > > > > > > this > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > strictly necessary to do in the context of > this > > > > FLIP > > > > > or > > > > > > > > > whether > > > > > > > > > > > > this > > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > > be done as a follow up? Without knowing all > > > > details, > > > > > I > > > > > > > > would > > > > > > > > > be > > > > > > > > > > > > > > concerned > > > > > > > > > > > > > > > > that we would widen the scope of this FLIP > too > > > much > > > > > > > because > > > > > > > > > we > > > > > > > > > > > > would > > > > > > > > > > > > > > have > > > > > > > > > > > > > > > > to touch all the existing call sites of the > > > > > > MemoryManager > > > > > > > > > where > > > > > > > > > > > we > > > > > > > > > > > > > > > allocate > > > > > > > > > > > > > > > > memory segments (this should mainly be batch > > > > > > operators). > > > > > > > > The > > > > > > > > > > > > addition > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > the memory reservation call to the > > MemoryManager > > > > > should > > > > > > > not > > > > > > > > > be > > > > > > > > > > > > > affected > > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > this and I would hope that this is the only > > point > > > > of > > > > > > > > > > interaction > > > > > > > > > > > a > > > > > > > > > > > > > > > > streaming job would have with the > > MemoryManager. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Concerning the second open question about > > setting > > > > or > > > > > > not > > > > > > > > > > setting > > > > > > > > > > > a > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > direct memory limit, I would also be > interested > > > why > > > > > > Yang > > > > > > > > Wang > > > > > > > > > > > > thinks > > > > > > > > > > > > > > > > leaving it open would be best. My concern > about > > > > this > > > > > > > would > > > > > > > > be > > > > > > > > > > > that > > > > > > > > > > > > we > > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > > be in a similar situation as we are now with > > the > > > > > > > > > > > > RocksDBStateBackend. > > > > > > > > > > > > > > If > > > > > > > > > > > > > > > > the different memory pools are not clearly > > > > separated > > > > > > and > > > > > > > > can > > > > > > > > > > > spill > > > > > > > > > > > > > over > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > a different pool, then it is quite hard to > > > > understand > > > > > > > what > > > > > > > > > > > exactly > > > > > > > > > > > > > > > causes a > > > > > > > > > > > > > > > > process to get killed for using too much > > memory. > > > > This > > > > > > > could > > > > > > > > > > then > > > > > > > > > > > > > easily > > > > > > > > > > > > > > > > lead to a similar situation what we have with > > the > > > > > > > > > cutoff-ratio. > > > > > > > > > > > So > > > > > > > > > > > > > why > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > setting a sane default value for max direct > > > memory > > > > > and > > > > > > > > giving > > > > > > > > > > the > > > > > > > > > > > > > user > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > option to increase it if he runs into an OOM. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > @Xintong, how would alternative 2 lead to > lower > > > > > memory > > > > > > > > > > > utilization > > > > > > > > > > > > > than > > > > > > > > > > > > > > > > alternative 3 where we set the direct memory > > to a > > > > > > higher > > > > > > > > > value? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM Xintong Song < > > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, Yang. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > > > > > > > > > > > > > > > > > I think setting a very large max direct > > memory > > > > size > > > > > > > > > > definitely > > > > > > > > > > > > has > > > > > > > > > > > > > > some > > > > > > > > > > > > > > > > > good sides. E.g., we do not worry about > > direct > > > > OOM, > > > > > > and > > > > > > > > we > > > > > > > > > > > don't > > > > > > > > > > > > > even > > > > > > > > > > > > > > > > need > > > > > > > > > > > > > > > > > to allocate managed / network memory with > > > > > > > > > Unsafe.allocate() . > > > > > > > > > > > > > > > > > However, there are also some down sides of > > > doing > > > > > > this. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - One thing I can think of is that if a > > task > > > > > > > executor > > > > > > > > > > > > container > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > killed due to overusing memory, it could > > be > > > > hard > > > > > > for > > > > > > > > use > > > > > > > > > > to > > > > > > > > > > > > know > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > part > > > > > > > > > > > > > > > > > of the memory is overused. > > > > > > > > > > > > > > > > > - Another down side is that the JVM > never > > > > > trigger > > > > > > GC > > > > > > > > due > > > > > > > > > > to > > > > > > > > > > > > > > reaching > > > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > > direct memory limit, because the limit > is > > > too > > > > > high > > > > > > > to > > > > > > > > be > > > > > > > > > > > > > reached. > > > > > > > > > > > > > > > That > > > > > > > > > > > > > > > > > means we kind of relay on heap memory to > > > > trigger > > > > > > GC > > > > > > > > and > > > > > > > > > > > > release > > > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > > memory. That could be a problem in cases > > > where > > > > > we > > > > > > > have > > > > > > > > > > more > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > usage but not enough heap activity to > > > trigger > > > > > the > > > > > > > GC. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe you can share your reasons for > > preferring > > > > > > > setting a > > > > > > > > > > very > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > value, > > > > > > > > > > > > > > > > > if there are anything else I overlooked. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > > > > > > > > > > > > > > > > > If there is any conflict between multiple > > > > > > configuration > > > > > > > > > that > > > > > > > > > > > user > > > > > > > > > > > > > > > > > explicitly specified, I think we should > throw > > > an > > > > > > error. > > > > > > > > > > > > > > > > > I think doing checking on the client side > is > > a > > > > good > > > > > > > idea, > > > > > > > > > so > > > > > > > > > > > that > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > Yarn / > > > > > > > > > > > > > > > > > K8s we can discover the problem before > > > submitting > > > > > the > > > > > > > > Flink > > > > > > > > > > > > > cluster, > > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > is always a good thing. > > > > > > > > > > > > > > > > > But we can not only rely on the client side > > > > > checking, > > > > > > > > > because > > > > > > > > > > > for > > > > > > > > > > > > > > > > > standalone cluster TaskManagers on > different > > > > > machines > > > > > > > may > > > > > > > > > > have > > > > > > > > > > > > > > > different > > > > > > > > > > > > > > > > > configurations and the client does see > that. > > > > > > > > > > > > > > > > > What do you think? 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 PM Yang Wang < > > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed proposal. After > > all > > > > the > > > > > > > memory > > > > > > > > > > > > > > configuration > > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > > introduced, it will be more powerful to > > > control > > > > > the > > > > > > > > flink > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > usage. I > > > > > > > > > > > > > > > > > > just have few questions about it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We do not differentiate user direct > memory > > > and > > > > > > native > > > > > > > > > > memory. > > > > > > > > > > > > > They > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > > > > included in task off-heap memory. Right? > > So i > > > > > don’t > > > > > > > > think > > > > > > > > > > we > > > > > > > > > > > > > could > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > set > > > > > > > > > > > > > > > > > > the -XX:MaxDirectMemorySize properly. I > > > prefer > > > > > > > leaving > > > > > > > > > it a > > > > > > > > > > > > very > > > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > > > value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the sum of and fine-grained > > memory(network > > > > > > memory, > > > > > > > > > > managed > > > > > > > > > > > > > > memory, > > > > > > > > > > > > > > > > > etc.) > > > > > > > > > > > > > > > > > > is larger than total process memory, how > do > > > we > > > > > deal > > > > > > > > with > > > > > > > > > > this > > > > > > > > > > > > > > > > situation? > > > > > > > > > > > > > > > > > Do > > > > > > > > > > > > > > > > > > we need to check the memory configuration > > in > > > > > > client? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]> > > > > > 于2019年8月7日周三 > > > > > > > > > > 下午10:14写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion > > thread > > > on > > > > > > > > "FLIP-49: > > > > > > > > > > > > Unified > > > > > > > > > > > > > > > > Memory > > > > > > > > > > > > > > > > > > > Configuration for TaskExecutors"[1], > > where > > > we > > > > > > > > describe > > > > > > > > > > how > > > > > > > > > > > to > > > > > > > > > > > > > > > improve > > > > > > > > > > > > > > > > > > > TaskExecutor memory configurations. 
The > > > FLIP > > > > > > > document > > > > > > > > > is > > > > > > > > > > > > mostly > > > > > > > > > > > > > > > based > > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > > > early design "Memory Management and > > > > > Configuration > > > > > > > > > > > > Reloaded"[2] > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > > Stephan, > > > > > > > > > > > > > > > > > > > with updates from follow-up discussions > > > both > > > > > > online > > > > > > > > and > > > > > > > > > > > > > offline. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses several > shortcomings > > of > > > > > > current > > > > > > > > > > (Flink > > > > > > > > > > > > 1.9) > > > > > > > > > > > > > > > > > > > TaskExecutor memory configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Different configuration for > > Streaming > > > > and > > > > > > > Batch. > > > > > > > > > > > > > > > > > > > - Complex and difficult > configuration > > of > > > > > > RocksDB > > > > > > > > in > > > > > > > > > > > > > Streaming. > > > > > > > > > > > > > > > > > > > - Complicated, uncertain and hard to > > > > > > understand. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the problems can > be > > > > > > summarized > > > > > > > > as > > > > > > > > > > > > follows. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Extend memory manager to also > > account > > > > for > > > > > > > memory > > > > > > > > > > usage > > > > > > > > > > > > by > > > > > > > > > > > > > > > state > > > > > > > > > > > > > > > > > > > backends. > > > > > > > > > > > > > > > > > > > - Modify how TaskExecutor memory is > > > > > > partitioned > > > > > > > > > > > accounted > > > > > > > > > > > > > > > > individual > > > > > > > > > > > > > > > > > > > memory reservations and pools. > > > > > > > > > > > > > > > > > > > - Simplify memory configuration > > options > > > > and > > > > > > > > > > calculations > > > > > > > > > > > > > > logics. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP > wiki > > > > > > document > > > > > > > > [1]. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > (Please note that the early design doc > > [2] > > > is > > > > > out > > > > > > > of > > > > > > > > > > sync, > > > > > > > > > > > > and > > > > > > > > > > > > > it > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > > appreciated to have the discussion in > > this > > > > > > mailing > > > > > > > > list > > > > > > > > > > > > > thread.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedbacks. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [2] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > |
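To make the arithmetic behind the two alternatives concrete, here is a small, purely illustrative sketch in plain Java. It uses the example numbers from the 1 GB scenario above; the class, the pool names, and the validation rule are assumptions made for illustration, not code from the FLIP or from Flink.

    /**
     * Illustrative sketch only (not Flink code): the 1 GB example from the thread,
     * plus the kind of sanity check discussed for the client side.
     */
    public final class MemoryBudgetExample {

        private static final long MB = 1024 * 1024;

        public static void main(String[] args) {
            long totalProcessMemory = 1024 * MB;            // "Total Process Memory: 1GB"
            long taskOffHeapPlusJvmOverhead = 200 * MB;     // budgeted JVM direct memory in the example
            long otherMemory = 800 * MB;                    // heap + metaspace + off-heap managed + network

            // Client-side style check: fail fast if the fine-grained pools exceed the total budget.
            long sum = taskOffHeapPlusJvmOverhead + otherMemory;
            if (sum > totalProcessMemory) {
                throw new IllegalArgumentException(
                        "Configured memory pools (" + (sum / MB) + " MB) exceed total process memory ("
                                + (totalProcessMemory / MB) + " MB).");
            }

            // Alternative 2: cap direct memory at exactly the budgeted amount.
            String alternative2 = "-XX:MaxDirectMemorySize=" + (taskOffHeapPlusJvmOverhead / MB) + "m";

            // Alternative 3: leave the limit effectively unbounded (1 TB in the example).
            String alternative3 = "-XX:MaxDirectMemorySize=1024g";

            System.out.println("Alternative 2: " + alternative2);  // -XX:MaxDirectMemorySize=200m
            System.out.println("Alternative 3: " + alternative3);  // -XX:MaxDirectMemorySize=1024g
        }
    }

Run with different pool sizes, the same check also illustrates the client-side rejection of configurations whose fine-grained pools exceed the total process memory, as raised in the Memory Calculation questions above.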
Thanks for the clarification. I have some more comments:
- I would actually split the logic for computing the process memory requirements and the logic for storing the computed values into two things. E.g., one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.
- Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily. The reasons why I believe it is unnecessary are the following: for Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.
- Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, without this functionality the FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on, if I'm not mistaken).

Cheers,
Till

On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:
> Just add my 2 cents.
>
> Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for all taskmanagers. A common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.
>
> Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users do not have to export FLINK_CONF_DIR to update a few config options.
>
> Best,
> Yang
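A rough sketch of the prefix mapping being discussed, in plain Java and independent of Flink's actual GlobalConfiguration code; the helper name, the "flink_" prefix handling, and the example key in the comment are assumptions for illustration only.

    import java.util.HashMap;
    import java.util.Map;

    /**
     * Illustrative sketch only: how environment variables with a "flink_" prefix
     * could be folded into the loaded configuration when it is read.
     * Key names used here are hypothetical examples.
     */
    public final class EnvConfigOverrides {

        static Map<String, String> withEnvOverrides(Map<String, String> loadedConfig, Map<String, String> env) {
            Map<String, String> merged = new HashMap<>(loadedConfig);
            for (Map.Entry<String, String> entry : env.entrySet()) {
                String name = entry.getKey().toLowerCase();
                if (name.startsWith("flink_")) {
                    // e.g. a hypothetical "flink_taskmanager_some_option" becomes "taskmanager.some.option"
                    String configKey = name.substring("flink_".length()).replace('_', '.');
                    merged.put(configKey, entry.getValue());
                }
            }
            return merged;
        }

        public static void main(String[] args) {
            Map<String, String> fileConfig = new HashMap<>();  // stands in for the parsed flink-conf.yaml
            Map<String, String> merged = withEnvOverrides(fileConfig, System.getenv());
            merged.forEach((key, value) -> System.out.println(key + ": " + value));
        }
    }

A real implementation would also need a convention for configuration keys that themselves contain underscores, which a plain character replacement like the one above cannot distinguish.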
On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:
> Hi everyone,
>
> I just updated the FLIP document on wiki [1], with the following changes.
> - Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
> - Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
> - Added implementation steps.
>
> Thank you~
> Xintong Song
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:
> @Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.
>
> @Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643
>
> @Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB. That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).

On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:
> Thanks for the inputs, Jingsong.
>
> Let me try to summarize your points. Please correct me if I'm wrong.
> - Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
> - The JVM does not wait for GC when allocating a direct memory buffer. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory more slowly than the direct memory is allocated.
>
> Am I understanding this correctly?
> Thank you~
> Xintong Song

On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email]> wrote:
> Hi Stephan:
>
> About option 2:
> If additional threads are not cleanly shut down before we exit the task: in the current case of memory reuse, the exiting task has already freed up the memory it uses. If this memory is used by other tasks while asynchronous threads of the exited task may still be writing to it, there will be concurrent safety problems, and even errors in user computing results. So I think this is a serious and intolerable bug; no matter what the option is, it should be avoided.
>
> About direct memory cleaned by GC:
> I don't think it is a good idea. I've encountered so many situations where GC was too late and caused a DirectMemory OOM. Release and allocation of DirectMemory depend on the type of user job, which is often beyond our control.
>
> Best,
> Jingsong Lee
>
> ------------------------------------------------------------------
> From: Stephan Ewen <[hidden email]>
> Send Time: Monday, Aug 19, 2019, 15:56
> To: dev <[hidden email]>
> Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors
>
> My main concern with option 2 (manually release memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.
>
> The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already; that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:
> Thanks for the comments, Stephan. Summarizing it this way really makes things easier to understand.
>
> I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep it segfault safe for the memory manager, as long as we always de-allocate the memory segment when it is released from the memory consumers. Only if the memory consumer continues using the buffer of the memory segment after releasing it can a segfault occur, in which case we do want the job to fail so we detect the memory leak early.
>
> For option 1.2, I don't think this is a good idea. Not only because the assumption (regular GC is enough to clean direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If a library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if it didn't hit the JVM default max direct memory limit, we cannot get a direct memory OOM and it becomes super hard to understand which part of the configuration needs to be updated.
>
> For option 1.1, it has a similar problem as 1.2 if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.
>
> Thank you~
> Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:
> About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently.
>
> We have the following two options:
>
> (1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
> - The "-XX:MaxDirectMemorySize" limit (option 1.1) is one way to do this.
> - Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.
>
> (2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).
>
> If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety-margin cutoff value) or we will exceed the container memory.
>
> If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.
>
> My first guess is that the options will be easiest to do in the following order:
> - Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).
> - Option 2. It is actually quite simple to implement; we could try how segfault safe we are at the moment.
> - Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.
>
> Best,
> Stephan
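For readers less familiar with the two allocation styles being compared, here is a minimal, illustrative sketch in plain Java. It is not Flink's MemorySegment code; the class and method names are made up for this example.

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;

    /**
     * Illustrative sketch only, contrasting the two options summarized above.
     * Not Flink's MemorySegment implementation.
     */
    public final class SegmentAllocationSketch {

        private static final int SEGMENT_SIZE = 32 * 1024;

        /** Option 1: GC-managed direct buffer. It is freed only when the buffer object is
         *  collected; allocation may force a GC via Bits.reserveMemory() once the
         *  -XX:MaxDirectMemorySize limit is reached. */
        static ByteBuffer allocateGcManaged() {
            return ByteBuffer.allocateDirect(SEGMENT_SIZE);
        }

        /** Option 2: manual allocation, not counted against -XX:MaxDirectMemorySize.
         *  The caller must free it exactly once, and only after all users are done
         *  (otherwise there is a segfault risk). */
        static long allocateManually() throws Exception {
            return unsafe().allocateMemory(SEGMENT_SIZE);
        }

        static void freeManually(long address) throws Exception {
            unsafe().freeMemory(address);
        }

        private static sun.misc.Unsafe unsafe() throws Exception {
            Field field = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (sun.misc.Unsafe) field.get(null);
        }

        public static void main(String[] args) throws Exception {
            ByteBuffer gcManaged = allocateGcManaged();   // reclaimed whenever the GC gets to it
            long address = allocateManually();
            try {
                // ... use the manually allocated segment ...
            } finally {
                freeManually(address);                    // must not be used by any thread after this point
            }
            System.out.println("direct buffer capacity: " + gcManaged.capacity());
        }
    }

Option 1 buffers are reclaimed only through garbage collection, which is what ties their lifecycle to the -XX:MaxDirectMemorySize limit, while option 2 memory is invisible to that limit and must be released explicitly, only after every user of the segment is done.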
> > > > > > > > Concerning > > > > > > > > > > the > > > > > > > > > > > > > first > > > > > > > > > > > > > > > open > > > > > > > > > > > > > > > > > question about allocating memory segments, > I > > > was > > > > > > > > wondering > > > > > > > > > > > > whether > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > strictly necessary to do in the context of > > this > > > > > FLIP > > > > > > or > > > > > > > > > > whether > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > > > be done as a follow up? Without knowing all > > > > > details, > > > > > > I > > > > > > > > > would > > > > > > > > > > be > > > > > > > > > > > > > > > concerned > > > > > > > > > > > > > > > > > that we would widen the scope of this FLIP > > too > > > > much > > > > > > > > because > > > > > > > > > > we > > > > > > > > > > > > > would > > > > > > > > > > > > > > > have > > > > > > > > > > > > > > > > > to touch all the existing call sites of the > > > > > > > MemoryManager > > > > > > > > > > where > > > > > > > > > > > > we > > > > > > > > > > > > > > > > allocate > > > > > > > > > > > > > > > > > memory segments (this should mainly be > batch > > > > > > > operators). > > > > > > > > > The > > > > > > > > > > > > > addition > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > the memory reservation call to the > > > MemoryManager > > > > > > should > > > > > > > > not > > > > > > > > > > be > > > > > > > > > > > > > > affected > > > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > this and I would hope that this is the only > > > point > > > > > of > > > > > > > > > > > interaction > > > > > > > > > > > > a > > > > > > > > > > > > > > > > > streaming job would have with the > > > MemoryManager. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Concerning the second open question about > > > setting > > > > > or > > > > > > > not > > > > > > > > > > > setting > > > > > > > > > > > > a > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > > direct memory limit, I would also be > > interested > > > > why > > > > > > > Yang > > > > > > > > > Wang > > > > > > > > > > > > > thinks > > > > > > > > > > > > > > > > > leaving it open would be best. My concern > > about > > > > > this > > > > > > > > would > > > > > > > > > be > > > > > > > > > > > > that > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > > > be in a similar situation as we are now > with > > > the > > > > > > > > > > > > > RocksDBStateBackend. > > > > > > > > > > > > > > > If > > > > > > > > > > > > > > > > > the different memory pools are not clearly > > > > > separated > > > > > > > and > > > > > > > > > can > > > > > > > > > > > > spill > > > > > > > > > > > > > > over > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > a different pool, then it is quite hard to > > > > > understand > > > > > > > > what > > > > > > > > > > > > exactly > > > > > > > > > > > > > > > > causes a > > > > > > > > > > > > > > > > > process to get killed for using too much > > > memory. > > > > > This > > > > > > > > could > > > > > > > > > > > then > > > > > > > > > > > > > > easily > > > > > > > > > > > > > > > > > lead to a similar situation what we have > with > > > the > > > > > > > > > > cutoff-ratio. 
> > > > > > > > > > > > So > > > > > > > > > > > > > > why > > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > setting a sane default value for max direct > > > > memory > > > > > > and > > > > > > > > > giving > > > > > > > > > > > the > > > > > > > > > > > > > > user > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > option to increase it if he runs into an > OOM. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > @Xintong, how would alternative 2 lead to > > lower > > > > > > memory > > > > > > > > > > > > utilization > > > > > > > > > > > > > > than > > > > > > > > > > > > > > > > > alternative 3 where we set the direct > memory > > > to a > > > > > > > higher > > > > > > > > > > value? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM Xintong > Song < > > > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, Yang. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > > > > > > > > > > > > > > > > > > I think setting a very large max direct > > > memory > > > > > size > > > > > > > > > > > definitely > > > > > > > > > > > > > has > > > > > > > > > > > > > > > some > > > > > > > > > > > > > > > > > > good sides. E.g., we do not worry about > > > direct > > > > > OOM, > > > > > > > and > > > > > > > > > we > > > > > > > > > > > > don't > > > > > > > > > > > > > > even > > > > > > > > > > > > > > > > > need > > > > > > > > > > > > > > > > > > to allocate managed / network memory with > > > > > > > > > > Unsafe.allocate() . > > > > > > > > > > > > > > > > > > However, there are also some down sides > of > > > > doing > > > > > > > this. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - One thing I can think of is that if > a > > > task > > > > > > > > executor > > > > > > > > > > > > > container > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > killed due to overusing memory, it > could > > > be > > > > > hard > > > > > > > for > > > > > > > > > use > > > > > > > > > > > to > > > > > > > > > > > > > know > > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > > part > > > > > > > > > > > > > > > > > > of the memory is overused. > > > > > > > > > > > > > > > > > > - Another down side is that the JVM > > never > > > > > > trigger > > > > > > > GC > > > > > > > > > due > > > > > > > > > > > to > > > > > > > > > > > > > > > reaching > > > > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > > > direct memory limit, because the limit > > is > > > > too > > > > > > high > > > > > > > > to > > > > > > > > > be > > > > > > > > > > > > > > reached. > > > > > > > > > > > > > > > > That > > > > > > > > > > > > > > > > > > means we kind of relay on heap memory > to > > > > > trigger > > > > > > > GC > > > > > > > > > and > > > > > > > > > > > > > release > > > > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > > > memory. 
That could be a problem in > cases > > > > where > > > > > > we > > > > > > > > have > > > > > > > > > > > more > > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > usage but not enough heap activity to > > > > trigger > > > > > > the > > > > > > > > GC. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe you can share your reasons for > > > preferring > > > > > > > > setting a > > > > > > > > > > > very > > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > > value, > > > > > > > > > > > > > > > > > > if there are anything else I overlooked. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > > > > > > > > > > > > > > > > > > If there is any conflict between multiple > > > > > > > configuration > > > > > > > > > > that > > > > > > > > > > > > user > > > > > > > > > > > > > > > > > > explicitly specified, I think we should > > throw > > > > an > > > > > > > error. > > > > > > > > > > > > > > > > > > I think doing checking on the client side > > is > > > a > > > > > good > > > > > > > > idea, > > > > > > > > > > so > > > > > > > > > > > > that > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > Yarn / > > > > > > > > > > > > > > > > > > K8s we can discover the problem before > > > > submitting > > > > > > the > > > > > > > > > Flink > > > > > > > > > > > > > > cluster, > > > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > > is always a good thing. > > > > > > > > > > > > > > > > > > But we can not only rely on the client > side > > > > > > checking, > > > > > > > > > > because > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > standalone cluster TaskManagers on > > different > > > > > > machines > > > > > > > > may > > > > > > > > > > > have > > > > > > > > > > > > > > > > different > > > > > > > > > > > > > > > > > > configurations and the client does see > > that. > > > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 PM Yang Wang > < > > > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed proposal. > After > > > all > > > > > the > > > > > > > > memory > > > > > > > > > > > > > > > configuration > > > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > > > introduced, it will be more powerful to > > > > control > > > > > > the > > > > > > > > > flink > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > usage. I > > > > > > > > > > > > > > > > > > > just have few questions about it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We do not differentiate user direct > > memory > > > > and > > > > > > > native > > > > > > > > > > > memory. 
> > > > > > > > > > > > > > They > > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > > > > > included in task off-heap memory. > Right? > > > So i > > > > > > don’t > > > > > > > > > think > > > > > > > > > > > we > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > > set > > > > > > > > > > > > > > > > > > > the -XX:MaxDirectMemorySize properly. I > > > > prefer > > > > > > > > leaving > > > > > > > > > > it a > > > > > > > > > > > > > very > > > > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > > > > value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the sum of and fine-grained > > > memory(network > > > > > > > memory, > > > > > > > > > > > managed > > > > > > > > > > > > > > > memory, > > > > > > > > > > > > > > > > > > etc.) > > > > > > > > > > > > > > > > > > > is larger than total process memory, > how > > do > > > > we > > > > > > deal > > > > > > > > > with > > > > > > > > > > > this > > > > > > > > > > > > > > > > > situation? > > > > > > > > > > > > > > > > > > Do > > > > > > > > > > > > > > > > > > > we need to check the memory > configuration > > > in > > > > > > > client? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]> > > > > > > 于2019年8月7日周三 > > > > > > > > > > > 下午10:14写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion > > > thread > > > > on > > > > > > > > > "FLIP-49: > > > > > > > > > > > > > Unified > > > > > > > > > > > > > > > > > Memory > > > > > > > > > > > > > > > > > > > > Configuration for TaskExecutors"[1], > > > where > > > > we > > > > > > > > > describe > > > > > > > > > > > how > > > > > > > > > > > > to > > > > > > > > > > > > > > > > improve > > > > > > > > > > > > > > > > > > > > TaskExecutor memory configurations. > The > > > > FLIP > > > > > > > > document > > > > > > > > > > is > > > > > > > > > > > > > mostly > > > > > > > > > > > > > > > > based > > > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > > > > early design "Memory Management and > > > > > > Configuration > > > > > > > > > > > > > Reloaded"[2] > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > > > Stephan, > > > > > > > > > > > > > > > > > > > > with updates from follow-up > discussions > > > > both > > > > > > > online > > > > > > > > > and > > > > > > > > > > > > > > offline. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses several > > shortcomings > > > of > > > > > > > current > > > > > > > > > > > (Flink > > > > > > > > > > > > > 1.9) > > > > > > > > > > > > > > > > > > > > TaskExecutor memory configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Different configuration for > > > Streaming > > > > > and > > > > > > > > Batch. > > > > > > > > > > > > > > > > > > > > - Complex and difficult > > configuration > > > of > > > > > > > RocksDB > > > > > > > > > in > > > > > > > > > > > > > > Streaming. 
> > > > > > > > > > > > > > > > > > > > - Complicated, uncertain and hard > to > > > > > > > understand. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the problems can > > be > > > > > > > summarized > > > > > > > > > as > > > > > > > > > > > > > follows. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Extend memory manager to also > > > account > > > > > for > > > > > > > > memory > > > > > > > > > > > usage > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > state > > > > > > > > > > > > > > > > > > > > backends. > > > > > > > > > > > > > > > > > > > > - Modify how TaskExecutor memory > is > > > > > > > partitioned > > > > > > > > > > > > accounted > > > > > > > > > > > > > > > > > individual > > > > > > > > > > > > > > > > > > > > memory reservations and pools. > > > > > > > > > > > > > > > > > > > > - Simplify memory configuration > > > options > > > > > and > > > > > > > > > > > calculations > > > > > > > > > > > > > > > logics. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP > > wiki > > > > > > > document > > > > > > > > > [1]. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > (Please note that the early design > doc > > > [2] > > > > is > > > > > > out > > > > > > > > of > > > > > > > > > > > sync, > > > > > > > > > > > > > and > > > > > > > > > > > > > > it > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > > > appreciated to have the discussion in > > > this > > > > > > > mailing > > > > > > > > > list > > > > > > > > > > > > > > thread.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedbacks. 
> On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:
>
> Thanks for sharing your opinion Till.
>
> I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.
>
> Hi Yang,
>
> Regarding your concern, what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which are exactly what alternative 2 suggests to set the JVM max direct memory to.
>
> Thank you~
> Xintong Song
>
> On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:
>
> Thanks for the clarification Xintong. I understand the two alternatives now.
>
> I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.
>
> Cheers,
> Till
>
> On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:
>
> Let me explain this with a concrete example Till.
>
> Let's say we have the following scenario.
>
> Total Process Memory: 1GB
> JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
> Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB
>
> For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
> For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then
>
> - Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this reduces the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
> - For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.
>
> Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
>
> Thank you~
> Xintong Song
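A minimal sketch of the arithmetic behind the two alternatives in the example above. The class and method names, as well as the 100MB/100MB split of the 200MB budget, are assumptions made for illustration, not part of the proposal:

    // Illustrative only: how -XX:MaxDirectMemorySize could be derived from the configured budgets.
    public final class MaxDirectMemoryExample {

        private static final long MB = 1024 * 1024;

        // Alternative 2: cap direct allocations at exactly task off-heap + JVM overhead.
        // Alternative 3: use a value far above any realistic usage (effectively unlimited).
        static long maxDirectMemorySize(long taskOffHeapBytes, long jvmOverheadBytes, boolean capToBudget) {
            return capToBudget ? taskOffHeapBytes + jvmOverheadBytes : 1024L * 1024 * MB; // 1 TB
        }

        public static void main(String[] args) {
            long taskOffHeap = 100 * MB; // assumed split of the 200MB from the example
            long jvmOverhead = 100 * MB;
            System.out.println("-XX:MaxDirectMemorySize=" + maxDirectMemorySize(taskOffHeap, jvmOverhead, true));
            System.out.println("-XX:MaxDirectMemorySize=" + maxDirectMemorySize(taskOffHeap, jvmOverhead, false));
        }
    }

Either way the 800MB of other memory is untouched; the difference is only whether direct allocations beyond 200MB fail fast (alternative 2) or silently eat into the container budget (alternative 3).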
> On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:
>
> I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization Xintong.
>
> - Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
> - Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.
>
> How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.
>
> Cheers,
> Till
>
> On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:
>
> Hi Xintong, Till,
>
> Native and Direct Memory
>
> My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.
>
> Memory Calculation
>
> I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and failing in the Flink master.
>
> Best,
> Yang
>
> On Tue, Aug 13, 2019 at 10:07 PM Xintong Song <[hidden email]> wrote:
>
> Thanks for replying, Till.
>
> About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.
>
> About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking over-use of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is a risk that the overall container memory usage exceeds the budget.
>
> Thank you~
> Xintong Song
> On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:
>
> Thanks for proposing this FLIP Xintong.
>
> All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all the details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
>
> Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?
>
> @Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?
>
> Cheers,
> Till
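For readers unfamiliar with the memory reservation call mentioned above, the following is a rough sketch of what such an interface could look like. It is a hypothetical illustration, not the actual Flink MemoryManager API:

    // Hypothetical sketch of a reservation-style interaction with the memory manager,
    // in contrast to allocating MemorySegments; names are invented for illustration.
    public interface ReservingMemoryManager {

        /** Reserves the given number of bytes for the owner, failing if the budget is exhausted. */
        void reserveMemory(Object owner, long bytes) throws MemoryReservationException;

        /** Returns a previously reserved amount of memory to the pool. */
        void releaseMemory(Object owner, long bytes);
    }

    class MemoryReservationException extends Exception {
        MemoryReservationException(String message) {
            super(message);
        }
    }

A streaming consumer such as the RocksDB state backend would then reserve its budget once when it is created and release it on disposal, without ever touching MemorySegments.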
> On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:
>
> Thanks for the feedback, Yang.
>
> Regarding your comments:
>
> *Native and Direct Memory*
> I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.
>
> - One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
> - Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
>
> Maybe you can share your reasons for preferring to set a very large value, if there is anything else I overlooked.
>
> *Memory Calculation*
> If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error.
> I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client-side checking, because for standalone clusters, TaskManagers on different machines may have different configurations and the client does not see them.
> What do you think?
>
> Thank you~
> Xintong Song
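The kind of consistency check discussed here could be as simple as the sketch below, run on the client for Yarn / K8s and again in the task executor at startup for standalone setups. The option list and the helper class are illustrative, not actual Flink configuration keys:

    // Rough sketch of a sanity check over explicitly configured memory pools.
    final class MemoryConfigurationChecker {

        static void checkPoolsFitIntoTotal(long totalProcessMemoryBytes,
                                           long heapBytes,
                                           long managedBytes,
                                           long networkBytes,
                                           long metaspaceBytes,
                                           long taskOffHeapBytes,
                                           long jvmOverheadBytes) {
            long sum = heapBytes + managedBytes + networkBytes
                + metaspaceBytes + taskOffHeapBytes + jvmOverheadBytes;
            if (sum > totalProcessMemoryBytes) {
                throw new IllegalArgumentException("Sum of the explicitly configured memory pools ("
                    + sum + " bytes) exceeds the configured total process memory ("
                    + totalProcessMemoryBytes + " bytes).");
            }
        }
    }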
> On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:
>
> Hi Xintong,
>
> Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.
>
> - Native and Direct Memory
> We do not differentiate user direct memory and native memory. They are all included in task off-heap memory, right? So I don't think we can set -XX:MaxDirectMemorySize properly. I prefer leaving it at a very large value.
>
> - Memory Calculation
> If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?
>
> On Wed, Aug 7, 2019 at 10:14 PM Xintong Song <[hidden email]> wrote:
>
> Hi everyone,
>
> We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.
>
> This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.
>
> - Different configuration for Streaming and Batch.
> - Complex and difficult configuration of RocksDB in Streaming.
> - Complicated, uncertain and hard to understand.
>
> Key changes to solve the problems can be summarized as follows.
>
> - Extend the memory manager to also account for memory usage by state backends.
> - Modify how TaskExecutor memory is partitioned and accounted across individual memory reservations and pools.
> - Simplify memory configuration options and calculation logic.
>
> Please find more details in the FLIP wiki document [1].
>
> (Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)
>
> Looking forward to your feedback.
>
> Thank you~
> Xintong Song
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> [2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
When computing the values in the JVM process after it has started, how would you deal with values like max direct memory, metaspace size, native memory reservation (reduced heap size), etc.? These are all parameters to the JVM process that need to be supplied at process startup.
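To make the constraint concrete: whatever component derives the memory budgets has to translate them into JVM flags before the process is launched, roughly as in the sketch below. The class and method names are invented for illustration, and the byte values just reuse the 1GB example from earlier in the thread with assumed sub-budgets:

    import java.util.Arrays;
    import java.util.List;

    // Illustrative only: derived budgets must become startup flags, they cannot be applied
    // from inside the already running JVM.
    final class TaskExecutorJvmArgsExample {

        static List<String> jvmArgs(long heapBytes, long directBytes, long metaspaceBytes) {
            return Arrays.asList(
                "-Xms" + heapBytes,
                "-Xmx" + heapBytes,                        // heap already reduced by the off-heap reservations
                "-XX:MaxDirectMemorySize=" + directBytes,  // task off-heap + JVM overhead (alternative 2)
                "-XX:MaxMetaspaceSize=" + metaspaceBytes);
        }

        public static void main(String[] args) {
            System.out.println(String.join(" ", jvmArgs(500L << 20, 200L << 20, 100L << 20)));
        }
    }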
On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:
> Thanks for the clarification. I have some more comments:
>
> - I would actually split the logic to compute the process memory requirements and the storing of the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.
>
> - Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.
>
> The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.
>
> - Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on if I'm not mistaken).
>
> Cheers,
> Till
>
> On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:
>
> Just to add my 2 cents.
>
> Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for every taskmanager; a common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.
>
> Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users do not have to export FLINK_CONF_DIR to update a few config options.
>
> Best,
> Yang
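A sketch of what such environment-variable overriding during configuration loading could look like. The prefix convention and the key mapping are assumptions for illustration; this is not the actual GlobalConfiguration implementation:

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative only: merge env variables with a well-known prefix into the loaded configuration.
    final class EnvOverrideExample {

        static Map<String, String> applyEnvOverrides(Map<String, String> loadedConfig, Map<String, String> env) {
            Map<String, String> result = new HashMap<>(loadedConfig);
            for (Map.Entry<String, String> entry : env.entrySet()) {
                String name = entry.getKey().toLowerCase();
                if (name.startsWith("flink_")) {
                    // Turn the env variable name into a dotted config key and let it override the file value.
                    result.put(name.substring("flink_".length()).replace('_', '.'), entry.getValue());
                }
            }
            return result;
        }
    }

One detail such a scheme has to settle is how to represent config keys that themselves contain underscores, since the underscore-to-dot mapping alone cannot distinguish them.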
> > > "flink_taskmanager_memory_size=2g" would become > "taskmanager.memory.size: > > > 2g". > > > > > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song <[hidden email]> > > > wrote: > > > > > > > Thanks for the comments, Till. > > > > > > > > I've also seen your comments on the wiki page, but let's keep the > > > > discussion here. > > > > > > > > - Regarding 'TaskExecutorSpecifics', how do you think about naming it > > > > 'TaskExecutorResourceSpecifics'. > > > > - Regarding passing memory configurations into task executors, I'm in > > > favor > > > > of do it via environment variables rather than configurations, with > the > > > > following two reasons. > > > > - It is easier to keep the memory options once calculate not to be > > > > changed with environment variables rather than configurations. > > > > - I'm not sure whether we should write the configuration in startup > > > > scripts. Writing changes into the configuration files when running > the > > > > startup scripts does not sounds right to me. Or we could make a copy > of > > > > configuration files per flink cluster, and make the task executor to > > load > > > > from the copy, and clean up the copy after the cluster is shutdown, > > which > > > > is complicated. (I think this is also what Stephan means in his > comment > > > on > > > > the wiki page?) > > > > - Regarding reserving memory, I think this change should be included > in > > > > this FLIP. I think a big part of motivations of this FLIP is to unify > > > > memory configuration for streaming / batch and make it easy for > > > configuring > > > > rocksdb memory. If we don't support memory reservation, then > streaming > > > jobs > > > > cannot use managed memory (neither on-heap or off-heap), which makes > > this > > > > FLIP incomplete. > > > > - Regarding network memory, I think you are right. I think we > probably > > > > don't need to change network stack from using direct memory to using > > > unsafe > > > > native memory. Network memory size is deterministic, cannot be > reserved > > > as > > > > managed memory does, and cannot be overused. I think it also works if > > we > > > > simply keep using direct memory for network and include it in jvm max > > > > direct memory size. > > > > > > > > Thank you~ > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann <[hidden email]> > > > > wrote: > > > > > > > > > Hi Xintong, > > > > > > > > > > thanks for addressing the comments and adding a more detailed > > > > > implementation plan. I have a couple of comments concerning the > > > > > implementation plan: > > > > > > > > > > - The name `TaskExecutorSpecifics` is not really descriptive. > > Choosing > > > a > > > > > different name could help here. > > > > > - I'm not sure whether I would pass the memory configuration to the > > > > > TaskExecutor via environment variables. I think it would be better > to > > > > write > > > > > it into the configuration one uses to start the TM process. > > > > > - If possible, I would exclude the memory reservation from this > FLIP > > > and > > > > > add this as part of a dedicated FLIP. > > > > > - If possible, then I would exclude changes to the network stack > from > > > > this > > > > > FLIP. Maybe we can simply say that the direct memory needed by the > > > > network > > > > > stack is the framework direct memory requirement. Changing how the > > > memory > > > > > is allocated can happen in a second step. 
This would keep the scope > > of > > > > this > > > > > FLIP smaller. > > > > > > > > > > Cheers, > > > > > Till > > > > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < > [hidden email]> > > > > > wrote: > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > I just updated the FLIP document on wiki [1], with the following > > > > changes. > > > > > > > > > > > > - Removed open question regarding MemorySegment allocation. As > > > > > > discussed, we exclude this topic from the scope of this FLIP. > > > > > > - Updated content about JVM direct memory parameter according > to > > > > > recent > > > > > > discussions, and moved the other options to "Rejected > > > Alternatives" > > > > > for > > > > > > the > > > > > > moment. > > > > > > - Added implementation steps. > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> > > > wrote: > > > > > > > > > > > > > @Xintong: Concerning "wait for memory users before task dispose > > and > > > > > > memory > > > > > > > release": I agree, that's how it should be. Let's try it out. > > > > > > > > > > > > > > @Xintong @Jingsong: Concerning " JVM does not wait for GC when > > > > > allocating > > > > > > > direct memory buffer": There seems to be pretty elaborate logic > > to > > > > free > > > > > > > buffers when allocating new ones. See > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > > > > > > > > > > > > > > @Till: Maybe. If we assume that the JVM default works (like > going > > > > with > > > > > > > option 2 and not setting "-XX:MaxDirectMemorySize" at all), > then > > I > > > > > think > > > > > > it > > > > > > > should be okay to set "-XX:MaxDirectMemorySize" to > > > > > > > "off_heap_managed_memory + direct_memory" even if we use > RocksDB. > > > > That > > > > > > is a > > > > > > > big if, though, I honestly have no idea :D Would be good to > > > > understand > > > > > > > this, though, because this would affect option (2) and option > > > (1.2). > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song < > > > [hidden email]> > > > > > > > wrote: > > > > > > > > > > > > > > > Thanks for the inputs, Jingsong. > > > > > > > > > > > > > > > > Let me try to summarize your points. Please correct me if I'm > > > > wrong. > > > > > > > > > > > > > > > > - Memory consumers should always avoid returning memory > > > segments > > > > > to > > > > > > > > memory manager while there are still un-cleaned > structures / > > > > > threads > > > > > > > > that > > > > > > > > may use the memory. Otherwise, it would cause serious > > problems > > > > by > > > > > > > having > > > > > > > > multiple consumers trying to use the same memory segment. > > > > > > > > - JVM does not wait for GC when allocating direct memory > > > buffer. > > > > > > > > Therefore even we set proper max direct memory size limit, > > we > > > > may > > > > > > > still > > > > > > > > encounter direct memory oom if the GC cleaning memory > slower > > > > than > > > > > > the > > > > > > > > direct memory allocation. > > > > > > > > > > > > > > > > Am I understanding this correctly? 
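(For illustration of the two allocation paths being compared in this part of the thread, a minimal, self-contained sketch. This is not Flink's MemorySegment code; "Unsafe.allocate()" from the discussion is assumed to correspond to Unsafe#allocateMemory here:)

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;
    import sun.misc.Unsafe;

    public class AllocationPathsSketch {
        public static void main(String[] args) throws Exception {
            // Options 1.x: counted against -XX:MaxDirectMemorySize and only freed
            // once GC runs the buffer's cleaner, hence the OOM risk when
            // allocation outpaces collection.
            ByteBuffer direct = ByteBuffer.allocateDirect(16 * 1024 * 1024);
            direct.put(0, (byte) 1);

            // Option 2: plain native memory, not counted against the direct
            // limit, must be released manually.
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            Unsafe unsafe = (Unsafe) field.get(null);
            long address = unsafe.allocateMemory(16 * 1024 * 1024);
            try {
                unsafe.putByte(address, (byte) 1);
            } finally {
                // Using 'address' after this free would be the segfault scenario
                // discussed above, which is why release must only happen once all
                // consumers (including spill/sort threads) are done with it.
                unsafe.freeMemory(address);
            }
        }
    }
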
> > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee < > > > > [hidden email] > > > > > > > > .invalid> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi stephan: > > > > > > > > > > > > > > > > > > About option 2: > > > > > > > > > > > > > > > > > > if additional threads not cleanly shut down before we can > > exit > > > > the > > > > > > > task: > > > > > > > > > In the current case of memory reuse, it has freed up the > > memory > > > > it > > > > > > > > > uses. If this memory is used by other tasks and > asynchronous > > > > > threads > > > > > > > > > of exited task may still be writing, there will be > > concurrent > > > > > > security > > > > > > > > > problems, and even lead to errors in user computing > results. > > > > > > > > > > > > > > > > > > So I think this is a serious and intolerable bug, No matter > > > what > > > > > the > > > > > > > > > option is, it should be avoided. > > > > > > > > > > > > > > > > > > About direct memory cleaned by GC: > > > > > > > > > I don't think it is a good idea, I've encountered so many > > > > > situations > > > > > > > > > that it's too late for GC to cause DirectMemory OOM. > Release > > > and > > > > > > > > > allocate DirectMemory depend on the type of user job, > which > > is > > > > > > > > > often beyond our control. > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Jingsong Lee > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------ > > > > > > > > > From:Stephan Ewen <[hidden email]> > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 > > > > > > > > > To:dev <[hidden email]> > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified Memory Configuration > > for > > > > > > > > > TaskExecutors > > > > > > > > > > > > > > > > > > My main concern with option 2 (manually release memory) is > > that > > > > > > > segfaults > > > > > > > > > in the JVM send off all sorts of alarms on user ends. So we > > > need > > > > to > > > > > > > > > guarantee that this never happens. > > > > > > > > > > > > > > > > > > The trickyness is in tasks that uses data structures / > > > algorithms > > > > > > with > > > > > > > > > additional threads, like hash table spill/read and sorting > > > > threads. > > > > > > We > > > > > > > > need > > > > > > > > > to ensure that these cleanly shut down before we can exit > the > > > > task. > > > > > > > > > I am not sure that we have that guaranteed already, that's > > why > > > > > option > > > > > > > 1.1 > > > > > > > > > seemed simpler to me. > > > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM Xintong Song < > > > > > [hidden email]> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Thanks for the comments, Stephan. Summarized in this way > > > really > > > > > > makes > > > > > > > > > > things easier to understand. > > > > > > > > > > > > > > > > > > > > I'm in favor of option 2, at least for the moment. I > think > > it > > > > is > > > > > > not > > > > > > > > that > > > > > > > > > > difficult to keep it segfault safe for memory manager, as > > > long > > > > as > > > > > > we > > > > > > > > > always > > > > > > > > > > de-allocate the memory segment when it is released from > the > > > > > memory > > > > > > > > > > consumers. 
Only if the memory consumer continue using the > > > > buffer > > > > > of > > > > > > > > > memory > > > > > > > > > > segment after releasing it, in which case we do want the > > job > > > to > > > > > > fail > > > > > > > so > > > > > > > > > we > > > > > > > > > > detect the memory leak early. > > > > > > > > > > > > > > > > > > > > For option 1.2, I don't think this is a good idea. Not > only > > > > > because > > > > > > > the > > > > > > > > > > assumption (regular GC is enough to clean direct buffers) > > may > > > > not > > > > > > > > always > > > > > > > > > be > > > > > > > > > > true, but also it makes harder for finding problems in > > cases > > > of > > > > > > > memory > > > > > > > > > > overuse. E.g., user configured some direct memory for the > > > user > > > > > > > > libraries. > > > > > > > > > > If the library actually use more direct memory then > > > configured, > > > > > > which > > > > > > > > > > cannot be cleaned by GC because they are still in use, > may > > > lead > > > > > to > > > > > > > > > overuse > > > > > > > > > > of the total container memory. In that case, if it didn't > > > touch > > > > > the > > > > > > > JVM > > > > > > > > > > default max direct memory limit, we cannot get a direct > > > memory > > > > > OOM > > > > > > > and > > > > > > > > it > > > > > > > > > > will become super hard to understand which part of the > > > > > > configuration > > > > > > > > need > > > > > > > > > > to be updated. > > > > > > > > > > > > > > > > > > > > For option 1.1, it has the similar problem as 1.2, if the > > > > > exceeded > > > > > > > > direct > > > > > > > > > > memory does not reach the max direct memory limit > specified > > > by > > > > > the > > > > > > > > > > dedicated parameter. I think it is slightly better than > > 1.2, > > > > only > > > > > > > > because > > > > > > > > > > we can tune the parameter. > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen < > > > [hidden email] > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > About the "-XX:MaxDirectMemorySize" discussion, maybe > let > > > me > > > > > > > > summarize > > > > > > > > > > it a > > > > > > > > > > > bit differently: > > > > > > > > > > > > > > > > > > > > > > We have the following two options: > > > > > > > > > > > > > > > > > > > > > > (1) We let MemorySegments be de-allocated by the GC. > That > > > > makes > > > > > > it > > > > > > > > > > segfault > > > > > > > > > > > safe. But then we need a way to trigger GC in case > > > > > de-allocation > > > > > > > and > > > > > > > > > > > re-allocation of a bunch of segments happens quickly, > > which > > > > is > > > > > > > often > > > > > > > > > the > > > > > > > > > > > case during batch scheduling or task restart. > > > > > > > > > > > - The "-XX:MaxDirectMemorySize" (option 1.1) is one > way > > > to > > > > do > > > > > > > this > > > > > > > > > > > - Another way could be to have a dedicated > bookkeeping > > in > > > > the > > > > > > > > > > > MemoryManager (option 1.2), so that this is a number > > > > > independent > > > > > > of > > > > > > > > the > > > > > > > > > > > "-XX:MaxDirectMemorySize" parameter. > > > > > > > > > > > > > > > > > > > > > > (2) We manually allocate and de-allocate the memory for > > the > > > > > > > > > > MemorySegments > > > > > > > > > > > (option 2). 
That way we need not worry about triggering > > GC > > > by > > > > > > some > > > > > > > > > > > threshold or bookkeeping, but it is harder to prevent > > > > > segfaults. > > > > > > We > > > > > > > > > need > > > > > > > > > > to > > > > > > > > > > > be very careful about when we release the memory > segments > > > > (only > > > > > > in > > > > > > > > the > > > > > > > > > > > cleanup phase of the main thread). > > > > > > > > > > > > > > > > > > > > > > If we go with option 1.1, we probably need to set > > > > > > > > > > > "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + > > > > > > > > direct_memory" > > > > > > > > > > and > > > > > > > > > > > have "direct_memory" as a separate reserved memory > pool. > > > > > Because > > > > > > if > > > > > > > > we > > > > > > > > > > just > > > > > > > > > > > set "-XX:MaxDirectMemorySize" to > > "off_heap_managed_memory + > > > > > > > > > > jvm_overhead", > > > > > > > > > > > then there will be times when that entire memory is > > > allocated > > > > > by > > > > > > > > direct > > > > > > > > > > > buffers and we have nothing left for the JVM overhead. > So > > > we > > > > > > either > > > > > > > > > need > > > > > > > > > > a > > > > > > > > > > > way to compensate for that (again some safety margin > > cutoff > > > > > > value) > > > > > > > or > > > > > > > > > we > > > > > > > > > > > will exceed container memory. > > > > > > > > > > > > > > > > > > > > > > If we go with option 1.2, we need to be aware that it > > takes > > > > > > > elaborate > > > > > > > > > > logic > > > > > > > > > > > to push recycling of direct buffers without always > > > > triggering a > > > > > > > full > > > > > > > > > GC. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > My first guess is that the options will be easiest to > do > > in > > > > the > > > > > > > > > following > > > > > > > > > > > order: > > > > > > > > > > > > > > > > > > > > > > - Option 1.1 with a dedicated direct_memory > parameter, > > as > > > > > > > discussed > > > > > > > > > > > above. We would need to find a way to set the > > direct_memory > > > > > > > parameter > > > > > > > > > by > > > > > > > > > > > default. We could start with 64 MB and see how it goes > in > > > > > > practice. > > > > > > > > One > > > > > > > > > > > danger I see is that setting this loo low can cause a > > bunch > > > > of > > > > > > > > > additional > > > > > > > > > > > GCs compared to before (we need to watch this > carefully). > > > > > > > > > > > > > > > > > > > > > > - Option 2. It is actually quite simple to implement, > > we > > > > > could > > > > > > > try > > > > > > > > > how > > > > > > > > > > > segfault safe we are at the moment. > > > > > > > > > > > > > > > > > > > > > > - Option 1.2: We would not touch the > > > > > "-XX:MaxDirectMemorySize" > > > > > > > > > > parameter > > > > > > > > > > > at all and assume that all the direct memory > allocations > > > that > > > > > the > > > > > > > JVM > > > > > > > > > and > > > > > > > > > > > Netty do are infrequent enough to be cleaned up fast > > enough > > > > > > through > > > > > > > > > > regular > > > > > > > > > > > GC. I am not sure if that is a valid assumption, > though. 
> > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > Stephan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong Song < > > > > > > > [hidden email]> > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Thanks for sharing your opinion Till. > > > > > > > > > > > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was wondering > > > whether > > > > > we > > > > > > > can > > > > > > > > > > avoid > > > > > > > > > > > > using Unsafe.allocate() for off-heap managed memory > and > > > > > network > > > > > > > > > memory > > > > > > > > > > > with > > > > > > > > > > > > alternative 3. But after giving it a second thought, > I > > > > think > > > > > > even > > > > > > > > for > > > > > > > > > > > > alternative 3 using direct memory for off-heap > managed > > > > memory > > > > > > > could > > > > > > > > > > cause > > > > > > > > > > > > problems. > > > > > > > > > > > > > > > > > > > > > > > > Hi Yang, > > > > > > > > > > > > > > > > > > > > > > > > Regarding your concern, I think what proposed in this > > > FLIP > > > > it > > > > > > to > > > > > > > > have > > > > > > > > > > > both > > > > > > > > > > > > off-heap managed memory and network memory allocated > > > > through > > > > > > > > > > > > Unsafe.allocate(), which means they are practically > > > native > > > > > > memory > > > > > > > > and > > > > > > > > > > not > > > > > > > > > > > > limited by JVM max direct memory. The only parts of > > > memory > > > > > > > limited > > > > > > > > by > > > > > > > > > > JVM > > > > > > > > > > > > max direct memory are task off-heap memory and JVM > > > > overhead, > > > > > > > which > > > > > > > > > are > > > > > > > > > > > > exactly alternative 2 suggests to set the JVM max > > direct > > > > > memory > > > > > > > to. > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann < > > > > > > > > [hidden email]> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I understand > > the > > > > two > > > > > > > > > > alternatives > > > > > > > > > > > > > now. > > > > > > > > > > > > > > > > > > > > > > > > > > I would be in favour of option 2 because it makes > > > things > > > > > > > > explicit. > > > > > > > > > If > > > > > > > > > > > we > > > > > > > > > > > > > don't limit the direct memory, I fear that we might > > end > > > > up > > > > > > in a > > > > > > > > > > similar > > > > > > > > > > > > > situation as we are currently in: The user might > see > > > that > > > > > her > > > > > > > > > process > > > > > > > > > > > > gets > > > > > > > > > > > > > killed by the OS and does not know why this is the > > > case. > > > > > > > > > > Consequently, > > > > > > > > > > > > she > > > > > > > > > > > > > tries to decrease the process memory size (similar > to > > > > > > > increasing > > > > > > > > > the > > > > > > > > > > > > cutoff > > > > > > > > > > > > > ratio) in order to accommodate for the extra direct > > > > memory. 
> > > > > > > Even > > > > > > > > > > worse, > > > > > > > > > > > > she > > > > > > > > > > > > > tries to decrease memory budgets which are not > fully > > > used > > > > > and > > > > > > > > hence > > > > > > > > > > > won't > > > > > > > > > > > > > change the overall memory consumption. > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM Xintong Song < > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Let me explain this with a concrete example Till. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Let's say we have the following scenario. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Total Process Memory: 1GB > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap Memory + JVM > > > > Overhead): > > > > > > > 200MB > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM Metaspace, > > > Off-Heap > > > > > > > Managed > > > > > > > > > > Memory > > > > > > > > > > > > and > > > > > > > > > > > > > > Network Memory): 800MB > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For alternative 2, we set -XX:MaxDirectMemorySize > > to > > > > > 200MB. > > > > > > > > > > > > > > For alternative 3, we set -XX:MaxDirectMemorySize > > to > > > a > > > > > very > > > > > > > > large > > > > > > > > > > > > value, > > > > > > > > > > > > > > let's say 1TB. > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the actual direct memory usage of Task > Off-Heap > > > > Memory > > > > > > and > > > > > > > > JVM > > > > > > > > > > > > > Overhead > > > > > > > > > > > > > > do not exceed 200MB, then alternative 2 and > > > > alternative 3 > > > > > > > > should > > > > > > > > > > have > > > > > > > > > > > > the > > > > > > > > > > > > > > same utility. Setting larger > > -XX:MaxDirectMemorySize > > > > will > > > > > > not > > > > > > > > > > reduce > > > > > > > > > > > > the > > > > > > > > > > > > > > sizes of the other memory pools. > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the actual direct memory usage of Task > Off-Heap > > > > Memory > > > > > > and > > > > > > > > JVM > > > > > > > > > > > > > > Overhead potentially exceed 200MB, then > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Alternative 2 suffers from frequent OOM. To > > > avoid > > > > > > that, > > > > > > > > the > > > > > > > > > > only > > > > > > > > > > > > > thing > > > > > > > > > > > > > > user can do is to modify the configuration and > > > > > increase > > > > > > > JVM > > > > > > > > > > Direct > > > > > > > > > > > > > > Memory > > > > > > > > > > > > > > (Task Off-Heap Memory + JVM Overhead). Let's > say > > > > that > > > > > > user > > > > > > > > > > > increases > > > > > > > > > > > > > JVM > > > > > > > > > > > > > > Direct Memory to 250MB, this will reduce the > > total > > > > > size > > > > > > of > > > > > > > > > other > > > > > > > > > > > > > memory > > > > > > > > > > > > > > pools to 750MB, given the total process memory > > > > remains > > > > > > > 1GB. > > > > > > > > > > > > > > - For alternative 3, there is no chance of > > direct > > > > OOM. 
> > > > > > > There > > > > > > > > > are > > > > > > > > > > > > > chances > > > > > > > > > > > > > > of exceeding the total process memory limit, > but > > > > given > > > > > > > that > > > > > > > > > the > > > > > > > > > > > > > process > > > > > > > > > > > > > > may > > > > > > > > > > > > > > not use up all the reserved native memory > > > (Off-Heap > > > > > > > Managed > > > > > > > > > > > Memory, > > > > > > > > > > > > > > Network > > > > > > > > > > > > > > Memory, JVM Metaspace), if the actual direct > > > memory > > > > > > usage > > > > > > > is > > > > > > > > > > > > slightly > > > > > > > > > > > > > > above > > > > > > > > > > > > > > yet very close to 200MB, user probably do not > > need > > > > to > > > > > > > change > > > > > > > > > the > > > > > > > > > > > > > > configurations. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Therefore, I think from the user's perspective, a > > > > > feasible > > > > > > > > > > > > configuration > > > > > > > > > > > > > > for alternative 2 may lead to lower resource > > > > utilization > > > > > > > > compared > > > > > > > > > > to > > > > > > > > > > > > > > alternative 3. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann < > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I guess you have to help me understand the > > > difference > > > > > > > between > > > > > > > > > > > > > > alternative 2 > > > > > > > > > > > > > > > and 3 wrt to memory under utilization Xintong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Alternative 2: set XX:MaxDirectMemorySize to > > Task > > > > > > > Off-Heap > > > > > > > > > > Memory > > > > > > > > > > > > and > > > > > > > > > > > > > > JVM > > > > > > > > > > > > > > > Overhead. Then there is the risk that this size > > is > > > > too > > > > > > low > > > > > > > > > > > resulting > > > > > > > > > > > > > in a > > > > > > > > > > > > > > > lot of garbage collection and potentially an > OOM. > > > > > > > > > > > > > > > - Alternative 3: set XX:MaxDirectMemorySize to > > > > > something > > > > > > > > larger > > > > > > > > > > > than > > > > > > > > > > > > > > > alternative 2. This would of course reduce the > > > sizes > > > > of > > > > > > the > > > > > > > > > other > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > types. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > How would alternative 2 now result in an under > > > > > > utilization > > > > > > > of > > > > > > > > > > > memory > > > > > > > > > > > > > > > compared to alternative 3? If alternative 3 > > > strictly > > > > > > sets a > > > > > > > > > > higher > > > > > > > > > > > > max > > > > > > > > > > > > > > > direct memory size and we use only little, > then I > > > > would > > > > > > > > expect > > > > > > > > > > that > > > > > > > > > > > > > > > alternative 3 results in memory under > > utilization. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM Yang Wang < > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong,till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > My point is setting a very large max direct > > > memory > > > > > size > > > > > > > > when > > > > > > > > > we > > > > > > > > > > > do > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > differentiate direct and native memory. If > the > > > > direct > > > > > > > > > > > > > memory,including > > > > > > > > > > > > > > > user > > > > > > > > > > > > > > > > direct memory and framework direct > memory,could > > > be > > > > > > > > calculated > > > > > > > > > > > > > > > > correctly,then > > > > > > > > > > > > > > > > i am in favor of setting direct memory with > > fixed > > > > > > value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I agree with xintong. For Yarn and k8s,we > need > > to > > > > > check > > > > > > > the > > > > > > > > > > > memory > > > > > > > > > > > > > > > > configurations in client to avoid submitting > > > > > > successfully > > > > > > > > and > > > > > > > > > > > > failing > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > > the flink master. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yang > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email] > > >于2019年8月13日 > > > > > > > 周二22:07写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > About MemorySegment, I think you are right > > that > > > > we > > > > > > > should > > > > > > > > > not > > > > > > > > > > > > > include > > > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > issue in the scope of this FLIP. This FLIP > > > should > > > > > > > > > concentrate > > > > > > > > > > > on > > > > > > > > > > > > > how > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > configure memory pools for TaskExecutors, > > with > > > > > > minimum > > > > > > > > > > > > involvement > > > > > > > > > > > > > on > > > > > > > > > > > > > > > how > > > > > > > > > > > > > > > > > memory consumers use it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > About direct memory, I think alternative 3 > > may > > > > not > > > > > > > having > > > > > > > > > the > > > > > > > > > > > > same > > > > > > > > > > > > > > over > > > > > > > > > > > > > > > > > reservation issue that alternative 2 does, > > but > > > at > > > > > the > > > > > > > > cost > > > > > > > > > of > > > > > > > > > > > > risk > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > over > > > > > > > > > > > > > > > > > using memory at the container level, which > is > > > not > > > > > > good. 
> > > > > > > > My > > > > > > > > > > > point > > > > > > > > > > > > is > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and "JVM > > Overhead" > > > > are > > > > > > not > > > > > > > > easy > > > > > > > > > > to > > > > > > > > > > > > > > config. > > > > > > > > > > > > > > > > For > > > > > > > > > > > > > > > > > alternative 2, users might configure them > > > higher > > > > > than > > > > > > > > what > > > > > > > > > > > > actually > > > > > > > > > > > > > > > > needed, > > > > > > > > > > > > > > > > > just to avoid getting a direct OOM. For > > > > alternative > > > > > > 3, > > > > > > > > > users > > > > > > > > > > do > > > > > > > > > > > > not > > > > > > > > > > > > > > get > > > > > > > > > > > > > > > > > direct OOM, so they may not config the two > > > > options > > > > > > > > > > aggressively > > > > > > > > > > > > > high. > > > > > > > > > > > > > > > But > > > > > > > > > > > > > > > > > the consequences are risks of overall > > container > > > > > > memory > > > > > > > > > usage > > > > > > > > > > > > > exceeds > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > budget. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM Till > > Rohrmann < > > > > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP Xintong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > All in all I think it already looks quite > > > good. > > > > > > > > > Concerning > > > > > > > > > > > the > > > > > > > > > > > > > > first > > > > > > > > > > > > > > > > open > > > > > > > > > > > > > > > > > > question about allocating memory > segments, > > I > > > > was > > > > > > > > > wondering > > > > > > > > > > > > > whether > > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > strictly necessary to do in the context > of > > > this > > > > > > FLIP > > > > > > > or > > > > > > > > > > > whether > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > > > > be done as a follow up? Without knowing > all > > > > > > details, > > > > > > > I > > > > > > > > > > would > > > > > > > > > > > be > > > > > > > > > > > > > > > > concerned > > > > > > > > > > > > > > > > > > that we would widen the scope of this > FLIP > > > too > > > > > much > > > > > > > > > because > > > > > > > > > > > we > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > > have > > > > > > > > > > > > > > > > > > to touch all the existing call sites of > the > > > > > > > > MemoryManager > > > > > > > > > > > where > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > > allocate > > > > > > > > > > > > > > > > > > memory segments (this should mainly be > > batch > > > > > > > > operators). 
> > > > > > > > > > The > > > > > > > > > > > > > > addition > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > > the memory reservation call to the > > > > MemoryManager > > > > > > > should > > > > > > > > > not > > > > > > > > > > > be > > > > > > > > > > > > > > > affected > > > > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > > this and I would hope that this is the > only > > > > point > > > > > > of > > > > > > > > > > > > interaction > > > > > > > > > > > > > a > > > > > > > > > > > > > > > > > > streaming job would have with the > > > > MemoryManager. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Concerning the second open question about > > > > setting > > > > > > or > > > > > > > > not > > > > > > > > > > > > setting > > > > > > > > > > > > > a > > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > > > direct memory limit, I would also be > > > interested > > > > > why > > > > > > > > Yang > > > > > > > > > > Wang > > > > > > > > > > > > > > thinks > > > > > > > > > > > > > > > > > > leaving it open would be best. My concern > > > about > > > > > > this > > > > > > > > > would > > > > > > > > > > be > > > > > > > > > > > > > that > > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > > > > be in a similar situation as we are now > > with > > > > the > > > > > > > > > > > > > > RocksDBStateBackend. > > > > > > > > > > > > > > > > If > > > > > > > > > > > > > > > > > > the different memory pools are not > clearly > > > > > > separated > > > > > > > > and > > > > > > > > > > can > > > > > > > > > > > > > spill > > > > > > > > > > > > > > > over > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > a different pool, then it is quite hard > to > > > > > > understand > > > > > > > > > what > > > > > > > > > > > > > exactly > > > > > > > > > > > > > > > > > causes a > > > > > > > > > > > > > > > > > > process to get killed for using too much > > > > memory. > > > > > > This > > > > > > > > > could > > > > > > > > > > > > then > > > > > > > > > > > > > > > easily > > > > > > > > > > > > > > > > > > lead to a similar situation what we have > > with > > > > the > > > > > > > > > > > cutoff-ratio. > > > > > > > > > > > > > So > > > > > > > > > > > > > > > why > > > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > > setting a sane default value for max > direct > > > > > memory > > > > > > > and > > > > > > > > > > giving > > > > > > > > > > > > the > > > > > > > > > > > > > > > user > > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > > option to increase it if he runs into an > > OOM. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > @Xintong, how would alternative 2 lead to > > > lower > > > > > > > memory > > > > > > > > > > > > > utilization > > > > > > > > > > > > > > > than > > > > > > > > > > > > > > > > > > alternative 3 where we set the direct > > memory > > > > to a > > > > > > > > higher > > > > > > > > > > > value? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM Xintong > > Song < > > > > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, Yang. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > > > > > > > > > > > > > > > > > > > I think setting a very large max direct > > > > memory > > > > > > size > > > > > > > > > > > > definitely > > > > > > > > > > > > > > has > > > > > > > > > > > > > > > > some > > > > > > > > > > > > > > > > > > > good sides. E.g., we do not worry about > > > > direct > > > > > > OOM, > > > > > > > > and > > > > > > > > > > we > > > > > > > > > > > > > don't > > > > > > > > > > > > > > > even > > > > > > > > > > > > > > > > > > need > > > > > > > > > > > > > > > > > > > to allocate managed / network memory > with > > > > > > > > > > > Unsafe.allocate() . > > > > > > > > > > > > > > > > > > > However, there are also some down sides > > of > > > > > doing > > > > > > > > this. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - One thing I can think of is that > if > > a > > > > task > > > > > > > > > executor > > > > > > > > > > > > > > container > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > > killed due to overusing memory, it > > could > > > > be > > > > > > hard > > > > > > > > for > > > > > > > > > > use > > > > > > > > > > > > to > > > > > > > > > > > > > > know > > > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > > > part > > > > > > > > > > > > > > > > > > > of the memory is overused. > > > > > > > > > > > > > > > > > > > - Another down side is that the JVM > > > never > > > > > > > trigger > > > > > > > > GC > > > > > > > > > > due > > > > > > > > > > > > to > > > > > > > > > > > > > > > > reaching > > > > > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > > > > direct memory limit, because the > limit > > > is > > > > > too > > > > > > > high > > > > > > > > > to > > > > > > > > > > be > > > > > > > > > > > > > > > reached. > > > > > > > > > > > > > > > > > That > > > > > > > > > > > > > > > > > > > means we kind of relay on heap > memory > > to > > > > > > trigger > > > > > > > > GC > > > > > > > > > > and > > > > > > > > > > > > > > release > > > > > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > > > > memory. That could be a problem in > > cases > > > > > where > > > > > > > we > > > > > > > > > have > > > > > > > > > > > > more > > > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > > usage but not enough heap activity > to > > > > > trigger > > > > > > > the > > > > > > > > > GC. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe you can share your reasons for > > > > preferring > > > > > > > > > setting a > > > > > > > > > > > > very > > > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > > > value, > > > > > > > > > > > > > > > > > > > if there are anything else I > overlooked. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > > > > > > > > > > > > > > > > > > > If there is any conflict between > multiple > > > > > > > > configuration > > > > > > > > > > > that > > > > > > > > > > > > > user > > > > > > > > > > > > > > > > > > > explicitly specified, I think we should > > > throw > > > > > an > > > > > > > > error. 
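(A hedged sketch of the kind of consistency check being discussed here; the class and method names are made up for illustration and are not part of the FLIP:)

    public class MemoryConfigCheckSketch {
        // Fail fast if the explicitly configured fine-grained pools cannot fit
        // into the configured total process memory.
        static void checkConsistency(
                long totalProcessMemoryBytes,
                long jvmHeapBytes,
                long taskOffHeapBytes,
                long networkMemoryBytes,
                long managedMemoryBytes,
                long metaspaceBytes,
                long jvmOverheadBytes) {
            long sum = jvmHeapBytes + taskOffHeapBytes + networkMemoryBytes
                    + managedMemoryBytes + metaspaceBytes + jvmOverheadBytes;
            if (sum > totalProcessMemoryBytes) {
                throw new IllegalArgumentException(
                        "Sum of configured memory pools (" + sum + " bytes) exceeds "
                                + "the configured total process memory ("
                                + totalProcessMemoryBytes + " bytes).");
            }
        }
    }
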
> > > > > > > > > > > > > > > > > > > I think doing checking on the client > side > > > is > > > > a > > > > > > good > > > > > > > > > idea, > > > > > > > > > > > so > > > > > > > > > > > > > that > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > > Yarn / > > > > > > > > > > > > > > > > > > > K8s we can discover the problem before > > > > > submitting > > > > > > > the > > > > > > > > > > Flink > > > > > > > > > > > > > > > cluster, > > > > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > > > is always a good thing. > > > > > > > > > > > > > > > > > > > But we can not only rely on the client > > side > > > > > > > checking, > > > > > > > > > > > because > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > > standalone cluster TaskManagers on > > > different > > > > > > > machines > > > > > > > > > may > > > > > > > > > > > > have > > > > > > > > > > > > > > > > > different > > > > > > > > > > > > > > > > > > > configurations and the client does see > > > that. > > > > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 PM Yang > Wang > > < > > > > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed proposal. > > After > > > > all > > > > > > the > > > > > > > > > memory > > > > > > > > > > > > > > > > configuration > > > > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > > > > introduced, it will be more powerful > to > > > > > control > > > > > > > the > > > > > > > > > > flink > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > usage. I > > > > > > > > > > > > > > > > > > > > just have few questions about it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We do not differentiate user direct > > > memory > > > > > and > > > > > > > > native > > > > > > > > > > > > memory. > > > > > > > > > > > > > > > They > > > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > > > > > > included in task off-heap memory. > > Right? > > > > So i > > > > > > > don’t > > > > > > > > > > think > > > > > > > > > > > > we > > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > > > set > > > > > > > > > > > > > > > > > > > > the -XX:MaxDirectMemorySize > properly. I > > > > > prefer > > > > > > > > > leaving > > > > > > > > > > > it a > > > > > > > > > > > > > > very > > > > > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > > > > > value. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the sum of and fine-grained > > > > memory(network > > > > > > > > memory, > > > > > > > > > > > > managed > > > > > > > > > > > > > > > > memory, > > > > > > > > > > > > > > > > > > > etc.) > > > > > > > > > > > > > > > > > > > > is larger than total process memory, > > how > > > do > > > > > we > > > > > > > deal > > > > > > > > > > with > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > > situation? > > > > > > > > > > > > > > > > > > > Do > > > > > > > > > > > > > > > > > > > > we need to check the memory > > configuration > > > > in > > > > > > > > client? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]> > > > > > > > 于2019年8月7日周三 > > > > > > > > > > > > 下午10:14写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion > > > > thread > > > > > on > > > > > > > > > > "FLIP-49: > > > > > > > > > > > > > > Unified > > > > > > > > > > > > > > > > > > Memory > > > > > > > > > > > > > > > > > > > > > Configuration for > TaskExecutors"[1], > > > > where > > > > > we > > > > > > > > > > describe > > > > > > > > > > > > how > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > improve > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory configurations. > > The > > > > > FLIP > > > > > > > > > document > > > > > > > > > > > is > > > > > > > > > > > > > > mostly > > > > > > > > > > > > > > > > > based > > > > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > > > > > early design "Memory Management and > > > > > > > Configuration > > > > > > > > > > > > > > Reloaded"[2] > > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > > > > Stephan, > > > > > > > > > > > > > > > > > > > > > with updates from follow-up > > discussions > > > > > both > > > > > > > > online > > > > > > > > > > and > > > > > > > > > > > > > > > offline. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses several > > > shortcomings > > > > of > > > > > > > > current > > > > > > > > > > > > (Flink > > > > > > > > > > > > > > 1.9) > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory configuration. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Different configuration for > > > > Streaming > > > > > > and > > > > > > > > > Batch. > > > > > > > > > > > > > > > > > > > > > - Complex and difficult > > > configuration > > > > of > > > > > > > > RocksDB > > > > > > > > > > in > > > > > > > > > > > > > > > Streaming. > > > > > > > > > > > > > > > > > > > > > - Complicated, uncertain and > hard > > to > > > > > > > > understand. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the problems > can > > > be > > > > > > > > summarized > > > > > > > > > > as > > > > > > > > > > > > > > follows. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Extend memory manager to also > > > > account > > > > > > for > > > > > > > > > memory > > > > > > > > > > > > usage > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > state > > > > > > > > > > > > > > > > > > > > > backends. > > > > > > > > > > > > > > > > > > > > > - Modify how TaskExecutor memory > > is > > > > > > > > partitioned > > > > > > > > > > > > > accounted > > > > > > > > > > > > > > > > > > individual > > > > > > > > > > > > > > > > > > > > > memory reservations and pools. > > > > > > > > > > > > > > > > > > > > > - Simplify memory configuration > > > > options > > > > > > and > > > > > > > > > > > > calculations > > > > > > > > > > > > > > > > logics. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the > FLIP > > > wiki > > > > > > > > document > > > > > > > > > > [1]. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > (Please note that the early design > > doc > > > > [2] > > > > > is > > > > > > > out > > > > > > > > > of > > > > > > > > > > > > sync, > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > it > > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > > > > appreciated to have the discussion > in > > > > this > > > > > > > > mailing > > > > > > > > > > list > > > > > > > > > > > > > > > thread.) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedbacks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [2] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 
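(To tie this back to the startup-parameter question at the top of this mail, a rough sketch, using assumed, made-up sizes based on the 1 GB / 200 MB example earlier in the thread, of how pre-computed pool sizes would end up as JVM command-line flags under alternative 2; none of these values are proposed defaults:)

    public class JvmArgsSketch {
        public static void main(String[] args) {
            // Assumed example values: 1 GB total process memory, 200 MB for
            // task off-heap memory + JVM overhead (alternative 2), and an
            // assumed split of the remaining 800 MB.
            int heapMb = 500;       // assumed JVM heap
            int metaspaceMb = 100;  // assumed metaspace
            int directMb = 200;     // task off-heap + JVM overhead
            // The remaining managed and network memory would be allocated
            // natively (e.g. via Unsafe) and needs no JVM flag here.

            String jvmArgs = String.join(" ",
                    "-Xms" + heapMb + "m",
                    "-Xmx" + heapMb + "m",
                    "-XX:MaxMetaspaceSize=" + metaspaceMb + "m",
                    "-XX:MaxDirectMemorySize=" + directMb + "m");
            System.out.println(jvmArgs);
        }
    }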
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests to set the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
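To make the example above concrete: a minimal sketch of how a startup utility could turn the two alternatives into the actual JVM flag. The class and constant names are made up for illustration and are not taken from the FLIP.

    /** Illustrative only: builds -XX:MaxDirectMemorySize for the 1GB example above. */
    public final class DirectMemoryFlagExample {

        private static final long MB = 1024L * 1024L;

        public static void main(String[] args) {
            long taskOffHeapPlusJvmOverhead = 200 * MB; // the configured 200MB budget

            // Alternative 2: cap JVM direct memory exactly at the configured budget.
            String alternative2 = "-XX:MaxDirectMemorySize=" + taskOffHeapPlusJvmOverhead;

            // Alternative 3: an effectively unlimited cap (1TB) that is never expected to be hit.
            String alternative3 = "-XX:MaxDirectMemorySize=" + (1024L * 1024L * MB);

            System.out.println(alternative2); // -XX:MaxDirectMemorySize=209715200
            System.out.println(alternative3); // -XX:MaxDirectMemorySize=1099511627776
        }
    }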
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under utilization Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi xintong, till

Native and Direct Memory

My point is setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory with a fixed value.

Memory Calculation

I agree with xintong. For Yarn and k8s, we need to check the memory configurations in the client to avoid submitting successfully and failing in the flink master.

Best,
Yang

Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019 at 22:07:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement on how memory consumers use it.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking memory overuse at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators).
The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation as we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM.

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some down sides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another down side is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client side checking, because for a standalone cluster, TaskManagers on different machines may have different configurations and the client does not see them.
What do you think?

Thank you~

Xintong Song
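A client-side check of the kind discussed above could, roughly, look like the sketch below. The class and parameter names are made up for illustration; the real validation would live in whatever calculation utility the FLIP introduces.

    /** Illustrative sketch of a client-side sanity check for explicitly configured sizes (all bytes). */
    public final class MemoryConfigCheck {

        public static void checkConsistency(
                long totalProcessMemory,
                long taskHeap,
                long taskOffHeap,
                long networkMemory,
                long managedMemory,
                long metaspace,
                long jvmOverhead) {

            long sum = taskHeap + taskOffHeap + networkMemory
                    + managedMemory + metaspace + jvmOverhead;

            if (sum > totalProcessMemory) {
                // Fail before submission instead of failing later in the Flink master.
                throw new IllegalArgumentException(
                        "Configured memory pools (" + sum + " bytes) exceed the configured"
                                + " total process memory (" + totalProcessMemory + " bytes).");
            }
        }
    }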
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation

If the sum of the fine-grained memory (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019 at 22:14:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted into individual memory reservations and pools.
- Simplify memory configuration options and calculation logics.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
My understanding was that before starting the Flink process we call a
utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.

On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

When computing the values in the JVM process after it started, how would you deal with values like Max Direct Memory, Metaspace size, native memory reservation (reducing the heap size), etc.? All the values that are parameters to the JVM process and that need to be supplied at process startup?
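For illustration, these are the kinds of values that have to be rendered into the JVM command line before the process starts. The sketch below uses made-up class and method names, not the proposed FLIP-49 API, and only shows how such a startup utility could print the flags from pre-calculated sizes:

    /** Illustrative sketch: turn pre-calculated memory sizes (bytes) into JVM startup arguments. */
    public final class JvmArgsExample {

        public static String renderJvmArgs(long heapSize, long maxDirectMemory, long maxMetaspace) {
            return "-Xms" + heapSize
                    + " -Xmx" + heapSize
                    + " -XX:MaxDirectMemorySize=" + maxDirectMemory
                    + " -XX:MaxMetaspaceSize=" + maxMetaspace;
        }

        public static void main(String[] args) {
            // Example values only; a real utility would take them from the configuration.
            System.out.println(renderJvmArgs(512L << 20, 256L << 20, 96L << 20));
            // prints: -Xms536870912 -Xmx536870912 -XX:MaxDirectMemorySize=268435456 -XX:MaxMetaspaceSize=100663296
        }
    }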
On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification. I have some more comments:

- I would actually split the logic to compute the process memory requirements and storing the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.

- Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily. The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.

- Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow up FLIP (which Yu is working on if I'm not mistaken).

Cheers,
Till

On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

Just add my 2 cents.

Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for all taskmanagers. A common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.

Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users do not have to export FLINK_CONF_DIR to update a few config options.

Best,
Yang
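A rough sketch of the kind of override logic being referred to; the environment-variable prefix and the key-mapping rule here are made up for illustration and are not what GlobalConfiguration actually implements:

    import java.util.Map;
    import java.util.Properties;

    /** Illustrative sketch: overlay environment variables on top of the loaded configuration. */
    public final class EnvOverrideExample {

        // Hypothetical prefix marking environment variables that carry Flink config options.
        private static final String PREFIX = "FLINK_CONFIG_";

        public static Properties applyEnvOverrides(Properties loadedConfig, Map<String, String> env) {
            Properties result = new Properties();
            result.putAll(loadedConfig);
            for (Map.Entry<String, String> entry : env.entrySet()) {
                if (entry.getKey().startsWith(PREFIX)) {
                    // e.g. FLINK_CONFIG_some_option_key -> some.option.key
                    String key = entry.getKey().substring(PREFIX.length())
                            .toLowerCase().replace('_', '.');
                    result.setProperty(key, entry.getValue());
                }
            }
            return result;
        }
    }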
On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

I just updated the FLIP document on wiki [1], with the following changes.

- Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
- Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

@Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.

@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643

@Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB.
That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).

On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:

Thanks for the inputs, Jingsong.

Let me try to summarize your points. Please correct me if I'm wrong.

- Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
- The JVM does not wait for GC when allocating a direct memory buffer. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory slower than the direct memory allocation.

Am I understanding this correctly?

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email].invalid> wrote:

Hi stephan:

About option 2:
If additional threads are not cleanly shut down before we can exit the task: in the current case of memory reuse, the exiting task has freed up the memory it uses. If this memory is used by other tasks while asynchronous threads of the exited task may still be writing to it, there will be concurrency safety problems, and it can even lead to errors in user computing results. So I think this is a serious and intolerable bug; no matter what the option is, it should be avoided.

About direct memory cleaned by GC:
I don't think it is a good idea. I've encountered so many situations where GC was too late, causing a DirectMemory OOM. Release and allocation of DirectMemory depend on the type of user job, which is often beyond our control.

Best,
Jingsong Lee
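To illustrate the failure mode described above, here is a small, self-contained sketch (not Flink code) of the allocation pattern in question. Run it with a small -XX:MaxDirectMemorySize (e.g. 64m) to observe whether the JDK's buffer-cleaning logic keeps up or an OutOfMemoryError for direct buffer memory is thrown:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    /** Illustrative sketch: rapid direct-buffer churn that relies on GC to reclaim old buffers. */
    public final class DirectBufferChurnExample {

        public static void main(String[] args) {
            List<ByteBuffer> inUse = new ArrayList<>();
            for (int i = 0; i < 10_000; i++) {
                inUse.add(ByteBuffer.allocateDirect(1 << 20)); // 1 MB direct buffer
                if (inUse.size() == 32) {
                    // Drop all references; the native memory is only freed once GC collects them.
                    inUse.clear();
                }
            }
            System.out.println("Finished without a direct memory OOM.");
        }
    }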
------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Send Time: Mon, Aug 19, 2019, 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

My main concern with option 2 (manually release memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.

The trickyness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already, that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarizing it in this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep it segfault safe for the memory manager, as long as we always de-allocate the memory segment when it is released from the memory consumers. Only if the memory consumer continues using the buffer of a memory segment after releasing it would this break, in which case we do want the job to fail so we detect the memory leak early.

For option 1.2, I don't think this is a good idea. Not only because the assumption (regular GC is enough to clean direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If the library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory.
In that case, if it > didn't > > > > touch > > > > > > the > > > > > > > > JVM > > > > > > > > > > > default max direct memory limit, we cannot get a direct > > > > memory > > > > > > OOM > > > > > > > > and > > > > > > > > > it > > > > > > > > > > > will become super hard to understand which part of the > > > > > > > configuration > > > > > > > > > need > > > > > > > > > > > to be updated. > > > > > > > > > > > > > > > > > > > > > > For option 1.1, it has the similar problem as 1.2, if > the > > > > > > exceeded > > > > > > > > > direct > > > > > > > > > > > memory does not reach the max direct memory limit > > specified > > > > by > > > > > > the > > > > > > > > > > > dedicated parameter. I think it is slightly better than > > > 1.2, > > > > > only > > > > > > > > > because > > > > > > > > > > > we can tune the parameter. > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen < > > > > [hidden email] > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > About the "-XX:MaxDirectMemorySize" discussion, maybe > > let > > > > me > > > > > > > > > summarize > > > > > > > > > > > it a > > > > > > > > > > > > bit differently: > > > > > > > > > > > > > > > > > > > > > > > > We have the following two options: > > > > > > > > > > > > > > > > > > > > > > > > (1) We let MemorySegments be de-allocated by the GC. > > That > > > > > makes > > > > > > > it > > > > > > > > > > > segfault > > > > > > > > > > > > safe. But then we need a way to trigger GC in case > > > > > > de-allocation > > > > > > > > and > > > > > > > > > > > > re-allocation of a bunch of segments happens quickly, > > > which > > > > > is > > > > > > > > often > > > > > > > > > > the > > > > > > > > > > > > case during batch scheduling or task restart. > > > > > > > > > > > > - The "-XX:MaxDirectMemorySize" (option 1.1) is one > > way > > > > to > > > > > do > > > > > > > > this > > > > > > > > > > > > - Another way could be to have a dedicated > > bookkeeping > > > in > > > > > the > > > > > > > > > > > > MemoryManager (option 1.2), so that this is a number > > > > > > independent > > > > > > > of > > > > > > > > > the > > > > > > > > > > > > "-XX:MaxDirectMemorySize" parameter. > > > > > > > > > > > > > > > > > > > > > > > > (2) We manually allocate and de-allocate the memory > for > > > the > > > > > > > > > > > MemorySegments > > > > > > > > > > > > (option 2). That way we need not worry about > triggering > > > GC > > > > by > > > > > > > some > > > > > > > > > > > > threshold or bookkeeping, but it is harder to prevent > > > > > > segfaults. > > > > > > > We > > > > > > > > > > need > > > > > > > > > > > to > > > > > > > > > > > > be very careful about when we release the memory > > segments > > > > > (only > > > > > > > in > > > > > > > > > the > > > > > > > > > > > > cleanup phase of the main thread). > > > > > > > > > > > > > > > > > > > > > > > > If we go with option 1.1, we probably need to set > > > > > > > > > > > > "-XX:MaxDirectMemorySize" to > "off_heap_managed_memory + > > > > > > > > > direct_memory" > > > > > > > > > > > and > > > > > > > > > > > > have "direct_memory" as a separate reserved memory > > pool. 
Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed the container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

- Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

- Option 2. It is actually quite simple to implement, we could try how segfault safe we are at the moment.

- Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan
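For a sense of what the manual life cycle in option 2 amounts to, here is a minimal, self-contained sketch using sun.misc.Unsafe directly. It is illustrative only, with made-up names, and is not the MemoryManager implementation:

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    /** Illustrative sketch of option 2: explicit allocation and release of segment memory. */
    public final class ManualSegmentExample {

        private static final Unsafe UNSAFE = getUnsafe();

        public static void main(String[] args) {
            long size = 32L * 1024 * 1024;              // a 32 MB "segment"
            long address = UNSAFE.allocateMemory(size); // native memory, not limited by -XX:MaxDirectMemorySize
            try {
                UNSAFE.setMemory(address, size, (byte) 0); // use the memory
            } finally {
                // Explicit de-allocation instead of waiting for GC; releasing too early while
                // another thread is still writing here is exactly the segfault risk discussed above.
                UNSAFE.freeMemory(address);
            }
        }

        private static Unsafe getUnsafe() {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                return (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException("Unsafe is not accessible", e);
            }
        }
    }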
But after giving it a second > thought, > > I > > > > > think > > > > > > > even > > > > > > > > > for > > > > > > > > > > > > > alternative 3 using direct memory for off-heap > > managed > > > > > memory > > > > > > > > could > > > > > > > > > > > cause > > > > > > > > > > > > > problems. > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Yang, > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding your concern, I think what proposed in > this > > > > FLIP > > > > > it > > > > > > > to > > > > > > > > > have > > > > > > > > > > > > both > > > > > > > > > > > > > off-heap managed memory and network memory > allocated > > > > > through > > > > > > > > > > > > > Unsafe.allocate(), which means they are practically > > > > native > > > > > > > memory > > > > > > > > > and > > > > > > > > > > > not > > > > > > > > > > > > > limited by JVM max direct memory. The only parts of > > > > memory > > > > > > > > limited > > > > > > > > > by > > > > > > > > > > > JVM > > > > > > > > > > > > > max direct memory are task off-heap memory and JVM > > > > > overhead, > > > > > > > > which > > > > > > > > > > are > > > > > > > > > > > > > exactly alternative 2 suggests to set the JVM max > > > direct > > > > > > memory > > > > > > > > to. > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann < > > > > > > > > > [hidden email]> > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I > understand > > > the > > > > > two > > > > > > > > > > > alternatives > > > > > > > > > > > > > > now. > > > > > > > > > > > > > > > > > > > > > > > > > > > > I would be in favour of option 2 because it makes > > > > things > > > > > > > > > explicit. > > > > > > > > > > If > > > > > > > > > > > > we > > > > > > > > > > > > > > don't limit the direct memory, I fear that we > might > > > end > > > > > up > > > > > > > in a > > > > > > > > > > > similar > > > > > > > > > > > > > > situation as we are currently in: The user might > > see > > > > that > > > > > > her > > > > > > > > > > process > > > > > > > > > > > > > gets > > > > > > > > > > > > > > killed by the OS and does not know why this is > the > > > > case. > > > > > > > > > > > Consequently, > > > > > > > > > > > > > she > > > > > > > > > > > > > > tries to decrease the process memory size > (similar > > to > > > > > > > > increasing > > > > > > > > > > the > > > > > > > > > > > > > cutoff > > > > > > > > > > > > > > ratio) in order to accommodate for the extra > direct > > > > > memory. > > > > > > > > Even > > > > > > > > > > > worse, > > > > > > > > > > > > > she > > > > > > > > > > > > > > tries to decrease memory budgets which are not > > fully > > > > used > > > > > > and > > > > > > > > > hence > > > > > > > > > > > > won't > > > > > > > > > > > > > > change the overall memory consumption. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM Xintong Song < > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Let me explain this with a concrete example > Till. 
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till. Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB. For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead can potentially exceed 200MB, then:

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this reduces the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
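For readers who want to follow the arithmetic, here is a small self-contained sketch of the example above. The 150MB/50MB split of the 200MB direct budget is an assumption made only for illustration, and the example follows the mail in treating 1GB as 1000MB.

public class DirectMemoryBudgetExample {

    public static void main(String[] args) {
        long totalProcessMb = 1000;   // Total Process Memory: 1GB (rounded as in the example)

        // Alternative 2: cap direct memory exactly at Task Off-Heap + JVM Overhead.
        long taskOffHeapMb = 150;     // assumed split, sums to the 200MB from the example
        long jvmOverheadMb = 50;
        long maxDirectMb = taskOffHeapMb + jvmOverheadMb;      // -XX:MaxDirectMemorySize=200m
        long otherPoolsMb = totalProcessMb - maxDirectMb;      // heap + metaspace + managed + network = 800MB
        System.out.printf("alt2: -XX:MaxDirectMemorySize=%dm, other pools=%dMB%n",
                maxDirectMb, otherPoolsMb);

        // If direct usage may exceed 200MB, the user has to raise the direct budget,
        // which shrinks every other pool because the process total is fixed.
        long raisedDirectMb = 250;
        System.out.printf("alt2 after raise: other pools=%dMB%n",
                totalProcessMb - raisedDirectMb);              // 750MB

        // Alternative 3: a very large cap never limits direct allocations, so slight
        // overuse is absorbed by unused native memory instead of causing a direct OOM,
        // at the risk of exceeding the container limit.
        long altThreeMaxDirectMb = 1024L * 1024L;              // ~1TB
        System.out.printf("alt3: -XX:MaxDirectMemorySize=%dm%n", altThreeMaxDirectMb);
    }
}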
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.

Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 10:07 PM Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking over-use of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is the risk that overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern is that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till
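A rough sketch of the two interaction styles Till distinguishes follows. The interfaces and method names are hypothetical placeholders chosen for illustration, not Flink's actual MemoryManager API.

import java.util.List;

// Hypothetical interfaces, for illustration only.
interface MemorySegment {
    int size();
}

interface MemoryManager {
    // Existing style: batch operators ask for concrete segments they will write into.
    List<MemorySegment> allocatePages(Object owner, int numberOfPages);

    void releasePages(Object owner, List<MemorySegment> pages);

    // Reservation style sketched in the discussion: a consumer (e.g. RocksDB via the
    // state backend) only reserves a budget and performs its own native allocation.
    void reserveMemory(Object owner, long sizeInBytes);

    void releaseMemory(Object owner, long sizeInBytes);
}

class RocksDbLikeConsumer {
    private final MemoryManager memoryManager;
    private final long budgetBytes;

    RocksDbLikeConsumer(MemoryManager memoryManager, long budgetBytes) {
        this.memoryManager = memoryManager;
        this.budgetBytes = budgetBytes;
    }

    void open() {
        // A streaming job's only interaction with the MemoryManager would be this
        // reservation; the actual native allocation happens inside the consumer.
        memoryManager.reserveMemory(this, budgetBytes);
    }

    void close() {
        memoryManager.releaseMemory(this, budgetBytes);
    }
}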
On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides to doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring a very large value, in case there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client-side checking, because for standalone clusters the TaskManagers on different machines may have different configurations and the client does not see them. What do you think?

Thank you~

Xintong Song
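The GC concern can be seen with a small stand-alone experiment; this is illustrative only and not Flink code. With a tight -XX:MaxDirectMemorySize, the JDK's direct-memory bookkeeping forces a garbage collection when the limit is about to be exceeded, whereas with a very large limit nothing forces that collection and the freeing of dropped direct buffers depends entirely on heap activity.

import java.nio.ByteBuffer;

public class DirectMemoryGcDemo {

    public static void main(String[] args) {
        // Run once with -XX:MaxDirectMemorySize=256m and once with a much larger
        // value (e.g. 102400m).
        //
        // With the small limit, ByteBuffer.allocateDirect() forces a collection when
        // the reserved direct memory would exceed the limit, so buffers dropped in
        // earlier iterations are collected and their native memory is freed before an
        // OutOfMemoryError would be thrown.
        //
        // With the huge limit, the limit is never reached, no collection is forced,
        // and whether the native memory behind the dropped buffers is released depends
        // entirely on heap activity triggering a GC.
        for (int i = 0; i < 1_000; i++) {
            ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024); // 16 MB
            buffer.put(0, (byte) 1);
            // The reference is dropped at the end of each iteration; the native memory
            // behind it is only released once the buffer object is garbage collected.
        }
    }
}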
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful for controlling the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory
We do not differentiate user direct memory and native memory. They are all included in task off-heap memory, right? So I don't think we could set -XX:MaxDirectMemorySize properly. I prefer leaving it at a very large value.

- Memory Calculation
If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?
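A minimal sketch of the kind of client-side check being discussed; the pool names and the validation rule are illustrative assumptions, not the FLIP's actual configuration keys or logic.

import java.util.Map;

public class MemoryConfigValidator {

    /**
     * Fails fast (e.g. on the client before submitting to Yarn/K8s) if the explicitly
     * configured fine-grained pools cannot fit into the total process memory.
     * All values are in megabytes; the pool names are illustrative only.
     */
    public static void validate(Map<String, Long> poolsMb, long totalProcessMb) {
        long sum = poolsMb.values().stream().mapToLong(Long::longValue).sum();
        if (sum > totalProcessMb) {
            throw new IllegalArgumentException(String.format(
                    "Configured memory pools (%d MB total: %s) exceed the total process memory (%d MB). "
                            + "Please lower one of the pools or increase the total process memory.",
                    sum, poolsMb, totalProcessMb));
        }
    }

    public static void main(String[] args) {
        validate(Map.of(
                "network", 128L,
                "managed", 512L,
                "task-heap", 512L), 1024L); // throws: 1152 MB > 1024 MB
    }
}

As noted in the mails, such a check only covers the client side; standalone TaskManagers with per-machine configurations would still need the same validation when the process starts.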
On Wed, Aug 7, 2019 at 10:14 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration:

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows:

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted into individual memory reservations and pools.
- Simplify memory configuration options and calculation logics.

Please find more details in the FLIP wiki document [1]. (Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
I see. Under the assumption of strict determinism that should work.
The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.

On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:

> My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.

On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

> When computing the values in the JVM process after it started, how would you deal with values like Max Direct Memory, Metaspace size, native memory reservation (reduced heap size), etc.? All the values that are parameters to the JVM process and that need to be supplied at process startup?

On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

> Thanks for the clarification. I have some more comments:
>
> - I would actually split the logic to compute the process memory requirements and storing the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.
>
> - Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.
>
> The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.
>
> - Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow up FLIP (which Yu is working on if I'm not mistaken).
>
> Cheers,
> Till
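For illustration, a minimal sketch of the "derive once, recompute deterministically" idea discussed in the messages above, assuming the derivation is a pure function of the configured sizes. The class and method names (TaskExecutorProcessUtility, jvmArgs) follow Till's naming suggestion but are hypothetical placeholders, not the FLIP's API; a startup utility could print these flags when launching the JVM, and the same function could be re-run inside the process to recover the values.

    import java.util.Locale;

    // Hypothetical sketch, not Flink code: derive the JVM startup parameters once,
    // as a pure function of the configured memory sizes, so the startup utility and
    // the TaskExecutor process compute identical values.
    public final class TaskExecutorProcessUtility {

        public static String jvmArgs(long heapBytes, long directBytes, long metaspaceBytes) {
            // These limits must be passed on the JVM command line; everything else
            // (e.g. managed and network memory) can be re-derived after startup.
            return String.format(Locale.ROOT,
                    "-Xms%d -Xmx%d -XX:MaxDirectMemorySize=%d -XX:MaxMetaspaceSize=%d",
                    heapBytes, heapBytes, directBytes, metaspaceBytes);
        }

        public static void main(String[] args) {
            long mb = 1024L * 1024L;
            System.out.println(jvmArgs(512 * mb, 200 * mb, 96 * mb));
        }
    }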
On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

> Just add my 2 cents.
>
> Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for all taskmanagers. A common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.
>
> Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users do not have to export FLINK_CONF_DIR to update a few config options.
>
> Best,
> Yang
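As a rough sketch of the environment-variable override Yang describes (the "flink_" prefix and the underscore-to-dot translation are assumptions for this example, not a decided format), a GlobalConfiguration.loadConfig()-style hook could fold prefixed variables into the loaded configuration:

    import java.util.HashMap;
    import java.util.Locale;
    import java.util.Map;

    // Illustrative only: fold env variables with an agreed prefix into config keys,
    // e.g. "flink_taskmanager_memory_size=2g" -> "taskmanager.memory.size: 2g".
    public final class EnvConfigOverrides {

        static Map<String, String> fromEnv(Map<String, String> env) {
            Map<String, String> overrides = new HashMap<>();
            for (Map.Entry<String, String> e : env.entrySet()) {
                String key = e.getKey().toLowerCase(Locale.ROOT);
                if (key.startsWith("flink_")) {
                    // strip the prefix and turn '_' separators into dotted config keys
                    overrides.put(key.substring("flink_".length()).replace('_', '.'), e.getValue());
                }
            }
            return overrides;
        }

        public static void main(String[] args) {
            System.out.println(fromEnv(System.getenv()));
        }
    }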
Network memory size is deterministic, cannot be > > > reserved > > > > > as > > > > > > managed memory does, and cannot be overused. I think it also > works > > if > > > > we > > > > > > simply keep using direct memory for network and include it in jvm > > max > > > > > > direct memory size. > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann < > > [hidden email]> > > > > > > wrote: > > > > > > > > > > > > > Hi Xintong, > > > > > > > > > > > > > > thanks for addressing the comments and adding a more detailed > > > > > > > implementation plan. I have a couple of comments concerning the > > > > > > > implementation plan: > > > > > > > > > > > > > > - The name `TaskExecutorSpecifics` is not really descriptive. > > > > Choosing > > > > > a > > > > > > > different name could help here. > > > > > > > - I'm not sure whether I would pass the memory configuration to > > the > > > > > > > TaskExecutor via environment variables. I think it would be > > better > > > to > > > > > > write > > > > > > > it into the configuration one uses to start the TM process. > > > > > > > - If possible, I would exclude the memory reservation from this > > > FLIP > > > > > and > > > > > > > add this as part of a dedicated FLIP. > > > > > > > - If possible, then I would exclude changes to the network > stack > > > from > > > > > > this > > > > > > > FLIP. Maybe we can simply say that the direct memory needed by > > the > > > > > > network > > > > > > > stack is the framework direct memory requirement. Changing how > > the > > > > > memory > > > > > > > is allocated can happen in a second step. This would keep the > > scope > > > > of > > > > > > this > > > > > > > FLIP smaller. > > > > > > > > > > > > > > Cheers, > > > > > > > Till > > > > > > > > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < > > > [hidden email]> > > > > > > > wrote: > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > I just updated the FLIP document on wiki [1], with the > > following > > > > > > changes. > > > > > > > > > > > > > > > > - Removed open question regarding MemorySegment > allocation. > > As > > > > > > > > discussed, we exclude this topic from the scope of this > > FLIP. > > > > > > > > - Updated content about JVM direct memory parameter > > according > > > to > > > > > > > recent > > > > > > > > discussions, and moved the other options to "Rejected > > > > > Alternatives" > > > > > > > for > > > > > > > > the > > > > > > > > moment. > > > > > > > > - Added implementation steps. > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen < > [hidden email] > > > > > > > > wrote: > > > > > > > > > > > > > > > > > @Xintong: Concerning "wait for memory users before task > > dispose > > > > and > > > > > > > > memory > > > > > > > > > release": I agree, that's how it should be. Let's try it > out. 
On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

> @Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.
>
> @Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643
>
> @Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB. That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).

On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:

> Thanks for the inputs, Jingsong.
>
> Let me try to summarize your points. Please correct me if I'm wrong.
>
> - Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
> - JVM does not wait for GC when allocating a direct memory buffer. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory slower than the direct memory allocation.
>
> Am I understanding this correctly?
>
> Thank you~
>
> Xintong Song
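The JDK behaviour referenced above (Bits.java) can be observed with a small, self-contained experiment; this only illustrates the mechanism under discussion and is not part of the FLIP. Run it with a small limit such as -XX:MaxDirectMemorySize=64m:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // With -XX:MaxDirectMemorySize=64m the loop only survives if old buffers become
    // unreachable, because the JDK's Bits.reserveMemory frees phantom-reachable buffers
    // (falling back to System.gc()) before giving up with
    // "OutOfMemoryError: Direct buffer memory".
    public final class DirectMemoryLimitExperiment {

        public static void main(String[] args) {
            boolean leak = args.length > 0 && "leak".equals(args[0]);
            List<ByteBuffer> retained = new ArrayList<>();
            for (int i = 0; i < 1024; i++) {
                ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024); // 1 MB each
                if (leak) {
                    retained.add(buf); // keep buffers reachable -> the limit is hit and OOM is thrown
                }
            }
            System.out.println("allocated 1 GB of direct buffers, retained: " + retained.size());
        }
    }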
On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email].invalid> wrote:

> Hi stephan:
>
> About option 2:
> If additional threads are not cleanly shut down before we can exit the task: in the current case of memory reuse, the task has already freed up the memory it uses. If this memory is used by other tasks while asynchronous threads of the exited task may still be writing, there will be concurrency safety problems, and even errors in user computing results. So I think this is a serious and intolerable bug; no matter what the option is, it should be avoided.
>
> About direct memory cleaned by GC:
> I don't think it is a good idea. I've encountered so many situations where the GC came too late and caused a DirectMemory OOM. Release and allocation of DirectMemory depend on the type of user job, which is often beyond our control.
>
> Best,
> Jingsong Lee

------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Send Time: 2019年8月19日(星期一) 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

> My main concern with option 2 (manually releasing memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.
>
> The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already; that's why option 1.1 seemed simpler to me.
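One possible shape of the "make sure helper threads are done before the task releases its segments" discipline discussed above, sketched with ad-hoc names and purely for illustration (not Flink code):

    import java.util.concurrent.Phaser;

    // Ad-hoc sketch: a task only returns its memory to the memory manager after every
    // helper thread (spilling, sorting, ...) that may still touch the segments has
    // deregistered itself.
    public final class MemoryUserTracker {

        // party 0 is the task's main thread; each helper thread registers as one more party
        private final Phaser activeUsers = new Phaser(1);

        public void registerUser() {
            activeUsers.register();
        }

        public void deregisterUser() {
            activeUsers.arriveAndDeregister();
        }

        /** Called from the task's main thread during cleanup, before releasing segments. */
        public void awaitAllUsersDone() {
            activeUsers.arriveAndAwaitAdvance(); // blocks until all registered helpers are done
        }
    }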
On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

> Thanks for the comments, Stephan. Summarizing it in this way really makes things easier to understand.
>
> I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep it segfault safe for the memory manager, as long as we always de-allocate the memory segment when it is released from the memory consumers. Only if the memory consumer continues using the buffer of the memory segment after releasing it do we run into trouble, in which case we do want the job to fail so we detect the memory leak early.
>
> For option 1.2, I don't think this is a good idea. Not only because the assumption (regular GC is enough to clean direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If the library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if it didn't touch the JVM default max direct memory limit, we cannot get a direct memory OOM and it will become super hard to understand which part of the configuration needs to be updated.
>
> For option 1.1, it has a similar problem as 1.2, if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

> About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:
>
> We have the following two options:
>
> (1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
>   - The "-XX:MaxDirectMemorySize" limit (option 1.1) is one way to do this.
>   - Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.
> (2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).
>
> If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed the container memory.
>
> If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.
>
> My first guess is that the options will be easiest to do in the following order:
>
> - Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).
>
> - Option 2. It is actually quite simple to implement; we could try how segfault safe we are at the moment.
>
> - Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.
>
> Best,
> Stephan
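A back-of-the-envelope sketch of the option 1.1 sizing described above; the numbers and variable names are made up for illustration, and 64 MB is only the default floated in the message:

    // Illustrative arithmetic only: option 1.1 dedicates a separate direct_memory pool
    // and sizes -XX:MaxDirectMemorySize to cover it together with off-heap managed memory,
    // instead of letting direct buffers compete with the JVM overhead budget.
    public final class Option11Sizing {

        public static void main(String[] args) {
            long mb = 1024L * 1024L;
            long offHeapManagedMemory = 300 * mb; // managed memory kept off-heap
            long directMemory = 64 * mb;          // dedicated direct_memory pool (proposed default)

            long maxDirectMemorySize = offHeapManagedMemory + directMemory;
            System.out.println("-XX:MaxDirectMemorySize=" + (maxDirectMemorySize / mb) + "m");
        }
    }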
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

> Thanks for sharing your opinion, Till.
>
> I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.
>
> Hi Yang,
>
> Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.
>
> Thank you~
>
> Xintong Song
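For context on why Unsafe-based allocation escapes the direct memory limit, a sketch of the mechanism only: in the JDK the actual method is sun.misc.Unsafe.allocateMemory (the thread's "Unsafe.allocate()" is shorthand), error handling is omitted, and this is not how Flink's MemorySegments are implemented here.

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Memory obtained via Unsafe is plain native memory: it is not counted against
    // -XX:MaxDirectMemorySize, unlike ByteBuffer.allocateDirect, and it must be
    // released explicitly because there is no GC safety net.
    public final class UnsafeAllocationSketch {

        public static void main(String[] args) throws Exception {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);

            long address = unsafe.allocateMemory(64L * 1024 * 1024); // 64 MB, invisible to the direct-memory limit
            try {
                unsafe.putLong(address, 42L);
                System.out.println(unsafe.getLong(address));
            } finally {
                unsafe.freeMemory(address);
            }
        }
    }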
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

> Thanks for the clarification Xintong. I understand the two alternatives now.
>
> I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: The user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.
>
> Cheers,
> Till
On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

> Let me explain this with a concrete example, Till.
>
> Let's say we have the following scenario.
>
> Total Process Memory: 1GB
> JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
> Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB
>
> For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
> For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then
>
> - Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
> - For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.
>
> Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
>
> Thank you~
>
> Xintong Song
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

> I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.
>
> - Alternative 2: set XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
> - Alternative 3: set XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.
>
> How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.
>
> Cheers,
> Till
On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

> Hi xintong, till
>
> > Native and Direct Memory
>
> My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.
>
> > Memory Calculation
>
> I agree with xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and failing in the flink master.
>
> Best,
> Yang
Xintong Song <[hidden email]> 于2019年8月13日 周二22:07写道:

> Thanks for replying, Till.
>
> About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.
>
> About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of the risk of over-using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is a risk that the overall container memory usage exceeds the budget.
>
> Thank you~
>
> Xintong Song
On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

> Thanks for proposing this FLIP Xintong.
>
> All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
>
> Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM.
>
> @Xintong, how would alternative 2 lead to lower memory utilization than alternative 3 where we set the direct memory to a higher value?
>
> Cheers,
> Till
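To illustrate what the "memory reservation call" mentioned above might look like from a consumer's point of view, a purely hypothetical interface shape; the FLIP and the follow-up implementation define the real API:

    // Hypothetical shape only: consumers such as state backends would reserve and release
    // amounts of managed memory instead of acquiring MemorySegments, so the same budget
    // can be shared between segment-based batch operators and reservation-based streaming use.
    public interface ManagedMemoryReservation {

        /** Reserve the given number of bytes for the owner, failing if the budget is exhausted. */
        void reserveMemory(Object owner, long bytes);

        /** Return a previously reserved number of bytes to the shared budget. */
        void releaseMemory(Object owner, long bytes);
    }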
On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

> Thanks for the feedback, Yang.
>
> Regarding your comments:
>
> *Native and Direct Memory*
> I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.
>
> - One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
> - Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
>
> Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.
> *Memory Calculation*
> If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error.
> I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing.
> But we cannot rely on the client side checking alone, because for a standalone cluster the TaskManagers on different machines may have different configurations and the client does not see them.
> What do you think?
>
> Thank you~
>
> Xintong Song
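A minimal sketch of the kind of consistency check discussed above; the names and the specific rule are illustrative, and the real validation would cover all of the FLIP's options and could run both on the client and in the TaskManager startup path:

    // Illustrative only: fail fast if the explicitly configured fine-grained pools
    // cannot fit into an explicitly configured total process memory.
    public final class MemoryConfigValidation {

        public static void validate(long totalProcessMemory, long networkMemory,
                                    long managedMemory, long frameworkAndTaskMemory) {
            long sum = networkMemory + managedMemory + frameworkAndTaskMemory;
            if (sum > totalProcessMemory) {
                throw new IllegalArgumentException(
                        "Configured memory pools (" + sum + " bytes) exceed the configured "
                                + "total process memory (" + totalProcessMemory + " bytes).");
            }
        }

        public static void main(String[] args) {
            long mb = 1024L * 1024L;
            validate(1024 * mb, 128 * mb, 300 * mb, 400 * mb); // passes
            validate(1024 * mb, 512 * mb, 512 * mb, 256 * mb); // throws
        }
    }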
> They are all included in task off-heap memory, right? So I think we could not set -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.
>
> - Memory Calculation
> If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019 at 10:14 PM:

> Hi everyone,
>
> We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.
> This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration:
>
>   - Different configuration for Streaming and Batch.
>   - Complex and difficult configuration of RocksDB in Streaming.
>   - Complicated, uncertain and hard to understand.
>
> Key changes to solve the problems can be summarized as follows:
>
>   - Extend the memory manager to also account for memory usage by state backends.
>   - Modify how TaskExecutor memory is partitioned and accounted across individual memory reservations and pools.
>   - Simplify memory configuration options and calculation logic.
>
> Please find more details in the FLIP wiki document [1].
>
> (Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)
>
> Looking forward to your feedback.
> Thank you~
> Xintong Song
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> [2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

> Thanks for sharing your opinion Till.
>
> I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.
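To make the distinction in this exchange concrete, here is a small stand-alone Java sketch (not Flink code) of the point discussed here and in the following reply: memory obtained via sun.misc.Unsafe is plain native memory that the JVM does not count against -XX:MaxDirectMemorySize, while direct ByteBuffers are counted against it. The class name and the sizes are made up for the example.

import java.lang.reflect.Field;
import java.nio.ByteBuffer;

// Run with e.g.: java -XX:MaxDirectMemorySize=64m UnsafeVsDirectDemo
public class UnsafeVsDirectDemo {

    public static void main(String[] args) throws Exception {
        // 1) Direct buffers: counted against -XX:MaxDirectMemorySize.
        //    Allocating more than the 64m limit in this way fails with
        //    "OutOfMemoryError: Direct buffer memory".
        ByteBuffer direct = ByteBuffer.allocateDirect(32 * 1024 * 1024);
        System.out.println("Allocated 32m of direct buffer memory: " + direct.capacity());

        // 2) Unsafe native memory: NOT counted against -XX:MaxDirectMemorySize,
        //    only limited by what the OS / container grants the process.
        Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);

        long address = unsafe.allocateMemory(128 * 1024 * 1024); // exceeds 64m, still succeeds
        System.out.println("Allocated 128m of native memory at address " + address);

        // Native memory must be freed explicitly; there is no GC safety net here.
        unsafe.freeMemory(address);
    }
}

This is also why the thread can talk about managed and network memory being "practically native memory" once they are allocated through Unsafe.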
> Hi Yang,
>
> Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.
>
> Thank you~
> Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

> Thanks for the clarification Xintong. I understand the two alternatives now.
>
> I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.
>
> Cheers,
> Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

> Let me explain this with a concrete example Till.
>
> Let's say we have the following scenario.
> Total Process Memory: 1GB
> JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
> Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB
>
> For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
> For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then
>
>   - Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
>   - For alternative 3, there is no chance of a direct OOM.
>   There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.
>
> Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
>
> Thank you~
> Xintong Song

On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

> I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.
>
>   - Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
>   - Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.
>
> How would alternative 2 now result in an under-utilization of memory compared to alternative 3?
> If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.
>
> Cheers,
> Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

> Hi xintong, till
>
> - Native and Direct Memory
> My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.
>
> - Memory Calculation
> I agree with xintong. For Yarn and K8s, we need to check the memory configurations on the client to avoid submitting successfully and then failing in the flink master.
>
> Best,
> Yang

Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019 at 22:07:

> Thanks for replying, Till.
>
> About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP.
> This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.
>
> About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of the risk of over-using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.
>
> Thank you~
> Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

> Thanks for proposing this FLIP Xintong.
>
> All in all I think it already looks quite good.
> Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow-up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
>
> Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend.
> If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?
>
> @Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?
>
> Cheers,
> Till
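For readers following the alternative 2 / alternative 3 debate above, here is a minimal sketch of how the two alternatives would translate into the JVM parameter, using the numbers from Xintong's example (1 GB total process memory, 200 MB of task off-heap memory plus JVM overhead). The class and variable names are invented for illustration and are not part of the FLIP.

public class MaxDirectMemoryAlternatives {

    private static final long MB = 1L << 20;

    public static void main(String[] args) {
        long totalProcessMemory = 1024 * MB;              // 1 GB container / process budget
        long taskOffHeapPlusJvmOverhead = 200 * MB;       // pools served from JVM direct memory
        long otherPools = totalProcessMemory - taskOffHeapPlusJvmOverhead; // heap, metaspace, managed, network

        // Alternative 2: cap direct memory exactly at the budget of the pools it serves.
        // Overuse fails fast inside the JVM ("OutOfMemoryError: Direct buffer memory"),
        // but a too-small budget forces the user to grow it and thereby shrink the other
        // pools (800 MB -> 750 MB in the example), even if those pools are not the problem.
        String alt2Flag = "-XX:MaxDirectMemorySize=" + (taskOffHeapPlusJvmOverhead / MB) + "m";

        // Alternative 3: set a limit so large it is effectively never reached (~1 TB here),
        // so there is no direct OOM; overuse only shows up when the container exceeds
        // totalProcessMemory and is killed by the resource manager / OS.
        String alt3Flag = "-XX:MaxDirectMemorySize=" + (1024L * 1024) + "m";

        System.out.println("alternative 2: " + alt2Flag + " (other pools keep " + (otherPools / MB) + "m)");
        System.out.println("alternative 3: " + alt3Flag);
    }
}

The trade-off discussed in the thread is exactly the one visible in the comments: alternative 2 turns overuse into an explicit JVM error, alternative 3 trades that diagnosability for not having to over-provision the direct memory budget.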
I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step to rather validate existing values and calculate missing ones, these two proposals shouldn't even conflict (given determinism).

Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently. One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of env variables vs. dynamic configuration values specified via -D?

Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.

Cheers,
Till

On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:

> I see. Under the assumption of strict determinism that should work.
>
> The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.

On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:

> My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.

On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

> When computing the values in the JVM process after it started, how would you deal with values like max direct memory, metaspace size, native memory reservation (reduced heap size), etc.? All the values that are parameters to the JVM process and that need to be supplied at process startup?

On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

> Thanks for the clarification. I have some more comments:
>
> - I would actually split the logic to compute the process memory requirements and storing the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.
>
> - Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.
>
> The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings.
> For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.
>
> - Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on, if I'm not mistaken).
>
> Cheers,
> Till

On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

> Just add my 2 cents.
>
> Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for every taskmanager; a common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.
>
> Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users do not have to export FLINK_CONF_DIR to update a few config options.
>
> Best,
> Yang
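The questions Till raises above about ENV variable naming and precedence could look roughly like the following sketch. The FLINK_ prefix handling and the precedence order shown here (dynamic -D properties over environment variables over flink-conf.yaml values) are assumptions made up for illustration; neither is what GlobalConfiguration.loadConfiguration() does today, nor something the thread has decided on.

import java.util.HashMap;
import java.util.Map;

public class EnvOverridesSketch {

    // e.g. FLINK_TASKMANAGER_MEMORY_SIZE -> taskmanager.memory.size
    // (note: keys that legitimately contain underscores are exactly why the
    // format question is not trivial)
    static String envKeyToConfigKey(String envKey) {
        return envKey.substring("FLINK_".length()).toLowerCase().replace('_', '.');
    }

    // One possible precedence order (highest wins): -D dynamic properties,
    // then environment variables, then values loaded from flink-conf.yaml.
    static Map<String, String> mergedConfig(Map<String, String> fromYaml,
                                            Map<String, String> env,
                                            Map<String, String> dynamicProperties) {
        Map<String, String> config = new HashMap<>(fromYaml);
        env.forEach((k, v) -> {
            if (k.startsWith("FLINK_")) {
                config.put(envKeyToConfigKey(k), v);
            }
        });
        config.putAll(dynamicProperties); // -Dkey=value overrides everything else
        return config;
    }

    public static void main(String[] args) {
        Map<String, String> yaml = new HashMap<>();
        yaml.put("taskmanager.memory.size", "1g");

        Map<String, String> env = new HashMap<>();
        env.put("FLINK_TASKMANAGER_MEMORY_SIZE", "2g");

        Map<String, String> dynamic = new HashMap<>(); // e.g. parsed from "-Dtaskmanager.memory.size=3g"

        System.out.println(mergedConfig(yaml, env, dynamic)); // prints {taskmanager.memory.size=2g}
    }
}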
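Till's point further above about memory reservation (as opposed to handing out memory segments) can be illustrated with a small sketch: a consumer such as a RocksDB state backend only needs part of the managed memory budget set aside for it, because it allocates its native memory itself. This is not Flink's MemoryManager API; the class and method names are invented for the example.

import java.util.HashMap;
import java.util.Map;

public class ReservationSketch {

    private long available;
    private final Map<Object, Long> reservations = new HashMap<>();

    ReservationSketch(long totalManagedMemory) {
        this.available = totalManagedMemory;
    }

    // Reserve part of the budget without handing out any buffers.
    synchronized void reserveMemory(Object owner, long bytes) {
        if (bytes > available) {
            throw new IllegalStateException(
                    "Not enough managed memory left: requested " + bytes + ", available " + available);
        }
        available -= bytes;
        reservations.merge(owner, bytes, Long::sum);
    }

    // Return a reservation to the budget once the owner has released its native memory.
    synchronized void releaseMemory(Object owner) {
        Long reserved = reservations.remove(owner);
        if (reserved != null) {
            available += reserved;
        }
    }

    public static void main(String[] args) {
        ReservationSketch manager = new ReservationSketch(512L << 20);
        Object rocksDbStateBackend = new Object(); // stands in for a reserving consumer
        manager.reserveMemory(rocksDbStateBackend, 256L << 20);
        // ... the consumer allocates its native memory within the reserved budget ...
        manager.releaseMemory(rocksDbStateBackend);
    }
}

Batch operators, by contrast, would keep requesting concrete memory segments; the reservation call is the piece the thread discusses deferring to a follow-up FLIP.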
Changing > how > > > the > > > > > > memory > > > > > > > > is allocated can happen in a second step. This would keep the > > > scope > > > > > of > > > > > > > this > > > > > > > > FLIP smaller. > > > > > > > > > > > > > > > > Cheers, > > > > > > > > Till > > > > > > > > > > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < > > > > [hidden email]> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > I just updated the FLIP document on wiki [1], with the > > > following > > > > > > > changes. > > > > > > > > > > > > > > > > > > - Removed open question regarding MemorySegment > > allocation. > > > As > > > > > > > > > discussed, we exclude this topic from the scope of this > > > FLIP. > > > > > > > > > - Updated content about JVM direct memory parameter > > > according > > > > to > > > > > > > > recent > > > > > > > > > discussions, and moved the other options to "Rejected > > > > > > Alternatives" > > > > > > > > for > > > > > > > > > the > > > > > > > > > moment. > > > > > > > > > - Added implementation steps. > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen < > > [hidden email] > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > @Xintong: Concerning "wait for memory users before task > > > dispose > > > > > and > > > > > > > > > memory > > > > > > > > > > release": I agree, that's how it should be. Let's try it > > out. > > > > > > > > > > > > > > > > > > > > @Xintong @Jingsong: Concerning " JVM does not wait for GC > > > when > > > > > > > > allocating > > > > > > > > > > direct memory buffer": There seems to be pretty elaborate > > > logic > > > > > to > > > > > > > free > > > > > > > > > > buffers when allocating new ones. See > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > > > > > > > > > > > > > > > > > > > > @Till: Maybe. If we assume that the JVM default works > (like > > > > going > > > > > > > with > > > > > > > > > > option 2 and not setting "-XX:MaxDirectMemorySize" at > all), > > > > then > > > > > I > > > > > > > > think > > > > > > > > > it > > > > > > > > > > should be okay to set "-XX:MaxDirectMemorySize" to > > > > > > > > > > "off_heap_managed_memory + direct_memory" even if we use > > > > RocksDB. > > > > > > > That > > > > > > > > > is a > > > > > > > > > > big if, though, I honestly have no idea :D Would be good > to > > > > > > > understand > > > > > > > > > > this, though, because this would affect option (2) and > > option > > > > > > (1.2). > > > > > > > > > > > > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song < > > > > > > [hidden email]> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Thanks for the inputs, Jingsong. > > > > > > > > > > > > > > > > > > > > > > Let me try to summarize your points. Please correct me > if > > > I'm > > > > > > > wrong. 
- Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
- JVM does not wait for GC when allocating direct memory buffers. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory slower than the direct memory is allocated.

Am I understanding this correctly?

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email].invalid> wrote:

Hi stephan:

About option 2:
If additional threads are not cleanly shut down before we can exit the task: in the current case of memory reuse, the task has freed up the memory it uses. If this memory is then used by other tasks while asynchronous threads of the exited task may still be writing, there will be concurrency problems, and it can even lead to errors in user computing results.
So I think this is a serious and intolerable bug. No matter what the option is, it should be avoided.

About direct memory cleaned by GC:
I don't think it is a good idea. I've encountered so many situations where GC came too late and caused a DirectMemory OOM. Release and allocation of DirectMemory depend on the type of user job, which is often beyond our control.
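(As a rough, self-contained illustration of the behaviour Jingsong and Xintong describe above, and of the cleanup logic in the java.nio.Bits code Stephan linked: this is not Flink code, and the class name and sizes are made up. In the JDK 8 sources, Bits.reserveMemory() retries, processes pending references and calls System.gc() before failing, but that only helps if the direct buffers are actually unreachable; if buffers are still referenced, or allocation outpaces reclamation, the limit is hit regardless of GC.)

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Run with e.g.: java -XX:MaxDirectMemorySize=64m DirectMemoryPressure
    public class DirectMemoryPressure {

        public static void main(String[] args) {
            List<ByteBuffer> retained = new ArrayList<>();
            try {
                while (true) {
                    // Holding on to the buffers prevents the GC from ever reclaiming
                    // the native memory behind them, so the limit is eventually hit.
                    retained.add(ByteBuffer.allocateDirect(8 * 1024 * 1024));
                }
            } catch (OutOfMemoryError e) {
                // "OutOfMemoryError: Direct buffer memory"
                System.out.println("Direct memory limit reached after "
                        + retained.size() + " buffers: " + e.getMessage());
            }
        }
    }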
Best,
Jingsong Lee

------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Send Time: Mon, Aug 19, 2019 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

My main concern with option 2 (manually release memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.

The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already, that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarized in this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep it segfault safe for the memory manager, as long as we always de-allocate the memory segment when it is released from the memory consumers. Only if the memory consumer continues using the buffer of a memory segment after releasing it would we get a segfault, in which case we do want the job to fail so we detect the memory leak early.

For option 1.2, I don't think this is a good idea. Not only because the assumption (regular GC is enough to clean direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If the libraries actually use more direct memory than configured, which cannot be cleaned by GC because it is still in use, it may lead to overuse of the total container memory. In that case, if it didn't touch the JVM default max direct memory limit, we cannot get a direct memory OOM and it will become super hard to understand which part of the configuration needs to be updated.

For option 1.1, it has a similar problem as 1.2 if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
  - The "-XX:MaxDirectMemorySize" (option 1.1) is one way to do this
  - Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

- Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

- Option 2. It is actually quite simple to implement, we could try how segfault safe we are at the moment.

- Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by JVM max direct memory. The only parts of memory limited by JVM max direct memory are task off-heap memory and JVM overhead, which are exactly what alternative 2 suggests to set the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.
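(To make the distinction in the last few messages concrete, here is a minimal sketch, not Flink's MemorySegment code and with made-up class and variable names: direct buffers are counted against -XX:MaxDirectMemorySize and their native memory is reclaimed only when the buffer objects are garbage collected, while Unsafe-allocated memory is plain native memory that the JVM neither limits nor reclaims.)

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;
    import sun.misc.Unsafe;

    public class NativeVsDirect {

        public static void main(String[] args) throws Exception {
            // Counted against -XX:MaxDirectMemorySize; freed only via GC.
            ByteBuffer direct = ByteBuffer.allocateDirect(1024);

            // Unsafe.allocateMemory() hands out plain native memory: invisible to the
            // direct-memory limit, never touched by the GC, must be freed manually.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);

            long address = unsafe.allocateMemory(1024);
            unsafe.putLong(address, 42L);
            System.out.println("direct capacity = " + direct.capacity()
                    + ", value at native address = " + unsafe.getLong(address));

            // Forgetting this call leaks native memory; using 'address' after it is
            // undefined and can crash the JVM (the segfault concern with option 2).
            unsafe.freeMemory(address);
        }
    }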
I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

- Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given the total process memory remains 1GB.
- For alternative 3, there is no chance of direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till > > Rohrmann > > > < > > > > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I guess you have to help me understand > the > > > > > > difference > > > > > > > > > > between > > > > > > > > > > > > > > > > > alternative 2 > > > > > > > > > > > > > > > > > > and 3 wrt to memory under utilization > > > Xintong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Alternative 2: set > XX:MaxDirectMemorySize > > > to > > > > > Task > > > > > > > > > > Off-Heap > > > > > > > > > > > > > Memory > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > JVM > > > > > > > > > > > > > > > > > > Overhead. Then there is the risk that > this > > > size > > > > > is > > > > > > > too > > > > > > > > > low > > > > > > > > > > > > > > resulting > > > > > > > > > > > > > > > > in a > > > > > > > > > > > > > > > > > > lot of garbage collection and potentially > > an > > > > OOM. > > > > > > > > > > > > > > > > > > - Alternative 3: set > XX:MaxDirectMemorySize > > > to > > > > > > > > something > > > > > > > > > > > larger > > > > > > > > > > > > > > than > > > > > > > > > > > > > > > > > > alternative 2. This would of course > reduce > > > the > > > > > > sizes > > > > > > > of > > > > > > > > > the > > > > > > > > > > > > other > > > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > types. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > How would alternative 2 now result in an > > > under > > > > > > > > > utilization > > > > > > > > > > of > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > compared to alternative 3? If > alternative 3 > > > > > > strictly > > > > > > > > > sets a > > > > > > > > > > > > > higher > > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > > > direct memory size and we use only > little, > > > > then I > > > > > > > would > > > > > > > > > > > expect > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > alternative 3 results in memory under > > > > > utilization. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM Yang > Wang < > > > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong,till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > My point is setting a very large max > > direct > > > > > > memory > > > > > > > > size > > > > > > > > > > > when > > > > > > > > > > > > we > > > > > > > > > > > > > > do > > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > > > differentiate direct and native memory. 
> > If > > > > the > > > > > > > direct > > > > > > > > > > > > > > > > memory,including > > > > > > > > > > > > > > > > > > user > > > > > > > > > > > > > > > > > > > direct memory and framework direct > > > > memory,could > > > > > > be > > > > > > > > > > > calculated > > > > > > > > > > > > > > > > > > > correctly,then > > > > > > > > > > > > > > > > > > > i am in favor of setting direct memory > > with > > > > > fixed > > > > > > > > > value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I agree with xintong. For Yarn and > k8s,we > > > > need > > > > > to > > > > > > > > check > > > > > > > > > > the > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > > configurations in client to avoid > > > submitting > > > > > > > > > successfully > > > > > > > > > > > and > > > > > > > > > > > > > > > failing > > > > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > > > > > the flink master. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yang > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email] > > > > > >于2019年8月13日 > > > > > > > > > > 周二22:07写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > About MemorySegment, I think you are > > > right > > > > > that > > > > > > > we > > > > > > > > > > should > > > > > > > > > > > > not > > > > > > > > > > > > > > > > include > > > > > > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > > > > issue in the scope of this FLIP. This > > > FLIP > > > > > > should > > > > > > > > > > > > concentrate > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > how > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > configure memory pools for > > TaskExecutors, > > > > > with > > > > > > > > > minimum > > > > > > > > > > > > > > > involvement > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > > how > > > > > > > > > > > > > > > > > > > > memory consumers use it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > About direct memory, I think > > alternative > > > 3 > > > > > may > > > > > > > not > > > > > > > > > > having > > > > > > > > > > > > the > > > > > > > > > > > > > > > same > > > > > > > > > > > > > > > > > over > > > > > > > > > > > > > > > > > > > > reservation issue that alternative 2 > > > does, > > > > > but > > > > > > at > > > > > > > > the > > > > > > > > > > > cost > > > > > > > > > > > > of > > > > > > > > > > > > > > > risk > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > > > over > > > > > > > > > > > > > > > > > > > > using memory at the container level, > > > which > > > > is > > > > > > not > > > > > > > > > good. 
> > > > > > > > > > > My > > > > > > > > > > > > > > point > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and "JVM > > > > > Overhead" > > > > > > > are > > > > > > > > > not > > > > > > > > > > > easy > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > config. > > > > > > > > > > > > > > > > > > > For > > > > > > > > > > > > > > > > > > > > alternative 2, users might configure > > them > > > > > > higher > > > > > > > > than > > > > > > > > > > > what > > > > > > > > > > > > > > > actually > > > > > > > > > > > > > > > > > > > needed, > > > > > > > > > > > > > > > > > > > > just to avoid getting a direct OOM. > For > > > > > > > alternative > > > > > > > > > 3, > > > > > > > > > > > > users > > > > > > > > > > > > > do > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > get > > > > > > > > > > > > > > > > > > > > direct OOM, so they may not config > the > > > two > > > > > > > options > > > > > > > > > > > > > aggressively > > > > > > > > > > > > > > > > high. > > > > > > > > > > > > > > > > > > But > > > > > > > > > > > > > > > > > > > > the consequences are risks of overall > > > > > container > > > > > > > > > memory > > > > > > > > > > > > usage > > > > > > > > > > > > > > > > exceeds > > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > > > budget. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM Till > > > > > Rohrmann < > > > > > > > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP > > Xintong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > All in all I think it already looks > > > quite > > > > > > good. > > > > > > > > > > > > Concerning > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > first > > > > > > > > > > > > > > > > > > > open > > > > > > > > > > > > > > > > > > > > > question about allocating memory > > > > segments, > > > > > I > > > > > > > was > > > > > > > > > > > > wondering > > > > > > > > > > > > > > > > whether > > > > > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > > > > strictly necessary to do in the > > context > > > > of > > > > > > this > > > > > > > > > FLIP > > > > > > > > > > or > > > > > > > > > > > > > > whether > > > > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > > > > > > > be done as a follow up? 
Without > > knowing > > > > all > > > > > > > > > details, > > > > > > > > > > I > > > > > > > > > > > > > would > > > > > > > > > > > > > > be > > > > > > > > > > > > > > > > > > > concerned > > > > > > > > > > > > > > > > > > > > > that we would widen the scope of > this > > > > FLIP > > > > > > too > > > > > > > > much > > > > > > > > > > > > because > > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > > > > > have > > > > > > > > > > > > > > > > > > > > > to touch all the existing call > sites > > of > > > > the > > > > > > > > > > > MemoryManager > > > > > > > > > > > > > > where > > > > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > > > > > allocate > > > > > > > > > > > > > > > > > > > > > memory segments (this should mainly > > be > > > > > batch > > > > > > > > > > > operators). > > > > > > > > > > > > > The > > > > > > > > > > > > > > > > > addition > > > > > > > > > > > > > > > > > > > of > > > > > > > > > > > > > > > > > > > > > the memory reservation call to the > > > > > > > MemoryManager > > > > > > > > > > should > > > > > > > > > > > > not > > > > > > > > > > > > > > be > > > > > > > > > > > > > > > > > > affected > > > > > > > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > > > > > this and I would hope that this is > > the > > > > only > > > > > > > point > > > > > > > > > of > > > > > > > > > > > > > > > interaction > > > > > > > > > > > > > > > > a > > > > > > > > > > > > > > > > > > > > > streaming job would have with the > > > > > > > MemoryManager. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Concerning the second open question > > > about > > > > > > > setting > > > > > > > > > or > > > > > > > > > > > not > > > > > > > > > > > > > > > setting > > > > > > > > > > > > > > > > a > > > > > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > > > > > > direct memory limit, I would also > be > > > > > > interested > > > > > > > > why > > > > > > > > > > > Yang > > > > > > > > > > > > > Wang > > > > > > > > > > > > > > > > > thinks > > > > > > > > > > > > > > > > > > > > > leaving it open would be best. My > > > concern > > > > > > about > > > > > > > > > this > > > > > > > > > > > > would > > > > > > > > > > > > > be > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > > > > > > > be in a similar situation as we are > > now > > > > > with > > > > > > > the > > > > > > > > > > > > > > > > > RocksDBStateBackend. > > > > > > > > > > > > > > > > > > > If > > > > > > > > > > > > > > > > > > > > > the different memory pools are not > > > > clearly > > > > > > > > > separated > > > > > > > > > > > and > > > > > > > > > > > > > can > > > > > > > > > > > > > > > > spill > > > > > > > > > > > > > > > > > > over > > > > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > > a different pool, then it is quite > > hard > > > > to > > > > > > > > > understand > > > > > > > > > > > > what > > > > > > > > > > > > > > > > exactly > > > > > > > > > > > > > > > > > > > > causes a > > > > > > > > > > > > > > > > > > > > > process to get killed for using too > > > much > > > > > > > memory. 
> > > > > > > > > This > > > > > > > > > > > > could > > > > > > > > > > > > > > > then > > > > > > > > > > > > > > > > > > easily > > > > > > > > > > > > > > > > > > > > > lead to a similar situation what we > > > have > > > > > with > > > > > > > the > > > > > > > > > > > > > > cutoff-ratio. > > > > > > > > > > > > > > > > So > > > > > > > > > > > > > > > > > > why > > > > > > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > > > > > setting a sane default value for > max > > > > direct > > > > > > > > memory > > > > > > > > > > and > > > > > > > > > > > > > giving > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > > > > user > > > > > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > > > > > option to increase it if he runs > into > > > an > > > > > OOM. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > @Xintong, how would alternative 2 > > lead > > > to > > > > > > lower > > > > > > > > > > memory > > > > > > > > > > > > > > > > utilization > > > > > > > > > > > > > > > > > > than > > > > > > > > > > > > > > > > > > > > > alternative 3 where we set the > direct > > > > > memory > > > > > > > to a > > > > > > > > > > > higher > > > > > > > > > > > > > > value? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM > > Xintong > > > > > Song < > > > > > > > > > > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, Yang. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > > > > > > > > > > > > > > > > > > > > > > I think setting a very large max > > > direct > > > > > > > memory > > > > > > > > > size > > > > > > > > > > > > > > > definitely > > > > > > > > > > > > > > > > > has > > > > > > > > > > > > > > > > > > > some > > > > > > > > > > > > > > > > > > > > > > good sides. E.g., we do not worry > > > about > > > > > > > direct > > > > > > > > > OOM, > > > > > > > > > > > and > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > don't > > > > > > > > > > > > > > > > > > even > > > > > > > > > > > > > > > > > > > > > need > > > > > > > > > > > > > > > > > > > > > > to allocate managed / network > > memory > > > > with > > > > > > > > > > > > > > Unsafe.allocate() . > > > > > > > > > > > > > > > > > > > > > > However, there are also some down > > > sides > > > > > of > > > > > > > > doing > > > > > > > > > > > this. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - One thing I can think of is > > that > > > > if > > > > > a > > > > > > > task > > > > > > > > > > > > executor > > > > > > > > > > > > > > > > > container > > > > > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > > > > > > killed due to overusing > memory, > > it > > > > > could > > > > > > > be > > > > > > > > > hard > > > > > > > > > > > for > > > > > > > > > > > > > use > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > know > > > > > > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > > > > > > part > > > > > > > > > > > > > > > > > > > > > > of the memory is overused. > > > > > > > > > > > > > > > > > > > > > > - Another down side is that > the > > > JVM > > > > > > never > > > > > > > > > > trigger > > > > > > > > > > > GC > > > > > > > > > > > > > due > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > reaching > > > > > > > > > > > > > > > > > > > > > max > > > > > > > > > > > > > > > > > > > > > > direct memory limit, because > the > > > > limit > > > > > > is > > > > > > > > too > > > > > > > > > > high > > > > > > > > > > > > to > > > > > > > > > > > > > be > > > > > > > > > > > > > > > > > > reached. > > > > > > > > > > > > > > > > > > > > That > > > > > > > > > > > > > > > > > > > > > > means we kind of relay on heap > > > > memory > > > > > to > > > > > > > > > trigger > > > > > > > > > > > GC > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > release > > > > > > > > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > > > > > > > memory. That could be a > problem > > in > > > > > cases > > > > > > > > where > > > > > > > > > > we > > > > > > > > > > > > have > > > > > > > > > > > > > > > more > > > > > > > > > > > > > > > > > > direct > > > > > > > > > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > > > > > usage but not enough heap > > activity > > > > to > > > > > > > > trigger > > > > > > > > > > the > > > > > > > > > > > > GC. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Maybe you can share your reasons > > for > > > > > > > preferring > > > > > > > > > > > > setting a > > > > > > > > > > > > > > > very > > > > > > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > > > > > > value, > > > > > > > > > > > > > > > > > > > > > > if there are anything else I > > > > overlooked. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > > > > > > > > > > > > > > > > > > > > > > If there is any conflict between > > > > multiple > > > > > > > > > > > configuration > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > user > > > > > > > > > > > > > > > > > > > > > > explicitly specified, I think we > > > should > > > > > > throw > > > > > > > > an > > > > > > > > > > > error. 
> > > > > > > > > > > > > > > > > > > > > > I think doing checking on the > > client > > > > side > > > > > > is > > > > > > > a > > > > > > > > > good > > > > > > > > > > > > idea, > > > > > > > > > > > > > > so > > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > > > > > Yarn / > > > > > > > > > > > > > > > > > > > > > > K8s we can discover the problem > > > before > > > > > > > > submitting > > > > > > > > > > the > > > > > > > > > > > > > Flink > > > > > > > > > > > > > > > > > > cluster, > > > > > > > > > > > > > > > > > > > > > which > > > > > > > > > > > > > > > > > > > > > > is always a good thing. > > > > > > > > > > > > > > > > > > > > > > But we can not only rely on the > > > client > > > > > side > > > > > > > > > > checking, > > > > > > > > > > > > > > because > > > > > > > > > > > > > > > > for > > > > > > > > > > > > > > > > > > > > > > standalone cluster TaskManagers > on > > > > > > different > > > > > > > > > > machines > > > > > > > > > > > > may > > > > > > > > > > > > > > > have > > > > > > > > > > > > > > > > > > > > different > > > > > > > > > > > > > > > > > > > > > > configurations and the client > does > > > see > > > > > > that. > > > > > > > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 PM > Yang > > > > Wang > > > > > < > > > > > > > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed > > proposal. > > > > > After > > > > > > > all > > > > > > > > > the > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > > configuration > > > > > > > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > > > > > > > introduced, it will be more > > > powerful > > > > to > > > > > > > > control > > > > > > > > > > the > > > > > > > > > > > > > flink > > > > > > > > > > > > > > > > > memory > > > > > > > > > > > > > > > > > > > > > usage. I > > > > > > > > > > > > > > > > > > > > > > > just have few questions about > it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Native and Direct Memory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We do not differentiate user > > direct > > > > > > memory > > > > > > > > and > > > > > > > > > > > native > > > > > > > > > > > > > > > memory. > > > > > > > > > > > > > > > > > > They > > > > > > > > > > > > > > > > > > > > are > > > > > > > > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > > > > > > > > > included in task off-heap > memory. > > > > > Right? 
> > > > > > > So i > > > > > > > > > > don’t > > > > > > > > > > > > > think > > > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > > > > > > not > > > > > > > > > > > > > > > > > > > > > > set > > > > > > > > > > > > > > > > > > > > > > > the -XX:MaxDirectMemorySize > > > > properly. I > > > > > > > > prefer > > > > > > > > > > > > leaving > > > > > > > > > > > > > > it a > > > > > > > > > > > > > > > > > very > > > > > > > > > > > > > > > > > > > > large > > > > > > > > > > > > > > > > > > > > > > > value. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the sum of and fine-grained > > > > > > > memory(network > > > > > > > > > > > memory, > > > > > > > > > > > > > > > managed > > > > > > > > > > > > > > > > > > > memory, > > > > > > > > > > > > > > > > > > > > > > etc.) > > > > > > > > > > > > > > > > > > > > > > > is larger than total process > > > memory, > > > > > how > > > > > > do > > > > > > > > we > > > > > > > > > > deal > > > > > > > > > > > > > with > > > > > > > > > > > > > > > this > > > > > > > > > > > > > > > > > > > > > situation? > > > > > > > > > > > > > > > > > > > > > > Do > > > > > > > > > > > > > > > > > > > > > > > we need to check the memory > > > > > configuration > > > > > > > in > > > > > > > > > > > client? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > > > [hidden email]> > > > > > > > > > > 于2019年8月7日周三 > > > > > > > > > > > > > > > 下午10:14写道: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a > > > discussion > > > > > > > thread > > > > > > > > on > > > > > > > > > > > > > "FLIP-49: > > > > > > > > > > > > > > > > > Unified > > > > > > > > > > > > > > > > > > > > > Memory > > > > > > > > > > > > > > > > > > > > > > > > Configuration for > > > > TaskExecutors"[1], > > > > > > > where > > > > > > > > we > > > > > > > > > > > > > describe > > > > > > > > > > > > > > > how > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > > > > > > improve > > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory > > > configurations. > > > > > The > > > > > > > > FLIP > > > > > > > > > > > > document > > > > > > > > > > > > > > is > > > > > > > > > > > > > > > > > mostly > > > > > > > > > > > > > > > > > > > > based > > > > > > > > > > > > > > > > > > > > > > on > > > > > > > > > > > > > > > > > > > > > > > an > > > > > > > > > > > > > > > > > > > > > > > > early design "Memory > Management > > > and > > > > > > > > > > Configuration > > > > > > > > > > > > > > > > > Reloaded"[2] > > > > > > > > > > > > > > > > > > by > > > > > > > > > > > > > > > > > > > > > > > Stephan, > > > > > > > > > > > > > > > > > > > > > > > > with updates from follow-up > > > > > discussions > > > > > > > > both > > > > > > > > > > > online > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > offline. 
This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned into accounted individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.
Thank you~

Xintong Song


[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

[2]
https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
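(Earlier in the thread Yang Wang suggested checking the memory configuration on the client for Yarn / K8s, and Xintong suggested failing with an error when explicitly configured options conflict. A minimal hypothetical sketch of such a fail-fast check follows; none of the names below are real FLIP-49 classes or configuration keys.)

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class MemorySpecCheck {

        // Throws early, e.g. on the client side, if the fine-grained pools cannot fit
        // into the configured total process memory.
        static void validate(Map<String, Long> pools, long totalProcessMemoryBytes) {
            long sum = pools.values().stream().mapToLong(Long::longValue).sum();
            if (sum > totalProcessMemoryBytes) {
                throw new IllegalArgumentException(
                        "Sum of configured memory pools (" + sum
                                + " bytes) exceeds total process memory ("
                                + totalProcessMemoryBytes + " bytes)");
            }
        }

        public static void main(String[] args) {
            long mb = 1L << 20;
            Map<String, Long> pools = new LinkedHashMap<>();
            pools.put("heap", 400 * mb);
            pools.put("managed (off-heap)", 300 * mb);
            pools.put("network", 100 * mb);
            pools.put("task off-heap + JVM overhead", 200 * mb);

            validate(pools, 1000 * mb);   // passes; with 900 * mb it would throw
        }
    }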
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Yang, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding your concern, I think what proposed > in > > > this > > > > > > FLIP > > > > > > > it > > > > > > > > > to > > > > > > > > > > > have > > > > > > > > > > > > > > both > > > > > > > > > > > > > > > off-heap managed memory and network memory > > > allocated > > > > > > > through > > > > > > > > > > > > > > > Unsafe.allocate(), which means they are > > practically > > > > > > native > > > > > > > > > memory > > > > > > > > > > > and > > > > > > > > > > > > > not > > > > > > > > > > > > > > > limited by JVM max direct memory. The only > parts > > of > > > > > > memory > > > > > > > > > > limited > > > > > > > > > > > by > > > > > > > > > > > > > JVM > > > > > > > > > > > > > > > max direct memory are task off-heap memory and > > JVM > > > > > > > overhead, > > > > > > > > > > which > > > > > > > > > > > > are > > > > > > > > > > > > > > > exactly alternative 2 suggests to set the JVM > max > > > > > direct > > > > > > > > memory > > > > > > > > > > to. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann < > > > > > > > > > > > [hidden email]> > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I > > > understand > > > > > the > > > > > > > two > > > > > > > > > > > > > alternatives > > > > > > > > > > > > > > > > now. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I would be in favour of option 2 because it > > makes > > > > > > things > > > > > > > > > > > explicit. > > > > > > > > > > > > If > > > > > > > > > > > > > > we > > > > > > > > > > > > > > > > don't limit the direct memory, I fear that we > > > might > > > > > end > > > > > > > up > > > > > > > > > in a > > > > > > > > > > > > > similar > > > > > > > > > > > > > > > > situation as we are currently in: The user > > might > > > > see > > > > > > that > > > > > > > > her > > > > > > > > > > > > process > > > > > > > > > > > > > > > gets > > > > > > > > > > > > > > > > killed by the OS and does not know why this > is > > > the > > > > > > case. > > > > > > > > > > > > > Consequently, > > > > > > > > > > > > > > > she > > > > > > > > > > > > > > > > tries to decrease the process memory size > > > (similar > > > > to > > > > > > > > > > increasing > > > > > > > > > > > > the > > > > > > > > > > > > > > > cutoff > > > > > > > > > > > > > > > > ratio) in order to accommodate for the extra > > > direct > > > > > > > memory. > > > > > > > > > > Even > > > > > > > > > > > > > worse, > > > > > > > > > > > > > > > she > > > > > > > > > > > > > > > > tries to decrease memory budgets which are > not > > > > fully > > > > > > used > > > > > > > > and > > > > > > > > > > > hence > > > > > > > > > > > > > > won't > > > > > > > > > > > > > > > > change the overall memory consumption. 
On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:
> Let me explain this with a concrete example Till.
>
> Let's say we have the following scenario.
>
> Total Process Memory: 1GB
> JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
> Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB
>
> For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
> For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then
>
> - Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given the total process memory remains 1GB.
> - For alternative 3, there is no chance of direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.
>
> Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
>
> Thank you~
> Xintong Song
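To make the example above easy to replay, here is a minimal, self-contained sketch (plain Java; the class and variable names are purely illustrative, not Flink APIs) that derives the -XX:MaxDirectMemorySize value each alternative would pass to the JVM for the 1GB scenario:

    /** Illustrative only: reproduces the 1 GB example discussed above. */
    public class MaxDirectMemoryExample {

        static final long MB = 1024L * 1024L;

        public static void main(String[] args) {
            long totalProcessMemory = 1024 * MB;            // the 1 GB container
            long taskOffHeapPlusJvmOverhead = 200 * MB;     // "JVM Direct Memory" in the example
            long otherMemory = totalProcessMemory - taskOffHeapPlusJvmOverhead; // 800 MB

            // Alternative 2: cap direct memory exactly at task off-heap + JVM overhead.
            long alt2MaxDirect = taskOffHeapPlusJvmOverhead;

            // Alternative 3: leave the limit effectively unbounded, e.g. 1 TB.
            long alt3MaxDirect = 1024L * 1024L * MB;

            System.out.println("other pools (heap, metaspace, managed, network): " + (otherMemory / MB) + "m");
            System.out.println("alternative 2: -XX:MaxDirectMemorySize=" + (alt2MaxDirect / MB) + "m");
            System.out.println("alternative 3: -XX:MaxDirectMemorySize=" + (alt3MaxDirect / MB) + "m");
        }
    }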
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:
> I guess you have to help me understand the difference between alternative 2 and 3 wrt to memory under utilization Xintong.
>
> - Alternative 2: set XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low resulting in a lot of garbage collection and potentially an OOM.
> - Alternative 3: set XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.
>
> How would alternative 2 now result in an under utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under utilization.
>
> Cheers,
> Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:
> Hi xintong, till
>
> Native and Direct Memory
> My point is setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then i am in favor of setting direct memory with fixed value.
>
> Memory Calculation
> I agree with xintong. For Yarn and k8s, we need to check the memory configurations in client to avoid submitting successfully and failing in the flink master.
>
> Best,
> Yang
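As a rough illustration of the client-side check suggested here, the validation could look something like the following sketch (illustrative only, not actual Flink code; the class and parameter names are made up):

    /** Illustrative sketch of a client-side memory configuration sanity check. */
    public class MemoryConfigCheck {

        public static void checkConsistency(long totalProcessMemoryBytes,
                                             long networkMemoryBytes,
                                             long managedMemoryBytes,
                                             long metaspaceBytes,
                                             long jvmOverheadBytes) {
            long fineGrainedSum = networkMemoryBytes + managedMemoryBytes
                    + metaspaceBytes + jvmOverheadBytes;
            if (fineGrainedSum > totalProcessMemoryBytes) {
                // Fail before the cluster is submitted to Yarn / K8s, instead of
                // failing later in the Flink master.
                throw new IllegalArgumentException(
                        "Sum of explicitly configured memory pools (" + fineGrainedSum
                                + " bytes) exceeds the configured total process memory ("
                                + totalProcessMemoryBytes + " bytes).");
            }
        }
    }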
On Tue, Aug 13, 2019 at 22:07, Xintong Song <[hidden email]> wrote:
> Thanks for replying, Till.
>
> About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement on how memory consumers use it.
>
> About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of the risk of over-using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to config. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOM, so they may not config the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.
>
> Thank you~
> Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:
> Thanks for proposing this FLIP Xintong.
>
> All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
>
> Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM.
>
> @Xintong, how would alternative 2 lead to lower memory utilization than alternative 3 where we set the direct memory to a higher value?
>
> Cheers,
> Till
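For readers following the reservation idea, below is a minimal sketch of what a reservation-style call next to the existing page-based allocation could look like (the interface and method names are assumptions for illustration, not the actual MemoryManager API):

    /** Illustrative sketch: page allocation for batch operators, reservation for e.g. RocksDB. */
    public interface ManagedMemoryAccess {

        /** Existing style: hand out fixed-size memory segments, mainly used by batch operators. */
        java.util.List<java.nio.ByteBuffer> allocatePages(Object owner, int numPages);

        /** Reservation style: account for managed memory that is used elsewhere, e.g. by a state backend. */
        void reserveMemory(Object owner, long sizeBytes);

        /** Release a previous reservation once the consumer is done with it. */
        void releaseMemory(Object owner, long sizeBytes);
    }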
On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:
> Thanks for the feedback, Yang.
>
> Regarding your comments:
>
> *Native and Direct Memory*
> I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some down sides of doing this.
>
> - One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
> - Another down side is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
>
> Maybe you can share your reasons for preferring to set a very large value, if there is anything else I overlooked.
>
> *Memory Calculation*
> If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error. I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot only rely on the client side checking, because for standalone clusters TaskManagers on different machines may have different configurations and the client does not see that. What do you think?
>
> Thank you~
> Xintong Song
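The GC point above can be reproduced with a small experiment. The sketch below is illustrative only; the exact behavior depends on the JDK version and GC settings. Run it once with a tight -XX:MaxDirectMemorySize (e.g. 256m) and once with a very large value:

    import java.nio.ByteBuffer;

    /**
     * Illustrative experiment: with a tight limit, the JDK's direct memory accounting
     * triggers reference processing / System.gc() when the limit is reached, so dropped
     * buffers get freed and the loop keeps running. With a very large limit that point is
     * never reached, and since this loop produces almost no heap garbage, nothing prompts
     * a GC that would free the dropped buffers, so the native footprint can keep growing.
     */
    public class DirectBufferGrowth {
        public static void main(String[] args) {
            for (int i = 0; i < 10_000; i++) {
                ByteBuffer buffer = ByteBuffer.allocateDirect(1024 * 1024); // 1 MB of native memory
                buffer.put(0, (byte) 1); // touch it, then drop the reference immediately
                if (i % 1000 == 0) {
                    System.out.println("allocated " + i + " buffers so far");
                }
            }
        }
    }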
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:
> Hi xintong,
>
> Thanks for your detailed proposal. After all the memory configurations are introduced, it will be more powerful to control the flink memory usage. I just have a few questions about it.
>
> - Native and Direct Memory
> We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.
>
> - Memory Calculation
> If the sum of the fine-grained memory (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

On Wed, Aug 7, 2019 at 22:14, Xintong Song <[hidden email]> wrote:
> Hi everyone,
>
> We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.
>
> This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.
>
> - Different configuration for Streaming and Batch.
> - Complex and difficult configuration of RocksDB in Streaming.
> - Complicated, uncertain and hard to understand.
>
> Key changes to solve the problems can be summarized as follows.
>
> - Extend the memory manager to also account for memory usage by state backends.
> - Modify how TaskExecutor memory is partitioned into accounted individual memory reservations and pools.
> - Simplify memory configuration options and calculation logics.
>
> Please find more details in the FLIP wiki document [1].
>
> (Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)
>
> Looking forward to your feedbacks.
>
> Thank you~
> Xintong Song
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> [2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
What I forgot to add is that we could tackle specifying the configuration
fully in an incremental way and that the full specification should be the desired end state. On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote: > I think our goal should be that the configuration is fully specified when > the process is started. By considering the internal calculation step to be > rather validate existing values and calculate missing ones, these two > proposal shouldn't even conflict (given determinism). > > Since we don't want to change an existing flink-conf.yaml, specifying the > full configuration would require to pass in the options differently. > > One way could be the ENV variables approach. The reason why I'm trying to > exclude this feature from the FLIP is that I believe it needs a bit more > discussion. Just some questions which come to my mind: What would be the > exact format (FLINK_KEY_NAME)? Would we support a dot separator which is > supported by some systems (FLINK.KEY.NAME)? If we accept the dot > separator what would be the order of precedence if there are two ENV > variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the > precedence of env variable vs. dynamic configuration value specified via -D? > > Another approach could be to pass in the dynamic configuration values via > `-Dkey=value` to the Flink process. For that we don't have to change > anything because the functionality already exists. > > Cheers, > Till > > On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote: > >> I see. Under the assumption of strict determinism that should work. >> >> The original proposal had this point "don't compute inside the TM, compute >> outside and supply a full config", because that sounded more intuitive. >> >> On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> >> wrote: >> >> > My understanding was that before starting the Flink process we call a >> > utility which calculates these values. I assume that this utility will >> do >> > the calculation based on a set of configured values (process memory, >> flink >> > memory, network memory etc.). Assuming that these values don't differ >> from >> > the values with which the JVM is started, it should be possible to >> > recompute them in the Flink process in order to set the values. >> > >> > >> > >> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote: >> > >> > > When computing the values in the JVM process after it started, how >> would >> > > you deal with values like Max Direct Memory, Metaspace size. native >> > memory >> > > reservation (reduce heap size), etc? All the values that are >> parameters >> > to >> > > the JVM process and that need to be supplied at process startup? >> > > >> > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> >> > > wrote: >> > > >> > > > Thanks for the clarification. I have some more comments: >> > > > >> > > > - I would actually split the logic to compute the process memory >> > > > requirements and storing the values into two things. E.g. one could >> > name >> > > > the former TaskExecutorProcessUtility and the latter >> > > > TaskExecutorProcessMemory. But we can discuss this on the PR since >> it's >> > > > just a naming detail. >> > > > >> > > > - Generally, I'm not opposed to making configuration values >> overridable >> > > by >> > > > ENV variables. I think this is a very good idea and makes the >> > > > configurability of Flink processes easier. 
However, I think that >> adding >> > > > this functionality should not be part of this FLIP because it would >> > > simply >> > > > widen the scope unnecessarily. >> > > > >> > > > The reasons why I believe it is unnecessary are the following: For >> Yarn >> > > we >> > > > already create write a flink-conf.yaml which could be populated with >> > the >> > > > memory settings. For the other processes it should not make a >> > difference >> > > > whether the loaded Configuration is populated with the memory >> settings >> > > from >> > > > ENV variables or by using TaskExecutorProcessUtility to compute the >> > > missing >> > > > values from the loaded configuration. If the latter would not be >> > possible >> > > > (wrong or missing configuration values), then we should not have >> been >> > > able >> > > > to actually start the process in the first place. >> > > > >> > > > - Concerning the memory reservation: I agree with you that we need >> the >> > > > memory reservation functionality to make streaming jobs work with >> > > "managed" >> > > > memory. However, w/o this functionality the whole Flip would already >> > > bring >> > > > a good amount of improvements to our users when running batch jobs. >> > > > Moreover, by keeping the scope smaller we can complete the FLIP >> faster. >> > > > Hence, I would propose to address the memory reservation >> functionality >> > > as a >> > > > follow up FLIP (which Yu is working on if I'm not mistaken). >> > > > >> > > > Cheers, >> > > > Till >> > > > >> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> >> > > wrote: >> > > > >> > > > > Just add my 2 cents. >> > > > > >> > > > > Using environment variables to override the configuration for >> > different >> > > > > taskmanagers is better. >> > > > > We do not need to generate dedicated flink-conf.yaml for all >> > > > taskmanagers. >> > > > > A common flink-conf.yam and different environment variables are >> > enough. >> > > > > By reducing the distributed cached files, it could make launching >> a >> > > > > taskmanager faster. >> > > > > >> > > > > Stephan gives a good suggestion that we could move the logic into >> > > > > "GlobalConfiguration.loadConfig()" method. >> > > > > Maybe the client could also benefit from this. Different users do >> not >> > > > have >> > > > > to export FLINK_CONF_DIR to update few config options. >> > > > > >> > > > > >> > > > > Best, >> > > > > Yang >> > > > > >> > > > > Stephan Ewen <[hidden email]> 于2019年8月28日周三 上午1:21写道: >> > > > > >> > > > > > One note on the Environment Variables and Configuration >> discussion. >> > > > > > >> > > > > > My understanding is that passed ENV variables are added to the >> > > > > > configuration in the "GlobalConfiguration.loadConfig()" method >> (or >> > > > > > similar). >> > > > > > For all the code inside Flink, it looks like the data was in the >> > > config >> > > > > to >> > > > > > start with, just that the scripts that compute the variables can >> > pass >> > > > the >> > > > > > values to the process without actually needing to write a file. >> > > > > > >> > > > > > For example the "GlobalConfiguration.loadConfig()" method would >> > take >> > > > any >> > > > > > ENV variable prefixed with "flink" and add it as a config key. >> > > > > > "flink_taskmanager_memory_size=2g" would become >> > > > "taskmanager.memory.size: >> > > > > > 2g". 
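A minimal sketch of the kind of ENV-to-config mapping described in the quoted note, assuming a "flink" prefix and underscore separators (illustrative only; this is not the actual GlobalConfiguration implementation, and the prefix/separator format is exactly one of the open questions above):

    import java.util.HashMap;
    import java.util.Map;

    /** Illustrative sketch: fold "flink"-prefixed environment variables into config keys. */
    public class EnvToConfigSketch {

        static Map<String, String> overridesFromEnv(Map<String, String> env) {
            Map<String, String> overrides = new HashMap<>();
            for (Map.Entry<String, String> entry : env.entrySet()) {
                String name = entry.getKey().toLowerCase();
                if (name.startsWith("flink_")) {
                    // e.g. flink_taskmanager_memory_size -> taskmanager.memory.size
                    String key = name.substring("flink_".length()).replace('_', '.');
                    overrides.put(key, entry.getValue());
                }
            }
            return overrides;
        }

        public static void main(String[] args) {
            Map<String, String> env = new HashMap<>(System.getenv());
            env.put("FLINK_TASKMANAGER_MEMORY_SIZE", "2g");
            System.out.println(overridesFromEnv(env)); // contains taskmanager.memory.size=2g
        }
    }

One visible limitation of a pure underscore mapping is that config keys containing dashes cannot be expressed, which is part of why the exact format and precedence questions matter.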
>> > > > > > >> > > > > > >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song < >> > [hidden email]> >> > > > > > wrote: >> > > > > > >> > > > > > > Thanks for the comments, Till. >> > > > > > > >> > > > > > > I've also seen your comments on the wiki page, but let's keep >> the >> > > > > > > discussion here. >> > > > > > > >> > > > > > > - Regarding 'TaskExecutorSpecifics', how do you think about >> > naming >> > > it >> > > > > > > 'TaskExecutorResourceSpecifics'. >> > > > > > > - Regarding passing memory configurations into task executors, >> > I'm >> > > in >> > > > > > favor >> > > > > > > of do it via environment variables rather than configurations, >> > with >> > > > the >> > > > > > > following two reasons. >> > > > > > > - It is easier to keep the memory options once calculate >> not to >> > > be >> > > > > > > changed with environment variables rather than configurations. >> > > > > > > - I'm not sure whether we should write the configuration in >> > > startup >> > > > > > > scripts. Writing changes into the configuration files when >> > running >> > > > the >> > > > > > > startup scripts does not sounds right to me. Or we could make >> a >> > > copy >> > > > of >> > > > > > > configuration files per flink cluster, and make the task >> executor >> > > to >> > > > > load >> > > > > > > from the copy, and clean up the copy after the cluster is >> > shutdown, >> > > > > which >> > > > > > > is complicated. (I think this is also what Stephan means in >> his >> > > > comment >> > > > > > on >> > > > > > > the wiki page?) >> > > > > > > - Regarding reserving memory, I think this change should be >> > > included >> > > > in >> > > > > > > this FLIP. I think a big part of motivations of this FLIP is >> to >> > > unify >> > > > > > > memory configuration for streaming / batch and make it easy >> for >> > > > > > configuring >> > > > > > > rocksdb memory. If we don't support memory reservation, then >> > > > streaming >> > > > > > jobs >> > > > > > > cannot use managed memory (neither on-heap or off-heap), which >> > > makes >> > > > > this >> > > > > > > FLIP incomplete. >> > > > > > > - Regarding network memory, I think you are right. I think we >> > > > probably >> > > > > > > don't need to change network stack from using direct memory to >> > > using >> > > > > > unsafe >> > > > > > > native memory. Network memory size is deterministic, cannot be >> > > > reserved >> > > > > > as >> > > > > > > managed memory does, and cannot be overused. I think it also >> > works >> > > if >> > > > > we >> > > > > > > simply keep using direct memory for network and include it in >> jvm >> > > max >> > > > > > > direct memory size. >> > > > > > > >> > > > > > > Thank you~ >> > > > > > > >> > > > > > > Xintong Song >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann < >> > > [hidden email]> >> > > > > > > wrote: >> > > > > > > >> > > > > > > > Hi Xintong, >> > > > > > > > >> > > > > > > > thanks for addressing the comments and adding a more >> detailed >> > > > > > > > implementation plan. I have a couple of comments concerning >> the >> > > > > > > > implementation plan: >> > > > > > > > >> > > > > > > > - The name `TaskExecutorSpecifics` is not really >> descriptive. >> > > > > Choosing >> > > > > > a >> > > > > > > > different name could help here. >> > > > > > > > - I'm not sure whether I would pass the memory >> configuration to >> > > the >> > > > > > > > TaskExecutor via environment variables. 
I think it would be >> > > better >> > > > to >> > > > > > > write >> > > > > > > > it into the configuration one uses to start the TM process. >> > > > > > > > - If possible, I would exclude the memory reservation from >> this >> > > > FLIP >> > > > > > and >> > > > > > > > add this as part of a dedicated FLIP. >> > > > > > > > - If possible, then I would exclude changes to the network >> > stack >> > > > from >> > > > > > > this >> > > > > > > > FLIP. Maybe we can simply say that the direct memory needed >> by >> > > the >> > > > > > > network >> > > > > > > > stack is the framework direct memory requirement. Changing >> how >> > > the >> > > > > > memory >> > > > > > > > is allocated can happen in a second step. This would keep >> the >> > > scope >> > > > > of >> > > > > > > this >> > > > > > > > FLIP smaller. >> > > > > > > > >> > > > > > > > Cheers, >> > > > > > > > Till >> > > > > > > > >> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < >> > > > [hidden email]> >> > > > > > > > wrote: >> > > > > > > > >> > > > > > > > > Hi everyone, >> > > > > > > > > >> > > > > > > > > I just updated the FLIP document on wiki [1], with the >> > > following >> > > > > > > changes. >> > > > > > > > > >> > > > > > > > > - Removed open question regarding MemorySegment >> > allocation. >> > > As >> > > > > > > > > discussed, we exclude this topic from the scope of this >> > > FLIP. >> > > > > > > > > - Updated content about JVM direct memory parameter >> > > according >> > > > to >> > > > > > > > recent >> > > > > > > > > discussions, and moved the other options to "Rejected >> > > > > > Alternatives" >> > > > > > > > for >> > > > > > > > > the >> > > > > > > > > moment. >> > > > > > > > > - Added implementation steps. >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > Thank you~ >> > > > > > > > > >> > > > > > > > > Xintong Song >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > [1] >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors >> > > > > > > > > >> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen < >> > [hidden email] >> > > > >> > > > > > wrote: >> > > > > > > > > >> > > > > > > > > > @Xintong: Concerning "wait for memory users before task >> > > dispose >> > > > > and >> > > > > > > > > memory >> > > > > > > > > > release": I agree, that's how it should be. Let's try it >> > out. >> > > > > > > > > > >> > > > > > > > > > @Xintong @Jingsong: Concerning " JVM does not wait for >> GC >> > > when >> > > > > > > > allocating >> > > > > > > > > > direct memory buffer": There seems to be pretty >> elaborate >> > > logic >> > > > > to >> > > > > > > free >> > > > > > > > > > buffers when allocating new ones. See >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 >> > > > > > > > > > >> > > > > > > > > > @Till: Maybe. 
If we assume that the JVM default works >> (like >> > > > going >> > > > > > > with >> > > > > > > > > > option 2 and not setting "-XX:MaxDirectMemorySize" at >> all), >> > > > then >> > > > > I >> > > > > > > > think >> > > > > > > > > it >> > > > > > > > > > should be okay to set "-XX:MaxDirectMemorySize" to >> > > > > > > > > > "off_heap_managed_memory + direct_memory" even if we use >> > > > RocksDB. >> > > > > > > That >> > > > > > > > > is a >> > > > > > > > > > big if, though, I honestly have no idea :D Would be >> good to >> > > > > > > understand >> > > > > > > > > > this, though, because this would affect option (2) and >> > option >> > > > > > (1.2). >> > > > > > > > > > >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song < >> > > > > > [hidden email]> >> > > > > > > > > > wrote: >> > > > > > > > > > >> > > > > > > > > > > Thanks for the inputs, Jingsong. >> > > > > > > > > > > >> > > > > > > > > > > Let me try to summarize your points. Please correct >> me if >> > > I'm >> > > > > > > wrong. >> > > > > > > > > > > >> > > > > > > > > > > - Memory consumers should always avoid returning >> > memory >> > > > > > segments >> > > > > > > > to >> > > > > > > > > > > memory manager while there are still un-cleaned >> > > > structures / >> > > > > > > > threads >> > > > > > > > > > > that >> > > > > > > > > > > may use the memory. Otherwise, it would cause >> serious >> > > > > problems >> > > > > > > by >> > > > > > > > > > having >> > > > > > > > > > > multiple consumers trying to use the same memory >> > > segment. >> > > > > > > > > > > - JVM does not wait for GC when allocating direct >> > memory >> > > > > > buffer. >> > > > > > > > > > > Therefore even we set proper max direct memory size >> > > limit, >> > > > > we >> > > > > > > may >> > > > > > > > > > still >> > > > > > > > > > > encounter direct memory oom if the GC cleaning >> memory >> > > > slower >> > > > > > > than >> > > > > > > > > the >> > > > > > > > > > > direct memory allocation. >> > > > > > > > > > > >> > > > > > > > > > > Am I understanding this correctly? >> > > > > > > > > > > >> > > > > > > > > > > Thank you~ >> > > > > > > > > > > >> > > > > > > > > > > Xintong Song >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee < >> > > > > > > [hidden email] >> > > > > > > > > > > .invalid> >> > > > > > > > > > > wrote: >> > > > > > > > > > > >> > > > > > > > > > > > Hi stephan: >> > > > > > > > > > > > >> > > > > > > > > > > > About option 2: >> > > > > > > > > > > > >> > > > > > > > > > > > if additional threads not cleanly shut down before >> we >> > can >> > > > > exit >> > > > > > > the >> > > > > > > > > > task: >> > > > > > > > > > > > In the current case of memory reuse, it has freed up >> > the >> > > > > memory >> > > > > > > it >> > > > > > > > > > > > uses. If this memory is used by other tasks and >> > > > asynchronous >> > > > > > > > threads >> > > > > > > > > > > > of exited task may still be writing, there will be >> > > > > concurrent >> > > > > > > > > security >> > > > > > > > > > > > problems, and even lead to errors in user computing >> > > > results. >> > > > > > > > > > > > >> > > > > > > > > > > > So I think this is a serious and intolerable bug, No >> > > matter >> > > > > > what >> > > > > > > > the >> > > > > > > > > > > > option is, it should be avoided. 
>> > > > > > > > > > > > >> > > > > > > > > > > > About direct memory cleaned by GC: >> > > > > > > > > > > > I don't think it is a good idea, I've encountered so >> > many >> > > > > > > > situations >> > > > > > > > > > > > that it's too late for GC to cause DirectMemory >> OOM. >> > > > Release >> > > > > > and >> > > > > > > > > > > > allocate DirectMemory depend on the type of user >> job, >> > > > which >> > > > > is >> > > > > > > > > > > > often beyond our control. >> > > > > > > > > > > > >> > > > > > > > > > > > Best, >> > > > > > > > > > > > Jingsong Lee >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > >> > ------------------------------------------------------------------ >> > > > > > > > > > > > From:Stephan Ewen <[hidden email]> >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 >> > > > > > > > > > > > To:dev <[hidden email]> >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified Memory >> > > Configuration >> > > > > for >> > > > > > > > > > > > TaskExecutors >> > > > > > > > > > > > >> > > > > > > > > > > > My main concern with option 2 (manually release >> memory) >> > > is >> > > > > that >> > > > > > > > > > segfaults >> > > > > > > > > > > > in the JVM send off all sorts of alarms on user >> ends. >> > So >> > > we >> > > > > > need >> > > > > > > to >> > > > > > > > > > > > guarantee that this never happens. >> > > > > > > > > > > > >> > > > > > > > > > > > The trickyness is in tasks that uses data >> structures / >> > > > > > algorithms >> > > > > > > > > with >> > > > > > > > > > > > additional threads, like hash table spill/read and >> > > sorting >> > > > > > > threads. >> > > > > > > > > We >> > > > > > > > > > > need >> > > > > > > > > > > > to ensure that these cleanly shut down before we can >> > exit >> > > > the >> > > > > > > task. >> > > > > > > > > > > > I am not sure that we have that guaranteed already, >> > > that's >> > > > > why >> > > > > > > > option >> > > > > > > > > > 1.1 >> > > > > > > > > > > > seemed simpler to me. >> > > > > > > > > > > > >> > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM Xintong Song < >> > > > > > > > [hidden email]> >> > > > > > > > > > > > wrote: >> > > > > > > > > > > > >> > > > > > > > > > > > > Thanks for the comments, Stephan. Summarized in >> this >> > > way >> > > > > > really >> > > > > > > > > makes >> > > > > > > > > > > > > things easier to understand. >> > > > > > > > > > > > > >> > > > > > > > > > > > > I'm in favor of option 2, at least for the >> moment. I >> > > > think >> > > > > it >> > > > > > > is >> > > > > > > > > not >> > > > > > > > > > > that >> > > > > > > > > > > > > difficult to keep it segfault safe for memory >> > manager, >> > > as >> > > > > > long >> > > > > > > as >> > > > > > > > > we >> > > > > > > > > > > > always >> > > > > > > > > > > > > de-allocate the memory segment when it is released >> > from >> > > > the >> > > > > > > > memory >> > > > > > > > > > > > > consumers. Only if the memory consumer continue >> using >> > > the >> > > > > > > buffer >> > > > > > > > of >> > > > > > > > > > > > memory >> > > > > > > > > > > > > segment after releasing it, in which case we do >> want >> > > the >> > > > > job >> > > > > > to >> > > > > > > > > fail >> > > > > > > > > > so >> > > > > > > > > > > > we >> > > > > > > > > > > > > detect the memory leak early. >> > > > > > > > > > > > > >> > > > > > > > > > > > > For option 1.2, I don't think this is a good idea. 
For option 1.2, I don't think this is a good idea. Not only may the assumption (that regular GC is enough to clean direct buffers) not always hold, but it also makes it harder to find problems in cases of memory overuse. E.g., the user configures some direct memory for the user libraries. If a library actually uses more direct memory than configured, and that memory cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if we didn't touch the JVM default max direct memory limit, we cannot get a direct memory OOM, and it becomes super hard to understand which part of the configuration needs to be updated.

For option 1.1, it has a similar problem as 1.2 if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
- The "-XX:MaxDirectMemorySize" (option 1.1) is one way to do this.
- Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.
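As a rough illustration of what that bookkeeping might involve (a hypothetical helper, not an actual or proposed Flink class), the idea is to track how much GC-managed segment memory is logically outstanding and nudge the JVM with a GC once a budget is crossed, independent of "-XX:MaxDirectMemorySize":

    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical sketch of option 1.2's bookkeeping, not Flink code: the memory
    // manager tracks how many bytes of GC-deallocated segments are outstanding and
    // requests a GC when a budget is exceeded.
    final class SegmentBookkeeping {
        private final long budgetBytes;
        private final AtomicLong outstanding = new AtomicLong();

        SegmentBookkeeping(long budgetBytes) {
            this.budgetBytes = budgetBytes;
        }

        void onAllocate(long bytes) {
            if (outstanding.addAndGet(bytes) > budgetBytes) {
                // The caveat discussed above: System.gc() is only a hint and may
                // trigger an expensive full collection, so this needs care in practice.
                System.gc();
            }
        }

        void onRelease(long bytes) {
            // Logically released, but the native memory is only reclaimed once the
            // GC actually processes the now-unreachable buffer.
            outstanding.addAndGet(-bytes);
        }
    }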
My first guess is that the options will be easiest to do in the following order:

- Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

- Option 2. It is actually quite simple to implement, we could try how segfault safe we are at the moment.

- Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan
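For readers who want to see what option 1.1 would mean for the launch command, here is a minimal sketch of how a startup helper could derive the flag (the class name and example sizes are assumptions for illustration, not Flink's actual startup code); with a 512 MB off-heap managed pool and the 64 MB direct_memory default floated above, it prints "-XX:MaxDirectMemorySize=603979776" (576 MB):

    // Illustrative helper, not Flink's actual startup logic: under option 1.1 the
    // limit covers the off-heap managed pool plus the dedicated direct_memory pool,
    // so direct buffers can never silently eat the JVM-overhead budget.
    public final class MaxDirectMemoryFlag {

        static String forOption11(long offHeapManagedBytes, long directMemoryBytes) {
            return "-XX:MaxDirectMemorySize=" + (offHeapManagedBytes + directMemoryBytes);
        }

        public static void main(String[] args) {
            long offHeapManaged = 512L * 1024 * 1024; // assumed off-heap managed memory
            long directMemory = 64L * 1024 * 1024;    // the 64 MB default discussed above
            System.out.println(forOption11(offHeapManaged, directMemory));
        }
    }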
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.
Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
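Restating the example numbers as a tiny sketch (purely illustrative, with the sizes hard-coded from the scenario above, not Flink code) may help: under alternative 2 the flag tracks the configured 200 MB pool and has to grow to 250 MB at the expense of the other pools, while alternative 3 simply pins the flag to an effectively unbounded value.

    // Purely illustrative arithmetic for the scenario above.
    public final class AlternativeComparison {
        private static final long MB = 1024 * 1024;

        public static void main(String[] args) {
            long jvmDirect = 200 * MB;             // Task Off-Heap Memory + JVM Overhead
            long other = 800 * MB;                 // heap, metaspace, managed and network memory
            long totalProcess = jvmDirect + other; // the 1 GB container budget

            // Alternative 2: the limit equals the configured direct pool.
            System.out.println("alt 2: -XX:MaxDirectMemorySize=" + jvmDirect);

            // If actual direct usage grows past 200 MB, the user has to raise the pool,
            // which shrinks everything else (800 MB -> 750 MB in the example).
            long bumpedDirect = 250 * MB;
            System.out.println("alt 2 after bump: other pools = "
                    + (totalProcess - bumpedDirect) / MB + " MB");

            // Alternative 3: an effectively unbounded limit (1 TB); no direct OOM is
            // possible, but the container budget can be exceeded without a JVM-side error.
            System.out.println("alt 3: -XX:MaxDirectMemorySize=" + (1024L * 1024 * MB));
        }
    }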
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,

Yang

Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019 at 22:07:

Thanks for replying, Till.
About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of a risk of over-using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is a risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow-up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
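To give a feel for what such a reservation call could look like for a streaming consumer such as a RocksDB state backend (a hedged sketch only; the class and method names are hypothetical, not the FLIP's final API), reservation differs from segment allocation in that no MemorySegment changes hands, only budget accounting:

    // Hypothetical sketch of a reservation-style interaction with the memory
    // manager, not the FLIP's final API: the consumer reserves a budget it will
    // allocate natively itself, and gives it back on close.
    final class ReservingConsumer {

        interface ReservationView {
            // reserve the given number of bytes from the budget; expected to fail
            // (e.g. with an exception) if the budget is exhausted
            void reserveMemory(long bytes);

            void releaseMemory(long bytes);
        }

        private final ReservationView memoryManager;
        private final long reservedBytes;

        ReservingConsumer(ReservationView memoryManager, long reservedBytes) {
            this.memoryManager = memoryManager;
            this.reservedBytes = reservedBytes;
            memoryManager.reserveMemory(reservedBytes); // accounted, but allocated by the consumer
        }

        void close() {
            memoryManager.releaseMemory(reservedBytes); // return the budget on shutdown
        }
    }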
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if they run into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?
Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about a direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate().
However, there are also some down sides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another down side is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory.
That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing.
But we cannot rely only on the client-side checking, because for standalone clusters TaskManagers on different machines may have different configurations, and the client does not see them.
What do you think?

Thank you~

Xintong Song
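As a rough sketch of the kind of consistency check being discussed (hypothetical names and pool set, not the FLIP's actual validation logic), the same check would have to run again on each TaskManager for the standalone case mentioned above:

    // Hypothetical sketch of the client-side check discussed above, not the FLIP's
    // actual validation code: if the explicitly configured fine-grained pools do not
    // fit into the total process memory, fail fast with an error instead of letting
    // the deployment succeed and the task executor die later.
    final class MemoryConfigCheck {

        static void validate(long totalProcessBytes, long... configuredPoolBytes) {
            long sum = 0;
            for (long pool : configuredPoolBytes) {
                sum += pool;
            }
            if (sum > totalProcessBytes) {
                throw new IllegalArgumentException("Configured memory pools (" + sum
                        + " bytes) exceed the total process memory ("
                        + totalProcessBytes + " bytes); please adjust the configuration.");
            }
        }
    }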
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful for controlling the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory, right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019 at 10:14 PM:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted into individual memory reservations and pools.
- Simplify memory configuration options and calculation logics.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

[2]
https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
>> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > I would be in favour of option 2 because it >> > makes >> > > > > > things >> > > > > > > > > > > explicit. >> > > > > > > > > > > > If >> > > > > > > > > > > > > > we >> > > > > > > > > > > > > > > > don't limit the direct memory, I fear that >> we >> > > might >> > > > > end >> > > > > > > up >> > > > > > > > > in a >> > > > > > > > > > > > > similar >> > > > > > > > > > > > > > > > situation as we are currently in: The user >> > might >> > > > see >> > > > > > that >> > > > > > > > her >> > > > > > > > > > > > process >> > > > > > > > > > > > > > > gets >> > > > > > > > > > > > > > > > killed by the OS and does not know why this >> is >> > > the >> > > > > > case. >> > > > > > > > > > > > > Consequently, >> > > > > > > > > > > > > > > she >> > > > > > > > > > > > > > > > tries to decrease the process memory size >> > > (similar >> > > > to >> > > > > > > > > > increasing >> > > > > > > > > > > > the >> > > > > > > > > > > > > > > cutoff >> > > > > > > > > > > > > > > > ratio) in order to accommodate for the extra >> > > direct >> > > > > > > memory. >> > > > > > > > > > Even >> > > > > > > > > > > > > worse, >> > > > > > > > > > > > > > > she >> > > > > > > > > > > > > > > > tries to decrease memory budgets which are >> not >> > > > fully >> > > > > > used >> > > > > > > > and >> > > > > > > > > > > hence >> > > > > > > > > > > > > > won't >> > > > > > > > > > > > > > > > change the overall memory consumption. >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > Cheers, >> > > > > > > > > > > > > > > > Till >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM Xintong >> Song < >> > > > > > > > > > > > [hidden email] >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > wrote: >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Let me explain this with a concrete >> example >> > > Till. >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Let's say we have the following scenario. >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Total Process Memory: 1GB >> > > > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap Memory + >> JVM >> > > > > > > Overhead): >> > > > > > > > > > 200MB >> > > > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM >> Metaspace, >> > > > > > Off-Heap >> > > > > > > > > > Managed >> > > > > > > > > > > > > Memory >> > > > > > > > > > > > > > > and >> > > > > > > > > > > > > > > > > Network Memory): 800MB >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > For alternative 2, we set >> > > -XX:MaxDirectMemorySize >> > > > > to >> > > > > > > > 200MB. >> > > > > > > > > > > > > > > > > For alternative 3, we set >> > > -XX:MaxDirectMemorySize >> > > > > to >> > > > > > a >> > > > > > > > very >> > > > > > > > > > > large >> > > > > > > > > > > > > > > value, >> > > > > > > > > > > > > > > > > let's say 1TB. >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > If the actual direct memory usage of Task >> > > > Off-Heap >> > > > > > > Memory >> > > > > > > > > and >> > > > > > > > > > > JVM >> > > > > > > > > > > > > > > > Overhead >> > > > > > > > > > > > > > > > > do not exceed 200MB, then alternative 2 >> and >> > > > > > > alternative 3 >> > > > > > > > > > > should >> > > > > > > > > > > > > have >> > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > same utility. 
Setting larger >> > > > > -XX:MaxDirectMemorySize >> > > > > > > will >> > > > > > > > > not >> > > > > > > > > > > > > reduce >> > > > > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > sizes of the other memory pools. >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > If the actual direct memory usage of Task >> > > > Off-Heap >> > > > > > > Memory >> > > > > > > > > and >> > > > > > > > > > > JVM >> > > > > > > > > > > > > > > > > Overhead potentially exceed 200MB, then >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > - Alternative 2 suffers from frequent >> OOM. >> > > To >> > > > > > avoid >> > > > > > > > > that, >> > > > > > > > > > > the >> > > > > > > > > > > > > only >> > > > > > > > > > > > > > > > thing >> > > > > > > > > > > > > > > > > user can do is to modify the >> configuration >> > > and >> > > > > > > > increase >> > > > > > > > > > JVM >> > > > > > > > > > > > > Direct >> > > > > > > > > > > > > > > > > Memory >> > > > > > > > > > > > > > > > > (Task Off-Heap Memory + JVM Overhead). >> > Let's >> > > > say >> > > > > > > that >> > > > > > > > > user >> > > > > > > > > > > > > > increases >> > > > > > > > > > > > > > > > JVM >> > > > > > > > > > > > > > > > > Direct Memory to 250MB, this will >> reduce >> > the >> > > > > total >> > > > > > > > size >> > > > > > > > > of >> > > > > > > > > > > > other >> > > > > > > > > > > > > > > > memory >> > > > > > > > > > > > > > > > > pools to 750MB, given the total process >> > > memory >> > > > > > > remains >> > > > > > > > > > 1GB. >> > > > > > > > > > > > > > > > > - For alternative 3, there is no >> chance of >> > > > > direct >> > > > > > > OOM. >> > > > > > > > > > There >> > > > > > > > > > > > are >> > > > > > > > > > > > > > > > chances >> > > > > > > > > > > > > > > > > of exceeding the total process memory >> > limit, >> > > > but >> > > > > > > given >> > > > > > > > > > that >> > > > > > > > > > > > the >> > > > > > > > > > > > > > > > process >> > > > > > > > > > > > > > > > > may >> > > > > > > > > > > > > > > > > not use up all the reserved native >> memory >> > > > > > (Off-Heap >> > > > > > > > > > Managed >> > > > > > > > > > > > > > Memory, >> > > > > > > > > > > > > > > > > Network >> > > > > > > > > > > > > > > > > Memory, JVM Metaspace), if the actual >> > direct >> > > > > > memory >> > > > > > > > > usage >> > > > > > > > > > is >> > > > > > > > > > > > > > > slightly >> > > > > > > > > > > > > > > > > above >> > > > > > > > > > > > > > > > > yet very close to 200MB, user probably >> do >> > > not >> > > > > need >> > > > > > > to >> > > > > > > > > > change >> > > > > > > > > > > > the >> > > > > > > > > > > > > > > > > configurations. >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Therefore, I think from the user's >> > > perspective, a >> > > > > > > > feasible >> > > > > > > > > > > > > > > configuration >> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower >> resource >> > > > > > > utilization >> > > > > > > > > > > compared >> > > > > > > > > > > > > to >> > > > > > > > > > > > > > > > > alternative 3. 
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point about setting a very large max direct memory size applies when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 22:07, Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking memory overuse at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk of the overall container memory usage exceeding the budget.

Thank you~
Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP, Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern is that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring a very large value, if there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configuration values that the user explicitly specified, I think we should throw an error.

I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely on the client-side checking alone, because for standalone clusters the TaskManagers on different machines may have different configurations and the client does not see them.

What do you think?

Thank you~
Xintong Song
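A minimal sketch of what such a fail-fast check could look like (the class and the pool names are illustrative assumptions, not the actual FLIP-49 option names):

    // Hypothetical client-side sanity check: fail fast if the explicitly configured
    // fine-grained pools cannot fit into the configured total process memory.
    public final class MemoryConfigCheck {

        public static void checkConsistency(
                long totalProcessMemoryBytes,
                long frameworkHeapBytes,
                long taskHeapBytes,
                long taskOffHeapBytes,
                long networkBytes,
                long managedBytes,
                long metaspaceBytes,
                long jvmOverheadBytes) {

            long sum = frameworkHeapBytes + taskHeapBytes + taskOffHeapBytes
                    + networkBytes + managedBytes + metaspaceBytes + jvmOverheadBytes;

            if (sum > totalProcessMemoryBytes) {
                throw new IllegalArgumentException(
                        "Sum of configured memory pools (" + sum + " bytes) exceeds the configured "
                                + "total process memory (" + totalProcessMemoryBytes + " bytes).");
            }
        }
    }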
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we can set -XX:MaxDirectMemorySize properly. I prefer leaving it at a very large value.

- Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

On Wed, Aug 7, 2019 at 10:14 PM, Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned into individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~
Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

[2]
https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
Sorry for the late response.
- Regarding the `TaskExecutorSpecifics` naming, let's discuss the details in the PR.
- Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed before we have a general framework for overwriting configurations with ENV variables.
- Regarding memory reservation, I double checked with Yu and he will take care of it.

Thank you~
Xintong Song

On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:

What I forgot to add is that we could tackle specifying the configuration fully in an incremental way, and that the full specification should be the desired end state.

On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:

I think our goal should be that the configuration is fully specified when the process is started. By treating the internal calculation step as rather validating existing values and calculating missing ones, these two proposals shouldn't even conflict (given determinism).

Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.

One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of an env variable vs. a dynamic configuration value specified via -D?

Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.

Cheers,
Till

On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:

I see. Under the assumption of strict determinism that should work.

The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.

On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:

My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory, etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.

On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

When computing the values in the JVM process after it has started, how would you deal with values like max direct memory, Metaspace size, native memory reservation (reduced heap size), etc.? All the values that are parameters to the JVM process and need to be supplied at process startup?
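A rough sketch of what such a pre-start utility could emit, assuming the calculation is deterministic (the class name, the option keys and the command layout are assumptions for illustration, not actual Flink code):

    // Sketch only: a pre-start utility that turns the calculated memory budget into JVM flags
    // and -D dynamic properties, so the TaskManager process is started fully specified and the
    // same values never need to be derived differently inside the process.
    public final class TaskExecutorStartupCommand {

        public static String build(
                long heapBytes,
                long metaspaceBytes,
                long directBytes,
                long networkBytes,
                long managedBytes) {

            String jvmArgs =
                    "-Xms" + heapBytes
                    + " -Xmx" + heapBytes
                    + " -XX:MaxMetaspaceSize=" + metaspaceBytes
                    + " -XX:MaxDirectMemorySize=" + directBytes;

            // Dynamic properties reuse the existing -Dkey=value mechanism instead of rewriting
            // flink-conf.yaml; the keys below are placeholders, not final option names.
            String dynamicProperties =
                    " -Dtaskmanager.memory.network.size=" + networkBytes
                    + " -Dtaskmanager.memory.managed.size=" + managedBytes;

            return "java " + jvmArgs
                    + " org.apache.flink.runtime.taskexecutor.TaskManagerRunner"
                    + dynamicProperties;
        }

        public static void main(String[] args) {
            long mb = 1024 * 1024;
            System.out.println(build(512 * mb, 96 * mb, 200 * mb, 128 * mb, 300 * mb));
        }
    }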
On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification. I have some more comments:

- I would actually split the logic to compute the process memory requirements and storing the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.

- Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.

The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.

- Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on, if I'm not mistaken).

Cheers,
Till

On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

Just to add my 2 cents.

Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for every taskmanager. A common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.

Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users would not have to export FLINK_CONF_DIR to update a few config options.

Best,
Yang
On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

I just updated the FLIP document on the wiki [1], with the following changes.

- Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
- Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.
> >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > Thank you~ > >> > > > > > > > > > >> > > > > > > > > Xintong Song > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > [1] > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > >> > > > > > > > > > >> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen < > >> > [hidden email] > >> > > > > >> > > > > > wrote: > >> > > > > > > > > > >> > > > > > > > > > @Xintong: Concerning "wait for memory users before > task > >> > > dispose > >> > > > > and > >> > > > > > > > > memory > >> > > > > > > > > > release": I agree, that's how it should be. Let's try > it > >> > out. > >> > > > > > > > > > > >> > > > > > > > > > @Xintong @Jingsong: Concerning " JVM does not wait for > >> GC > >> > > when > >> > > > > > > > allocating > >> > > > > > > > > > direct memory buffer": There seems to be pretty > >> elaborate > >> > > logic > >> > > > > to > >> > > > > > > free > >> > > > > > > > > > buffers when allocating new ones. See > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > >> > > > > > > > > > > >> > > > > > > > > > @Till: Maybe. If we assume that the JVM default works > >> (like > >> > > > going > >> > > > > > > with > >> > > > > > > > > > option 2 and not setting "-XX:MaxDirectMemorySize" at > >> all), > >> > > > then > >> > > > > I > >> > > > > > > > think > >> > > > > > > > > it > >> > > > > > > > > > should be okay to set "-XX:MaxDirectMemorySize" to > >> > > > > > > > > > "off_heap_managed_memory + direct_memory" even if we > use > >> > > > RocksDB. > >> > > > > > > That > >> > > > > > > > > is a > >> > > > > > > > > > big if, though, I honestly have no idea :D Would be > >> good to > >> > > > > > > understand > >> > > > > > > > > > this, though, because this would affect option (2) and > >> > option > >> > > > > > (1.2). > >> > > > > > > > > > > >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song < > >> > > > > > [hidden email]> > >> > > > > > > > > > wrote: > >> > > > > > > > > > > >> > > > > > > > > > > Thanks for the inputs, Jingsong. > >> > > > > > > > > > > > >> > > > > > > > > > > Let me try to summarize your points. Please correct > >> me if > >> > > I'm > >> > > > > > > wrong. > >> > > > > > > > > > > > >> > > > > > > > > > > - Memory consumers should always avoid returning > >> > memory > >> > > > > > segments > >> > > > > > > > to > >> > > > > > > > > > > memory manager while there are still un-cleaned > >> > > > structures / > >> > > > > > > > threads > >> > > > > > > > > > > that > >> > > > > > > > > > > may use the memory. Otherwise, it would cause > >> serious > >> > > > > problems > >> > > > > > > by > >> > > > > > > > > > having > >> > > > > > > > > > > multiple consumers trying to use the same memory > >> > > segment. > >> > > > > > > > > > > - JVM does not wait for GC when allocating direct > >> > memory > >> > > > > > buffer. 
> >> > > > > > > > > > > Therefore even we set proper max direct memory > size > >> > > limit, > >> > > > > we > >> > > > > > > may > >> > > > > > > > > > still > >> > > > > > > > > > > encounter direct memory oom if the GC cleaning > >> memory > >> > > > slower > >> > > > > > > than > >> > > > > > > > > the > >> > > > > > > > > > > direct memory allocation. > >> > > > > > > > > > > > >> > > > > > > > > > > Am I understanding this correctly? > >> > > > > > > > > > > > >> > > > > > > > > > > Thank you~ > >> > > > > > > > > > > > >> > > > > > > > > > > Xintong Song > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee < > >> > > > > > > [hidden email] > >> > > > > > > > > > > .invalid> > >> > > > > > > > > > > wrote: > >> > > > > > > > > > > > >> > > > > > > > > > > > Hi stephan: > >> > > > > > > > > > > > > >> > > > > > > > > > > > About option 2: > >> > > > > > > > > > > > > >> > > > > > > > > > > > if additional threads not cleanly shut down before > >> we > >> > can > >> > > > > exit > >> > > > > > > the > >> > > > > > > > > > task: > >> > > > > > > > > > > > In the current case of memory reuse, it has freed > up > >> > the > >> > > > > memory > >> > > > > > > it > >> > > > > > > > > > > > uses. If this memory is used by other tasks and > >> > > > asynchronous > >> > > > > > > > threads > >> > > > > > > > > > > > of exited task may still be writing, there will > be > >> > > > > concurrent > >> > > > > > > > > security > >> > > > > > > > > > > > problems, and even lead to errors in user > computing > >> > > > results. > >> > > > > > > > > > > > > >> > > > > > > > > > > > So I think this is a serious and intolerable bug, > No > >> > > matter > >> > > > > > what > >> > > > > > > > the > >> > > > > > > > > > > > option is, it should be avoided. > >> > > > > > > > > > > > > >> > > > > > > > > > > > About direct memory cleaned by GC: > >> > > > > > > > > > > > I don't think it is a good idea, I've encountered > so > >> > many > >> > > > > > > > situations > >> > > > > > > > > > > > that it's too late for GC to cause DirectMemory > >> OOM. > >> > > > Release > >> > > > > > and > >> > > > > > > > > > > > allocate DirectMemory depend on the type of user > >> job, > >> > > > which > >> > > > > is > >> > > > > > > > > > > > often beyond our control. > >> > > > > > > > > > > > > >> > > > > > > > > > > > Best, > >> > > > > > > > > > > > Jingsong Lee > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > >> > ------------------------------------------------------------------ > >> > > > > > > > > > > > From:Stephan Ewen <[hidden email]> > >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 > >> > > > > > > > > > > > To:dev <[hidden email]> > >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified Memory > >> > > Configuration > >> > > > > for > >> > > > > > > > > > > > TaskExecutors > >> > > > > > > > > > > > > >> > > > > > > > > > > > My main concern with option 2 (manually release > >> memory) > >> > > is > >> > > > > that > >> > > > > > > > > > segfaults > >> > > > > > > > > > > > in the JVM send off all sorts of alarms on user > >> ends. > >> > So > >> > > we > >> > > > > > need > >> > > > > > > to > >> > > > > > > > > > > > guarantee that this never happens. 
> >> > > > > > > > > > > > > >> > > > > > > > > > > > The trickyness is in tasks that uses data > >> structures / > >> > > > > > algorithms > >> > > > > > > > > with > >> > > > > > > > > > > > additional threads, like hash table spill/read and > >> > > sorting > >> > > > > > > threads. > >> > > > > > > > > We > >> > > > > > > > > > > need > >> > > > > > > > > > > > to ensure that these cleanly shut down before we > can > >> > exit > >> > > > the > >> > > > > > > task. > >> > > > > > > > > > > > I am not sure that we have that guaranteed > already, > >> > > that's > >> > > > > why > >> > > > > > > > option > >> > > > > > > > > > 1.1 > >> > > > > > > > > > > > seemed simpler to me. > >> > > > > > > > > > > > > >> > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM Xintong Song < > >> > > > > > > > [hidden email]> > >> > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > >> > > > > > > > > > > > > Thanks for the comments, Stephan. Summarized in > >> this > >> > > way > >> > > > > > really > >> > > > > > > > > makes > >> > > > > > > > > > > > > things easier to understand. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > I'm in favor of option 2, at least for the > >> moment. I > >> > > > think > >> > > > > it > >> > > > > > > is > >> > > > > > > > > not > >> > > > > > > > > > > that > >> > > > > > > > > > > > > difficult to keep it segfault safe for memory > >> > manager, > >> > > as > >> > > > > > long > >> > > > > > > as > >> > > > > > > > > we > >> > > > > > > > > > > > always > >> > > > > > > > > > > > > de-allocate the memory segment when it is > released > >> > from > >> > > > the > >> > > > > > > > memory > >> > > > > > > > > > > > > consumers. Only if the memory consumer continue > >> using > >> > > the > >> > > > > > > buffer > >> > > > > > > > of > >> > > > > > > > > > > > memory > >> > > > > > > > > > > > > segment after releasing it, in which case we do > >> want > >> > > the > >> > > > > job > >> > > > > > to > >> > > > > > > > > fail > >> > > > > > > > > > so > >> > > > > > > > > > > > we > >> > > > > > > > > > > > > detect the memory leak early. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > For option 1.2, I don't think this is a good > idea. > >> > Not > >> > > > only > >> > > > > > > > because > >> > > > > > > > > > the > >> > > > > > > > > > > > > assumption (regular GC is enough to clean direct > >> > > buffers) > >> > > > > may > >> > > > > > > not > >> > > > > > > > > > > always > >> > > > > > > > > > > > be > >> > > > > > > > > > > > > true, but also it makes harder for finding > >> problems > >> > in > >> > > > > cases > >> > > > > > of > >> > > > > > > > > > memory > >> > > > > > > > > > > > > overuse. E.g., user configured some direct > memory > >> for > >> > > the > >> > > > > > user > >> > > > > > > > > > > libraries. > >> > > > > > > > > > > > > If the library actually use more direct memory > >> then > >> > > > > > configured, > >> > > > > > > > > which > >> > > > > > > > > > > > > cannot be cleaned by GC because they are still > in > >> > use, > >> > > > may > >> > > > > > lead > >> > > > > > > > to > >> > > > > > > > > > > > overuse > >> > > > > > > > > > > > > of the total container memory. 
In that case, if it doesn't hit the JVM default max direct memory limit, we cannot get a direct memory OOM, and it becomes super hard to understand which part of the configuration needs to be updated.

Option 1.1 has a similar problem as 1.2, if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2 only because we can tune the parameter.

Thank you~
Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
  - The "-XX:MaxDirectMemorySize" limit (option 1.1) is one way to do this.
  - Another way could be to have dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).
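A minimal sketch of what option (2) implies, assuming sun.misc.Unsafe is used for the manual allocation (illustrative only, not the actual MemorySegment code):

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Memory that is allocated and freed manually instead of being tracked by the GC.
    // free() must only be called from the task's cleanup phase, after all consumer threads
    // (e.g. spill/sort threads) have shut down; a use-after-free can crash the JVM (segfault).
    public class ManualSegment {

        private static final Unsafe UNSAFE = loadUnsafe();

        private final long address;
        private final long size;
        private volatile boolean freed;

        ManualSegment(long size) {
            this.size = size;
            // Native allocation; not counted against -XX:MaxDirectMemorySize.
            this.address = UNSAFE.allocateMemory(size);
        }

        long size() {
            return size;
        }

        void free() {
            if (!freed) {
                freed = true;
                UNSAFE.freeMemory(address);
            }
        }

        private static Unsafe loadUnsafe() {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                return (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new Error(e);
            }
        }
    }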
If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed the container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

  - Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

  - Option 2. It is actually quite simple to implement, we could try how segfault safe we are at the moment.

  - Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song
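As a rough sketch of how the JVM parameters could be derived under alternative 2, assuming (as described above) that off-heap managed memory and network memory go through Unsafe.allocateMemory() and therefore stay outside the direct memory limit. The helper and its parameters are hypothetical, not the FLIP's actual startup-script logic:

    // Hypothetical helper: compose TaskExecutor JVM flags under alternative 2.
    final class TaskExecutorJvmArgs {
        static String build(long heapBytes, long metaspaceBytes,
                            long taskOffHeapBytes, long jvmOverheadBytes) {
            // Only task off-heap memory and JVM overhead are capped by the
            // direct memory limit; managed and network memory are native.
            long maxDirect = taskOffHeapBytes + jvmOverheadBytes;
            return String.format(
                "-Xms%d -Xmx%d -XX:MaxMetaspaceSize=%d -XX:MaxDirectMemorySize=%d",
                heapBytes, heapBytes, metaspaceBytes, maxDirect);
        }
    }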
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

  - Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
  - For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
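The same scenario, written out as a small calculation; the numbers come from the example above, except the ~210 MB of actual direct usage and the class name, which are assumed for illustration.

    // Worked version of the 1 GB scenario above (using the thread's round numbers).
    public class AlternativeComparison {
        public static void main(String[] args) {
            long totalProcessMb = 1000;
            long jvmDirectMb = 200;   // Task Off-Heap Memory + JVM Overhead budget
            long otherMb = totalProcessMb - jvmDirectMb;          // 800 MB: heap, metaspace, managed, network

            // Suppose actual direct usage turns out to be ~210 MB (assumed figure).
            // Alternative 2: direct OOM at 200 MB; the user raises the budget to 250 MB,
            // and the other pools shrink although only ~210 MB is really needed.
            long raisedDirectMb = 250;
            long shrunkOtherMb = totalProcessMb - raisedDirectMb; // 750 MB

            // Alternative 3: no direct OOM; the extra ~10 MB is silently taken from
            // reserved-but-unused native memory, so no config change is needed, at the
            // risk of exceeding the container limit when that slack does not exist.
            System.out.println(otherMb + " MB -> " + shrunkOtherMb + " MB of non-direct memory");
        }
    }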
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

  - Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
  - Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 22:07, Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking memory overuse at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow-up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
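A hedged sketch of the two interaction styles being contrasted here: the existing page-allocation path for batch operators versus a pure reservation call for streaming jobs and state backends. The interface and method names are illustrative and do not claim to match the actual MemoryManager API.

    import java.util.List;

    // Illustrative only - not the actual Flink MemoryManager interface.
    interface MemoryManagerSketch {

        interface Segment {}   // stand-in for MemorySegment

        // Existing style: hand out concrete segments, used mainly by batch operators.
        List<Segment> allocatePages(Object owner, int numberOfPages);
        void releasePages(Object owner, List<Segment> pages);

        // Reservation style: only book-keep a budget; the consumer (e.g. a
        // RocksDB state backend) allocates the memory natively itself.
        void reserveMemory(Object owner, long bytes);
        void releaseMemory(Object owner, long bytes);
    }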
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested in why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.

  - One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
  - Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
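A small stand-alone demo of that interplay (the class name and flags are assumptions for illustration, not part of Flink): unreachable direct buffers are only reclaimed when some GC runs, and it is precisely hitting -XX:MaxDirectMemorySize that makes the JVM force a collection before giving up with a direct-buffer OOM.

    import java.nio.ByteBuffer;

    // Run e.g. with: java -Xmx64m -XX:MaxDirectMemorySize=64m DirectMemoryDemo
    public class DirectMemoryDemo {
        public static void main(String[] args) {
            for (int i = 0; i < 1_000; i++) {
                // Each buffer becomes unreachable right away. With a low direct
                // limit, allocation pressure forces cleanup of the old buffers;
                // with a huge limit and little heap activity, the native memory
                // simply piles up until the container limit is hit.
                ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
                buffer.putLong(0, i);
            }
            System.out.println("done");
        }
    }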
Maybe you can share your reasons for preferring to set a very large value, in case there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely on the client-side checking alone, because for a standalone cluster the TaskManagers on different machines may have different configurations, and the client does not see that.
What do you think?

Thank you~

Xintong Song
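A minimal sketch of such a consistency check, assuming the pool sizes have already been parsed into megabytes; the class name, parameters, and the idea of running it both on the client and in each TaskManager are illustrative, not the FLIP's actual validation code.

    // Illustrative validation: fail fast if explicitly configured pools cannot fit.
    final class MemoryConfigCheck {

        static void validate(long totalProcessMb, long heapMb, long managedMb,
                             long networkMb, long metaspaceMb, long jvmOverheadMb) {
            long sum = heapMb + managedMb + networkMb + metaspaceMb + jvmOverheadMb;
            if (sum > totalProcessMb) {
                // Thrown on the client for Yarn / K8s so the submission fails early;
                // a standalone TaskManager would run the same check against its
                // own local configuration at startup.
                throw new IllegalArgumentException(
                    "Configured memory pools sum to " + sum
                        + " MB, exceeding the total process memory of "
                        + totalProcessMb + " MB");
            }
        }
    }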
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

  - Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it at a very large value.

  - Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

On Wed, Aug 7, 2019 at 10:14 PM, Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration:

  - Different configuration for Streaming and Batch.
  - Complex and difficult configuration of RocksDB in Streaming.
  - Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows:

  - Extend the memory manager to also account for memory usage by state backends.
  - Modify how TaskExecutor memory is partitioned and accounted into individual memory reservations and pools.
  - Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.
Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
But after giving it a second > >> > > thought, > >> > > > I > >> > > > > > > think > >> > > > > > > > > even > >> > > > > > > > > > > for > >> > > > > > > > > > > > > > > alternative 3 using direct memory for > off-heap > >> > > > managed > >> > > > > > > memory > >> > > > > > > > > > could > >> > > > > > > > > > > > > cause > >> > > > > > > > > > > > > > > problems. > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Hi Yang, > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Regarding your concern, I think what > proposed > >> in > >> > > this > >> > > > > > FLIP > >> > > > > > > it > >> > > > > > > > > to > >> > > > > > > > > > > have > >> > > > > > > > > > > > > > both > >> > > > > > > > > > > > > > > off-heap managed memory and network memory > >> > > allocated > >> > > > > > > through > >> > > > > > > > > > > > > > > Unsafe.allocate(), which means they are > >> > practically > >> > > > > > native > >> > > > > > > > > memory > >> > > > > > > > > > > and > >> > > > > > > > > > > > > not > >> > > > > > > > > > > > > > > limited by JVM max direct memory. The only > >> parts > >> > of > >> > > > > > memory > >> > > > > > > > > > limited > >> > > > > > > > > > > by > >> > > > > > > > > > > > > JVM > >> > > > > > > > > > > > > > > max direct memory are task off-heap memory > and > >> > JVM > >> > > > > > > overhead, > >> > > > > > > > > > which > >> > > > > > > > > > > > are > >> > > > > > > > > > > > > > > exactly alternative 2 suggests to set the > JVM > >> max > >> > > > > direct > >> > > > > > > > memory > >> > > > > > > > > > to. > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Thank you~ > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Xintong Song > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till > Rohrmann > >> < > >> > > > > > > > > > > [hidden email]> > >> > > > > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I > >> > > understand > >> > > > > the > >> > > > > > > two > >> > > > > > > > > > > > > alternatives > >> > > > > > > > > > > > > > > > now. > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > I would be in favour of option 2 because > it > >> > makes > >> > > > > > things > >> > > > > > > > > > > explicit. > >> > > > > > > > > > > > If > >> > > > > > > > > > > > > > we > >> > > > > > > > > > > > > > > > don't limit the direct memory, I fear that > >> we > >> > > might > >> > > > > end > >> > > > > > > up > >> > > > > > > > > in a > >> > > > > > > > > > > > > similar > >> > > > > > > > > > > > > > > > situation as we are currently in: The user > >> > might > >> > > > see > >> > > > > > that > >> > > > > > > > her > >> > > > > > > > > > > > process > >> > > > > > > > > > > > > > > gets > >> > > > > > > > > > > > > > > > killed by the OS and does not know why > this > >> is > >> > > the > >> > > > > > case. > >> > > > > > > > > > > > > Consequently, > >> > > > > > > > > > > > > > > she > >> > > > > > > > > > > > > > > > tries to decrease the process memory size > >> > > (similar > >> > > > to > >> > > > > > > > > > increasing > >> > > > > > > > > > > > the > >> > > > > > > > > > > > > > > cutoff > >> > > > > > > > > > > > > > > > ratio) in order to accommodate for the > extra > >> > > direct > >> > > > > > > memory. 
> >> > > > > > > > > > Even > >> > > > > > > > > > > > > worse, > >> > > > > > > > > > > > > > > she > >> > > > > > > > > > > > > > > > tries to decrease memory budgets which are > >> not > >> > > > fully > >> > > > > > used > >> > > > > > > > and > >> > > > > > > > > > > hence > >> > > > > > > > > > > > > > won't > >> > > > > > > > > > > > > > > > change the overall memory consumption. > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > Cheers, > >> > > > > > > > > > > > > > > > Till > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM Xintong > >> Song < > >> > > > > > > > > > > > [hidden email] > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Let me explain this with a concrete > >> example > >> > > Till. > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Let's say we have the following > scenario. > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Total Process Memory: 1GB > >> > > > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap Memory > + > >> JVM > >> > > > > > > Overhead): > >> > > > > > > > > > 200MB > >> > > > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM > >> Metaspace, > >> > > > > > Off-Heap > >> > > > > > > > > > Managed > >> > > > > > > > > > > > > Memory > >> > > > > > > > > > > > > > > and > >> > > > > > > > > > > > > > > > > Network Memory): 800MB > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > For alternative 2, we set > >> > > -XX:MaxDirectMemorySize > >> > > > > to > >> > > > > > > > 200MB. > >> > > > > > > > > > > > > > > > > For alternative 3, we set > >> > > -XX:MaxDirectMemorySize > >> > > > > to > >> > > > > > a > >> > > > > > > > very > >> > > > > > > > > > > large > >> > > > > > > > > > > > > > > value, > >> > > > > > > > > > > > > > > > > let's say 1TB. > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > If the actual direct memory usage of > Task > >> > > > Off-Heap > >> > > > > > > Memory > >> > > > > > > > > and > >> > > > > > > > > > > JVM > >> > > > > > > > > > > > > > > > Overhead > >> > > > > > > > > > > > > > > > > do not exceed 200MB, then alternative 2 > >> and > >> > > > > > > alternative 3 > >> > > > > > > > > > > should > >> > > > > > > > > > > > > have > >> > > > > > > > > > > > > > > the > >> > > > > > > > > > > > > > > > > same utility. Setting larger > >> > > > > -XX:MaxDirectMemorySize > >> > > > > > > will > >> > > > > > > > > not > >> > > > > > > > > > > > > reduce > >> > > > > > > > > > > > > > > the > >> > > > > > > > > > > > > > > > > sizes of the other memory pools. > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > If the actual direct memory usage of > Task > >> > > > Off-Heap > >> > > > > > > Memory > >> > > > > > > > > and > >> > > > > > > > > > > JVM > >> > > > > > > > > > > > > > > > > Overhead potentially exceed 200MB, then > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > - Alternative 2 suffers from frequent > >> OOM. 
> >> > > To > >> > > > > > avoid > >> > > > > > > > > that, > >> > > > > > > > > > > the > >> > > > > > > > > > > > > only > >> > > > > > > > > > > > > > > > thing > >> > > > > > > > > > > > > > > > > user can do is to modify the > >> configuration > >> > > and > >> > > > > > > > increase > >> > > > > > > > > > JVM > >> > > > > > > > > > > > > Direct > >> > > > > > > > > > > > > > > > > Memory > >> > > > > > > > > > > > > > > > > (Task Off-Heap Memory + JVM > Overhead). > >> > Let's > >> > > > say > >> > > > > > > that > >> > > > > > > > > user > >> > > > > > > > > > > > > > increases > >> > > > > > > > > > > > > > > > JVM > >> > > > > > > > > > > > > > > > > Direct Memory to 250MB, this will > >> reduce > >> > the > >> > > > > total > >> > > > > > > > size > >> > > > > > > > > of > >> > > > > > > > > > > > other > >> > > > > > > > > > > > > > > > memory > >> > > > > > > > > > > > > > > > > pools to 750MB, given the total > process > >> > > memory > >> > > > > > > remains > >> > > > > > > > > > 1GB. > >> > > > > > > > > > > > > > > > > - For alternative 3, there is no > >> chance of > >> > > > > direct > >> > > > > > > OOM. > >> > > > > > > > > > There > >> > > > > > > > > > > > are > >> > > > > > > > > > > > > > > > chances > >> > > > > > > > > > > > > > > > > of exceeding the total process memory > >> > limit, > >> > > > but > >> > > > > > > given > >> > > > > > > > > > that > >> > > > > > > > > > > > the > >> > > > > > > > > > > > > > > > process > >> > > > > > > > > > > > > > > > > may > >> > > > > > > > > > > > > > > > > not use up all the reserved native > >> memory > >> > > > > > (Off-Heap > >> > > > > > > > > > Managed > >> > > > > > > > > > > > > > Memory, > >> > > > > > > > > > > > > > > > > Network > >> > > > > > > > > > > > > > > > > Memory, JVM Metaspace), if the actual > >> > direct > >> > > > > > memory > >> > > > > > > > > usage > >> > > > > > > > > > is > >> > > > > > > > > > > > > > > slightly > >> > > > > > > > > > > > > > > > > above > >> > > > > > > > > > > > > > > > > yet very close to 200MB, user > probably > >> do > >> > > not > >> > > > > need > >> > > > > > > to > >> > > > > > > > > > change > >> > > > > > > > > > > > the > >> > > > > > > > > > > > > > > > > configurations. > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Therefore, I think from the user's > >> > > perspective, a > >> > > > > > > > feasible > >> > > > > > > > > > > > > > > configuration > >> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower > >> resource > >> > > > > > > utilization > >> > > > > > > > > > > compared > >> > > > > > > > > > > > > to > >> > > > > > > > > > > > > > > > > alternative 3. > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Thank you~ > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > Xintong Song > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till > >> > Rohrmann > >> > > < > >> > > > > > > > > > > > > [hidden email] > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > I guess you have to help me understand > >> the > >> > > > > > difference > >> > > > > > > > > > between > >> > > > > > > > > > > > > > > > > alternative 2 > >> > > > > > > > > > > > > > > > > > and 3 wrt to memory under utilization > >> > > Xintong. 
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.
> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations on the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019 at 22:07:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking memory overuse at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP, Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow-up?
Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend.
If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides.
E.g., we do not need to worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring to set a very large value, if there is anything else I overlooked.
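As a rough illustration of the second point (a hypothetical sketch, not Flink code): the native memory behind a direct ByteBuffer is only reclaimed once the buffer object itself is garbage-collected, so with a very high -XX:MaxDirectMemorySize the direct allocations themselves never force a GC and the release depends entirely on heap activity.

import java.nio.ByteBuffer;

// Hypothetical illustration only. With -XX:MaxDirectMemorySize set very high,
// nothing here forces a GC, so the 64 MB native allocations pile up until a
// heap GC happens to collect the unreachable buffer objects.
public class DirectMemoryReleaseExample {
    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            ByteBuffer buffer = ByteBuffer.allocateDirect(64 * 1024 * 1024);
            buffer.put(0, (byte) 1);
            // The buffer becomes unreachable at the end of the iteration, but its
            // native memory is only freed when the GC actually collects it.
        }
    }
}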
*Memory Calculation*
If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing.
But we cannot rely only on the client-side checking, because for a standalone cluster the TaskManagers on different machines may have different configurations and the client does not see them.
What do you think?

Thank you~

Xintong Song
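To make the kind of check discussed here concrete, a minimal sketch of such a validation could look like the following; the class, method and pool names are hypothetical and not actual Flink code. The same check could run both on the client and when the TaskExecutor starts.

// Hypothetical sketch of the memory configuration check discussed above; not Flink code.
public class MemoryConfigCheck {

    static void validate(long totalProcessMb, long networkMb, long managedMb,
                         long taskHeapMb, long taskOffHeapMb, long metaspaceMb) {
        long sumOfPools = networkMb + managedMb + taskHeapMb + taskOffHeapMb + metaspaceMb;
        if (sumOfPools > totalProcessMb) {
            // Conflicting explicit settings: fail fast instead of starting a TaskExecutor
            // that would only be killed by Yarn / K8s later.
            throw new IllegalArgumentException(
                "Configured memory pools (" + sumOfPools + " MB) exceed the total process memory ("
                    + totalProcessMb + " MB).");
        }
    }

    public static void main(String[] args) {
        validate(1024, 128, 300, 400, 64, 96);   // passes: 988 MB <= 1024 MB
        validate(1024, 512, 512, 512, 64, 96);   // throws: pools sum to 1696 MB > 1024 MB
    }
}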
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory, right? So I don't think we could set -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019 at 22:14:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configurations.
The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted into individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1]. (Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
Yes, I'll address the memory reservation functionality in a separate FLIP to cooperate with FLIP-49 (sorry for being late to the discussion).

Best Regards,
Yu

On Mon, 2 Sep 2019 at 11:14, Xintong Song <[hidden email]> wrote:

Sorry for the late response.

- Regarding the `TaskExecutorSpecifics` naming, let's discuss the details in the PR.
- Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed before we have a general framework for overwriting configurations with ENV variables.
- Regarding memory reservation, I double-checked with Yu and he will take care of it.

Thank you~

Xintong Song

On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:

What I forgot to add is that we could tackle specifying the configuration fully in an incremental way, and that the full specification should be the desired end state.

On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:

I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step as rather validating existing values and calculating missing ones, these two proposals shouldn't even conflict (given determinism).

Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.

One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator, which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of an env variable vs. a dynamic configuration value specified via -D?

Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.

Cheers,
Till

On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:

I see. Under the assumption of strict determinism that should work.

The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.

On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:

My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory, etc.). Assuming that these values don't differ from the values with which the JVM was started, it should be possible to recompute them in the Flink process in order to set the values.
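A minimal sketch of what such a utility could look like is below; the class name, fields and derivation rules are made up for illustration and are not the actual FLIP design. The point is that the same pure function is called by the startup helper to produce the JVM arguments and again inside the started process to fill in the configuration, which only works if the derivation is deterministic.

// Hypothetical sketch of a deterministic memory derivation used both outside and
// inside the TaskExecutor process; names and rules are made up for illustration.
public final class TaskExecutorMemorySketch {

    final long heapMb;
    final long directMb;
    final long metaspaceMb;

    private TaskExecutorMemorySketch(long heapMb, long directMb, long metaspaceMb) {
        this.heapMb = heapMb;
        this.directMb = directMb;
        this.metaspaceMb = metaspaceMb;
    }

    // Pure function of the configured sizes: same inputs, same result, no matter
    // whether it runs in the startup script helper or in the started JVM.
    static TaskExecutorMemorySketch derive(long processMb, long networkMb, long managedOffHeapMb,
                                           long taskOffHeapMb, long metaspaceMb, long overheadMb) {
        long directMb = networkMb + taskOffHeapMb;
        long heapMb = processMb - directMb - managedOffHeapMb - metaspaceMb - overheadMb;
        return new TaskExecutorMemorySketch(heapMb, directMb, metaspaceMb);
    }

    String toJvmArgs() {
        return "-Xms" + heapMb + "m -Xmx" + heapMb + "m"
            + " -XX:MaxDirectMemorySize=" + directMb + "m"
            + " -XX:MaxMetaspaceSize=" + metaspaceMb + "m";
    }

    public static void main(String[] args) {
        // Outside the process: compute the JVM arguments for starting the TaskExecutor.
        System.out.println(derive(1024, 128, 300, 64, 96, 128).toJvmArgs());
        // Inside the process: the same call reproduces the values to populate the Configuration.
    }
}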
On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

When computing the values in the JVM process after it has started, how would you deal with values like max direct memory, Metaspace size, native memory reservation (reduced heap size), etc.? All the values that are parameters to the JVM process and that need to be supplied at process startup?

On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification. I have some more comments:

- I would actually split the logic to compute the process memory requirements and the storing of the values into two things. E.g., one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.

- Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily. The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.

- Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on, if I'm not mistaken).

Cheers,
Till
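To make the ENV-variable idea discussed here a bit more tangible, below is a hypothetical sketch of normalizing prefixed environment variables into configuration keys when the configuration is loaded. It is not how GlobalConfiguration actually behaves, and the FLINK_ prefix with underscore-to-dot mapping is only one possible answer to the format question raised above; the dot-separator and precedence questions (e.g. relative to values passed via -Dkey=value) would still need to be settled.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of overriding config options with ENV variables; not actual
// Flink behavior, and the naming scheme is just one of the options under discussion.
public class EnvOverrideSketch {

    static Map<String, String> overridesFromEnv(Map<String, String> env) {
        Map<String, String> overrides = new HashMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            String name = e.getKey();
            if (name.startsWith("FLINK_")) {
                // e.g. FLINK_TASKMANAGER_SOME_OPTION -> taskmanager.some.option
                String key = name.substring("FLINK_".length()).toLowerCase().replace('_', '.');
                overrides.put(key, e.getValue());
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        System.out.println(overridesFromEnv(System.getenv()));
    }
}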
On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

Just to add my 2 cents.

Using environment variables to override the configuration for different TaskManagers is better. We do not need to generate a dedicated flink-conf.yaml for all TaskManagers; a common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a TaskManager faster.

Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this; different users would not have to export FLINK_CONF_DIR just to update a few config options.

Best,
Yang

On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

I just updated the FLIP document on wiki [1], with the following changes.
- Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
- Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

@Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.

@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643

@Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB. That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).

On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:

Thanks for the inputs, Jingsong.

Let me try to summarize your points. Please correct me if I'm wrong.
- Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
- The JVM does not wait for GC when allocating a direct memory buffer. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory more slowly than the direct memory is allocated.

Am I understanding this correctly?

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email]> wrote:

Hi Stephan:

About option 2:

If additional threads are not cleanly shut down before we exit the task: in the current case of memory reuse, the task has freed up the memory it uses. If this memory is used by other tasks while asynchronous threads of the exited task may still be writing to it, there will be concurrency safety problems, and this can even lead to errors in user computing results. So I think this is a serious and intolerable bug; no matter what the option is, it should be avoided.

About direct memory cleaned by GC: I don't think it is a good idea. I've encountered so many situations where GC was too late and caused a DirectMemory OOM. Releasing and allocating DirectMemory depends on the type of user job, which is often beyond our control.
Best,
Jingsong Lee

From: Stephan Ewen <[hidden email]>
Send Time: Monday, August 19, 2019, 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

My main concern with option 2 (manually release memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.

The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already; that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarizing it this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep it segfault safe for the memory manager, as long as we always de-allocate the memory segment when it is released by the memory consumers. Only if the memory consumer continues using the buffer of the memory segment after releasing it would there be a problem, in which case we do want the job to fail so we detect the memory leak early.

For option 1.2, I don't think this is a good idea.
Not only because the assumption (regular GC is enough to clean up direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If a library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if it didn't hit the JVM default max direct memory limit, we cannot get a direct memory OOM and it becomes super hard to understand which part of the configuration needs to be updated.

For option 1.1, it has a similar problem as 1.2, if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~

Xintong Song


On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe.
But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
   - The "-XX:MaxDirectMemorySize" (option 1.1) is one way to do this.
   - Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed the container memory.
If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.


My first guess is that the options will be easiest to do in the following order:

   - Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

   - Option 2. It is actually quite simple to implement; we could try how segfault safe we are at the moment.

   - Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan


On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2.
I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests to set the JVM max direct memory to.

Thank you~

Xintong Song


On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit.
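(Aside on the Unsafe.allocate() approach Xintong describes above: a minimal sketch, not the FLIP's implementation, of allocating and freeing native memory through sun.misc.Unsafe. Unlike ByteBuffer.allocateDirect(), such allocations are not accounted against -XX:MaxDirectMemorySize, which is why they must be budgeted and released explicitly by the framework.)

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Illustrative sketch only: native memory obtained from Unsafe is invisible
    // to the JVM's direct memory limit and must be freed manually.
    public class UnsafeAllocationExample {

        private static Unsafe getUnsafe() throws Exception {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        }

        public static void main(String[] args) throws Exception {
            Unsafe unsafe = getUnsafe();
            long size = 32L * 1024 * 1024;               // 32 MB of native memory
            long address = unsafe.allocateMemory(size);
            try {
                unsafe.setMemory(address, size, (byte) 0); // touch the memory
            } finally {
                unsafe.freeMemory(address);                // must be released manually
            }
        }
    }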
If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till


On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB


For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
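(The arithmetic of this scenario can be restated in a few lines. The class and variable names below are illustrative only and are not Flink configuration options; 1 GB is treated as 1000 MB to keep the example's round numbers.)

    // Sketch restating the scenario above; names are illustrative only.
    public class MaxDirectMemoryExample {

        public static void main(String[] args) {
            final long totalProcessMemoryMb = 1000; // "1 GB" total process memory
            final long jvmDirectMemoryMb    = 200;  // task off-heap memory + JVM overhead
            final long otherMemoryMb = totalProcessMemoryMb - jvmDirectMemoryMb; // 800 MB

            // Alternative 2: the JVM limit equals the budgeted direct memory, so any
            // overuse surfaces early as a direct memory OOM inside the JVM.
            String alternative2 = "-XX:MaxDirectMemorySize=" + jvmDirectMemoryMb + "m";

            // Alternative 3: the JVM limit is set far above anything realistic, so
            // overuse can only surface as the container exceeding its total budget.
            String alternative3 = "-XX:MaxDirectMemorySize=1024g";

            System.out.println(alternative2 + " | " + alternative3
                    + " | other memory pools: " + otherMemoryMb + "m");
        }
    }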
If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

   - Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given the total process memory remains 1GB.
   - For alternative 3, there is no chance of direct OOM.
     There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song


On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

   - Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
   - Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2.
     This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till


On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.
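(A client-side check of this kind could be as simple as the following sketch. The helper and option names are hypothetical and only illustrate failing fast before the cluster is deployed; they are not the FLIP's actual validation code.)

    import java.util.Arrays;
    import java.util.List;

    // Hypothetical sketch of a client-side sanity check: fail fast at submission
    // time if the explicitly configured fine-grained pools cannot fit into the
    // configured total process memory.
    public class MemoryConfigCheck {

        static void checkTotalProcessMemory(
                long totalProcessMemoryMb, List<Long> fineGrainedPoolsMb) {
            long sum = fineGrainedPoolsMb.stream().mapToLong(Long::longValue).sum();
            if (sum > totalProcessMemoryMb) {
                throw new IllegalArgumentException(
                        "Configured memory pools (" + sum + " MB) exceed the total process"
                                + " memory (" + totalProcessMemoryMb + " MB)."
                                + " Please adjust the configuration.");
            }
        }

        public static void main(String[] args) {
            // Example: heap + managed + network + task off-heap + JVM overhead, in MB.
            checkTotalProcessMemory(1000, Arrays.asList(400L, 300L, 100L, 100L, 100L));
        }
    }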
Best,
Yang


Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019 at 22:07:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use it.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of a risk of over-using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM.
For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song


On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow up?
Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend.
If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till


On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides.
E.g., we do not need to worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.

   - One thing I can think of is that if a task executor container is killed due to over-using memory, it could be hard for us to know which part of the memory is overused.
   - Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring setting a very large value, in case there is anything else I overlooked.
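(On the first point: one way to at least observe how much of the JVM-tracked direct memory budget is in use is the JVM's buffer pool statistics, as in the sketch below. Note this does not cover memory obtained via Unsafe.allocateMemory(), which is part of why attributing container-level overuse is hard.)

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;
    import java.util.List;

    // Minimal sketch: print the JVM's own accounting of the "direct" and "mapped"
    // buffer pools. Memory allocated via Unsafe does not show up here.
    public class DirectMemoryReport {
        public static void main(String[] args) {
            List<BufferPoolMXBean> pools =
                    ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
            for (BufferPoolMXBean pool : pools) {
                System.out.printf("%s: count=%d used=%d bytes capacity=%d bytes%n",
                        pool.getName(), pool.getCount(),
                        pool.getMemoryUsed(), pool.getTotalCapacity());
            }
        }
    }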
*Memory Calculation*
If there is any conflict between multiple configuration values that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing.
But we cannot rely only on the client-side checking, because for standalone clusters TaskManagers on different machines may have different configurations, and the client does not see that.
What do you think?

Thank you~

Xintong Song


On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage.
I just have a few questions about it.

   - Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

   - Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?
Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019 at 22:14:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration:

   - Different configuration for Streaming and Batch.
   - Complex and difficult configuration of RocksDB in Streaming.
   - Complicated, uncertain and hard to understand.


Key changes to solve the problems can be summarized as follows.
   - Extend the memory manager to also account for memory usage by state backends.
   - Modify how TaskExecutor memory is partitioned and accounted into individual memory reservations and pools.
   - Simplify memory configuration options and calculation logic.


Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)


Looking forward to your feedback.
Thank you~

Xintong Song


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
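(For the first of the key changes listed above, the reservation-style interaction with the MemoryManager mentioned earlier in this thread, where a consumer such as a state backend reserves part of the managed memory budget without being handed MemorySegments, could look roughly like the following. The interface and method names are hypothetical, not the FLIP's actual API.)

    // Hypothetical sketch only: the actual MemoryManager changes are defined by
    // the FLIP and its implementation, not by this snippet.
    public interface ReservationStyleMemoryManager {

        /**
         * Reserves {@code sizeBytes} of managed memory for the given consumer
         * (e.g. a state backend). No MemorySegments are handed out; the consumer
         * allocates within the reserved budget and must release it afterwards.
         * Implementations would fail the reservation if the remaining managed
         * memory budget is insufficient.
         */
        void reserveMemory(Object consumerKey, long sizeBytes);

        /** Returns a previously reserved budget to the managed memory pool. */
        void releaseMemory(Object consumerKey, long sizeBytes);
    }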
I was > > >> > wondering > > >> > > > > > whether > > >> > > > > > > > we > > >> > > > > > > > > > can > > >> > > > > > > > > > > > > avoid > > >> > > > > > > > > > > > > > > using Unsafe.allocate() for off-heap > managed > > >> > memory > > >> > > > and > > >> > > > > > > > network > > >> > > > > > > > > > > > memory > > >> > > > > > > > > > > > > > with > > >> > > > > > > > > > > > > > > alternative 3. But after giving it a > second > > >> > > thought, > > >> > > > I > > >> > > > > > > think > > >> > > > > > > > > even > > >> > > > > > > > > > > for > > >> > > > > > > > > > > > > > > alternative 3 using direct memory for > > off-heap > > >> > > > managed > > >> > > > > > > memory > > >> > > > > > > > > > could > > >> > > > > > > > > > > > > cause > > >> > > > > > > > > > > > > > > problems. > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Hi Yang, > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Regarding your concern, I think what > > proposed > > >> in > > >> > > this > > >> > > > > > FLIP > > >> > > > > > > it > > >> > > > > > > > > to > > >> > > > > > > > > > > have > > >> > > > > > > > > > > > > > both > > >> > > > > > > > > > > > > > > off-heap managed memory and network memory > > >> > > allocated > > >> > > > > > > through > > >> > > > > > > > > > > > > > > Unsafe.allocate(), which means they are > > >> > practically > > >> > > > > > native > > >> > > > > > > > > memory > > >> > > > > > > > > > > and > > >> > > > > > > > > > > > > not > > >> > > > > > > > > > > > > > > limited by JVM max direct memory. The only > > >> parts > > >> > of > > >> > > > > > memory > > >> > > > > > > > > > limited > > >> > > > > > > > > > > by > > >> > > > > > > > > > > > > JVM > > >> > > > > > > > > > > > > > > max direct memory are task off-heap memory > > and > > >> > JVM > > >> > > > > > > overhead, > > >> > > > > > > > > > which > > >> > > > > > > > > > > > are > > >> > > > > > > > > > > > > > > exactly alternative 2 suggests to set the > > JVM > > >> max > > >> > > > > direct > > >> > > > > > > > memory > > >> > > > > > > > > > to. > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Thank you~ > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Xintong Song > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till > > Rohrmann > > >> < > > >> > > > > > > > > > > [hidden email]> > > >> > > > > > > > > > > > > > > wrote: > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I > > >> > > understand > > >> > > > > the > > >> > > > > > > two > > >> > > > > > > > > > > > > alternatives > > >> > > > > > > > > > > > > > > > now. > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > I would be in favour of option 2 because > > it > > >> > makes > > >> > > > > > things > > >> > > > > > > > > > > explicit. 
> On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:
>
> Thanks for the clarification Xintong. I understand the two alternatives now.
>
> I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.
>
> Cheers,
> Till
>
> On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:
>
> Let me explain this with a concrete example, Till. Let's say we have the following scenario.
>
> Total Process Memory: 1GB
> JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
> Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB
>
> For alternative 2, we set -XX:MaxDirectMemorySize to 200MB. For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternatives 2 and 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then:
>
> - Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
> - For alternative 3, there is no chance of direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.
>
> Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
>
> Thank you~
>
> Xintong Song
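As a purely illustrative aid for the budget arithmetic in the quoted example above (the numbers are taken from the message; the class and variable names are made up and this is not Flink code):

```java
/**
 * Illustrative arithmetic only: the 1 GB example from the quoted message,
 * comparing the two ways of setting -XX:MaxDirectMemorySize.
 */
public class DirectMemoryBudgetExample {

    private static final long MB = 1L << 20;

    public static void main(String[] args) {
        long totalProcessMemory = 1024 * MB;
        long jvmDirectMemory = 200 * MB;   // task off-heap + JVM overhead
        long otherPools = totalProcessMemory - jvmDirectMemory; // 800 MB: heap, metaspace, managed, network

        // Alternative 2: cap direct memory exactly at the configured budget.
        String alt2Flag = "-XX:MaxDirectMemorySize=" + jvmDirectMemory;
        // Alternative 3: effectively uncapped direct memory (about 1 TB here);
        // the other pools keep their 800 MB either way.
        String alt3Flag = "-XX:MaxDirectMemorySize=" + (1L << 40);

        System.out.println(alt2Flag + " vs " + alt3Flag + ", other pools: " + otherPools / MB + " MB");

        // Bumping the direct budget to 250 MB under alternative 2 shrinks the
        // other pools to 750 MB, because the process total stays at 1 GB.
        long bumpedDirect = 250 * MB;
        System.out.println("after bump: " + (totalProcessMemory - bumpedDirect) / MB + " MB left for other pools");
    }
}
```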
This would of course > > >> reduce > > >> > > the > > >> > > > > > sizes > > >> > > > > > > of > > >> > > > > > > > > the > > >> > > > > > > > > > > > other > > >> > > > > > > > > > > > > > > > memory > > >> > > > > > > > > > > > > > > > > > types. > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > How would alternative 2 now result > in > > an > > >> > > under > > >> > > > > > > > > utilization > > >> > > > > > > > > > of > > >> > > > > > > > > > > > > > memory > > >> > > > > > > > > > > > > > > > > > compared to alternative 3? If > > >> alternative 3 > > >> > > > > > strictly > > >> > > > > > > > > sets a > > >> > > > > > > > > > > > > higher > > >> > > > > > > > > > > > > > > max > > >> > > > > > > > > > > > > > > > > > direct memory size and we use only > > >> little, > > >> > > > then I > > >> > > > > > > would > > >> > > > > > > > > > > expect > > >> > > > > > > > > > > > > that > > >> > > > > > > > > > > > > > > > > > alternative 3 results in memory > under > > >> > > > > utilization. > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > Cheers, > > >> > > > > > > > > > > > > > > > > > Till > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM Yang > > >> Wang < > > >> > > > > > > > > > > > [hidden email] > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > wrote: > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > Hi xintong,till > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > Native and Direct Memory > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > My point is setting a very large > max > > >> > direct > > >> > > > > > memory > > >> > > > > > > > size > > >> > > > > > > > > > > when > > >> > > > > > > > > > > > we > > >> > > > > > > > > > > > > > do > > >> > > > > > > > > > > > > > > > not > > >> > > > > > > > > > > > > > > > > > > differentiate direct and native > > >> memory. > > >> > If > > >> > > > the > > >> > > > > > > direct > > >> > > > > > > > > > > > > > > > memory,including > > >> > > > > > > > > > > > > > > > > > user > > >> > > > > > > > > > > > > > > > > > > direct memory and framework direct > > >> > > > memory,could > > >> > > > > > be > > >> > > > > > > > > > > calculated > > >> > > > > > > > > > > > > > > > > > > correctly,then > > >> > > > > > > > > > > > > > > > > > > i am in favor of setting direct > > memory > > >> > with > > >> > > > > fixed > > >> > > > > > > > > value. > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > Memory Calculation > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > I agree with xintong. For Yarn and > > >> k8s,we > > >> > > > need > > >> > > > > to > > >> > > > > > > > check > > >> > > > > > > > > > the > > >> > > > > > > > > > > > > > memory > > >> > > > > > > > > > > > > > > > > > > configurations in client to avoid > > >> > > submitting > > >> > > > > > > > > successfully > > >> > > > > > > > > > > and > > >> > > > > > > > > > > > > > > failing > > >> > > > > > > > > > > > > > > > > in > > >> > > > > > > > > > > > > > > > > > > the flink master. 
> > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > Best, > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > Yang > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > Xintong Song < > [hidden email] > > >> > > > > >于2019年8月13日 > > >> > > > > > > > > > 周二22:07写道: > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > About MemorySegment, I think you > > are > > >> > > right > > >> > > > > that > > >> > > > > > > we > > >> > > > > > > > > > should > > >> > > > > > > > > > > > not > > >> > > > > > > > > > > > > > > > include > > >> > > > > > > > > > > > > > > > > > > this > > >> > > > > > > > > > > > > > > > > > > > issue in the scope of this FLIP. > > >> This > > >> > > FLIP > > >> > > > > > should > > >> > > > > > > > > > > > concentrate > > >> > > > > > > > > > > > > > on > > >> > > > > > > > > > > > > > > > how > > >> > > > > > > > > > > > > > > > > to > > >> > > > > > > > > > > > > > > > > > > > configure memory pools for > > >> > TaskExecutors, > > >> > > > > with > > >> > > > > > > > > minimum > > >> > > > > > > > > > > > > > > involvement > > >> > > > > > > > > > > > > > > > on > > >> > > > > > > > > > > > > > > > > > how > > >> > > > > > > > > > > > > > > > > > > > memory consumers use it. > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > About direct memory, I think > > >> > alternative > > >> > > 3 > > >> > > > > may > > >> > > > > > > not > > >> > > > > > > > > > having > > >> > > > > > > > > > > > the > > >> > > > > > > > > > > > > > > same > > >> > > > > > > > > > > > > > > > > over > > >> > > > > > > > > > > > > > > > > > > > reservation issue that > > alternative 2 > > >> > > does, > > >> > > > > but > > >> > > > > > at > > >> > > > > > > > the > > >> > > > > > > > > > > cost > > >> > > > > > > > > > > > of > > >> > > > > > > > > > > > > > > risk > > >> > > > > > > > > > > > > > > > of > > >> > > > > > > > > > > > > > > > > > > over > > >> > > > > > > > > > > > > > > > > > > > using memory at the container > > level, > > >> > > which > > >> > > > is > > >> > > > > > not > > >> > > > > > > > > good. > > >> > > > > > > > > > > My > > >> > > > > > > > > > > > > > point > > >> > > > > > > > > > > > > > > is > > >> > > > > > > > > > > > > > > > > > that > > >> > > > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and > > "JVM > > >> > > > > Overhead" > > >> > > > > > > are > > >> > > > > > > > > not > > >> > > > > > > > > > > easy > > >> > > > > > > > > > > > > to > > >> > > > > > > > > > > > > > > > > config. > > >> > > > > > > > > > > > > > > > > > > For > > >> > > > > > > > > > > > > > > > > > > > alternative 2, users might > > configure > > >> > them > > >> > > > > > higher > > >> > > > > > > > than > > >> > > > > > > > > > > what > > >> > > > > > > > > > > > > > > actually > > >> > > > > > > > > > > > > > > > > > > needed, > > >> > > > > > > > > > > > > > > > > > > > just to avoid getting a direct > > OOM. 
> > >> For > > >> > > > > > > alternative > > >> > > > > > > > > 3, > > >> > > > > > > > > > > > users > > >> > > > > > > > > > > > > do > > >> > > > > > > > > > > > > > > not > > >> > > > > > > > > > > > > > > > > get > > >> > > > > > > > > > > > > > > > > > > > direct OOM, so they may not > config > > >> the > > >> > > two > > >> > > > > > > options > > >> > > > > > > > > > > > > aggressively > > >> > > > > > > > > > > > > > > > high. > > >> > > > > > > > > > > > > > > > > > But > > >> > > > > > > > > > > > > > > > > > > > the consequences are risks of > > >> overall > > >> > > > > container > > >> > > > > > > > > memory > > >> > > > > > > > > > > > usage > > >> > > > > > > > > > > > > > > > exceeds > > >> > > > > > > > > > > > > > > > > > the > > >> > > > > > > > > > > > > > > > > > > > budget. > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > Thank you~ > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > Xintong Song > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM > > Till > > >> > > > > Rohrmann < > > >> > > > > > > > > > > > > > > > [hidden email]> > > >> > > > > > > > > > > > > > > > > > > > wrote: > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP > > >> > Xintong. > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > All in all I think it already > > >> looks > > >> > > quite > > >> > > > > > good. > > >> > > > > > > > > > > > Concerning > > >> > > > > > > > > > > > > > the > > >> > > > > > > > > > > > > > > > > first > > >> > > > > > > > > > > > > > > > > > > open > > >> > > > > > > > > > > > > > > > > > > > > question about allocating > memory > > >> > > > segments, > > >> > > > > I > > >> > > > > > > was > > >> > > > > > > > > > > > wondering > > >> > > > > > > > > > > > > > > > whether > > >> > > > > > > > > > > > > > > > > > this > > >> > > > > > > > > > > > > > > > > > > > is > > >> > > > > > > > > > > > > > > > > > > > > strictly necessary to do in > the > > >> > context > > >> > > > of > > >> > > > > > this > > >> > > > > > > > > FLIP > > >> > > > > > > > > > or > > >> > > > > > > > > > > > > > whether > > >> > > > > > > > > > > > > > > > > this > > >> > > > > > > > > > > > > > > > > > > > could > > >> > > > > > > > > > > > > > > > > > > > > be done as a follow up? 
> Without > > >> > knowing > > >> > > > all > > >> > > > > > > > > details, > > >> > > > > > > > > > I > > >> > > > > > > > > > > > > would > > >> > > > > > > > > > > > > > be > > >> > > > > > > > > > > > > > > > > > > concerned > > >> > > > > > > > > > > > > > > > > > > > > that we would widen the scope > of > > >> this > > >> > > > FLIP > > >> > > > > > too > > >> > > > > > > > much > > >> > > > > > > > > > > > because > > >> > > > > > > > > > > > > > we > > >> > > > > > > > > > > > > > > > > would > > >> > > > > > > > > > > > > > > > > > > have > > >> > > > > > > > > > > > > > > > > > > > > to touch all the existing call > > >> sites > > >> > of > > >> > > > the > > >> > > > > > > > > > > MemoryManager > > >> > > > > > > > > > > > > > where > > >> > > > > > > > > > > > > > > > we > > >> > > > > > > > > > > > > > > > > > > > allocate > > >> > > > > > > > > > > > > > > > > > > > > memory segments (this should > > >> mainly > > >> > be > > >> > > > > batch > > >> > > > > > > > > > > operators). > > >> > > > > > > > > > > > > The > > >> > > > > > > > > > > > > > > > > addition > > >> > > > > > > > > > > > > > > > > > > of > > >> > > > > > > > > > > > > > > > > > > > > the memory reservation call to > > the > > >> > > > > > > MemoryManager > > >> > > > > > > > > > should > > >> > > > > > > > > > > > not > > >> > > > > > > > > > > > > > be > > >> > > > > > > > > > > > > > > > > > affected > > >> > > > > > > > > > > > > > > > > > > > by > > >> > > > > > > > > > > > > > > > > > > > > this and I would hope that > this > > is > > >> > the > > >> > > > only > > >> > > > > > > point > > >> > > > > > > > > of > > >> > > > > > > > > > > > > > > interaction > > >> > > > > > > > > > > > > > > > a > > >> > > > > > > > > > > > > > > > > > > > > streaming job would have with > > the > > >> > > > > > > MemoryManager. > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > Concerning the second open > > >> question > > >> > > about > > >> > > > > > > setting > > >> > > > > > > > > or > > >> > > > > > > > > > > not > > >> > > > > > > > > > > > > > > setting > > >> > > > > > > > > > > > > > > > a > > >> > > > > > > > > > > > > > > > > > max > > >> > > > > > > > > > > > > > > > > > > > > direct memory limit, I would > > also > > >> be > > >> > > > > > interested > > >> > > > > > > > why > > >> > > > > > > > > > > Yang > > >> > > > > > > > > > > > > Wang > > >> > > > > > > > > > > > > > > > > thinks > > >> > > > > > > > > > > > > > > > > > > > > leaving it open would be best. > > My > > >> > > concern > > >> > > > > > about > > >> > > > > > > > > this > > >> > > > > > > > > > > > would > > >> > > > > > > > > > > > > be > > >> > > > > > > > > > > > > > > > that > > >> > > > > > > > > > > > > > > > > we > > >> > > > > > > > > > > > > > > > > > > > would > > >> > > > > > > > > > > > > > > > > > > > > be in a similar situation as > we > > >> are > > >> > now > > >> > > > > with > > >> > > > > > > the > > >> > > > > > > > > > > > > > > > > RocksDBStateBackend. 
> > >> > > > > > > > > > > > > > > > > > > If > > >> > > > > > > > > > > > > > > > > > > > > the different memory pools are > > not > > >> > > > clearly > > >> > > > > > > > > separated > > >> > > > > > > > > > > and > > >> > > > > > > > > > > > > can > > >> > > > > > > > > > > > > > > > spill > > >> > > > > > > > > > > > > > > > > > over > > >> > > > > > > > > > > > > > > > > > > > to > > >> > > > > > > > > > > > > > > > > > > > > a different pool, then it is > > quite > > >> > hard > > >> > > > to > > >> > > > > > > > > understand > > >> > > > > > > > > > > > what > > >> > > > > > > > > > > > > > > > exactly > > >> > > > > > > > > > > > > > > > > > > > causes a > > >> > > > > > > > > > > > > > > > > > > > > process to get killed for > using > > >> too > > >> > > much > > >> > > > > > > memory. > > >> > > > > > > > > This > > >> > > > > > > > > > > > could > > >> > > > > > > > > > > > > > > then > > >> > > > > > > > > > > > > > > > > > easily > > >> > > > > > > > > > > > > > > > > > > > > lead to a similar situation > what > > >> we > > >> > > have > > >> > > > > with > > >> > > > > > > the > > >> > > > > > > > > > > > > > cutoff-ratio. > > >> > > > > > > > > > > > > > > > So > > >> > > > > > > > > > > > > > > > > > why > > >> > > > > > > > > > > > > > > > > > > > not > > >> > > > > > > > > > > > > > > > > > > > > setting a sane default value > for > > >> max > > >> > > > direct > > >> > > > > > > > memory > > >> > > > > > > > > > and > > >> > > > > > > > > > > > > giving > > >> > > > > > > > > > > > > > > the > > >> > > > > > > > > > > > > > > > > > user > > >> > > > > > > > > > > > > > > > > > > an > > >> > > > > > > > > > > > > > > > > > > > > option to increase it if he > runs > > >> into > > >> > > an > > >> > > > > OOM. > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > @Xintong, how would > alternative > > 2 > > >> > lead > > >> > > to > > >> > > > > > lower > > >> > > > > > > > > > memory > > >> > > > > > > > > > > > > > > > utilization > > >> > > > > > > > > > > > > > > > > > than > > >> > > > > > > > > > > > > > > > > > > > > alternative 3 where we set the > > >> direct > > >> > > > > memory > > >> > > > > > > to a > > >> > > > > > > > > > > higher > > >> > > > > > > > > > > > > > value? > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > Cheers, > > >> > > > > > > > > > > > > > > > > > > > > Till > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM > > >> > Xintong > > >> > > > > Song < > > >> > > > > > > > > > > > > > > > [hidden email] > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > wrote: > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, > Yang. > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > > >> > > > > > > > > > > > > > > > > > > > > > I think setting a very large > > max > > >> > > direct > > >> > > > > > > memory > > >> > > > > > > > > size > > >> > > > > > > > > > > > > > > definitely > > >> > > > > > > > > > > > > > > > > has > > >> > > > > > > > > > > > > > > > > > > some > > >> > > > > > > > > > > > > > > > > > > > > > good sides. 
E.g., we do not > > >> worry > > >> > > about > > >> > > > > > > direct > > >> > > > > > > > > OOM, > > >> > > > > > > > > > > and > > >> > > > > > > > > > > > > we > > >> > > > > > > > > > > > > > > > don't > > >> > > > > > > > > > > > > > > > > > even > > >> > > > > > > > > > > > > > > > > > > > > need > > >> > > > > > > > > > > > > > > > > > > > > > to allocate managed / > network > > >> > memory > > >> > > > with > > >> > > > > > > > > > > > > > Unsafe.allocate() . > > >> > > > > > > > > > > > > > > > > > > > > > However, there are also some > > >> down > > >> > > sides > > >> > > > > of > > >> > > > > > > > doing > > >> > > > > > > > > > > this. > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > - One thing I can think > of > > is > > >> > that > > >> > > > if > > >> > > > > a > > >> > > > > > > task > > >> > > > > > > > > > > > executor > > >> > > > > > > > > > > > > > > > > container > > >> > > > > > > > > > > > > > > > > > is > > >> > > > > > > > > > > > > > > > > > > > > > killed due to overusing > > >> memory, > > >> > it > > >> > > > > could > > >> > > > > > > be > > >> > > > > > > > > hard > > >> > > > > > > > > > > for > > >> > > > > > > > > > > > > use > > >> > > > > > > > > > > > > > > to > > >> > > > > > > > > > > > > > > > > know > > >> > > > > > > > > > > > > > > > > > > > which > > >> > > > > > > > > > > > > > > > > > > > > > part > > >> > > > > > > > > > > > > > > > > > > > > > of the memory is > overused. > > >> > > > > > > > > > > > > > > > > > > > > > - Another down side is > that > > >> the > > >> > > JVM > > >> > > > > > never > > >> > > > > > > > > > trigger > > >> > > > > > > > > > > GC > > >> > > > > > > > > > > > > due > > >> > > > > > > > > > > > > > > to > > >> > > > > > > > > > > > > > > > > > > reaching > > >> > > > > > > > > > > > > > > > > > > > > max > > >> > > > > > > > > > > > > > > > > > > > > > direct memory limit, > > because > > >> the > > >> > > > limit > > >> > > > > > is > > >> > > > > > > > too > > >> > > > > > > > > > high > > >> > > > > > > > > > > > to > > >> > > > > > > > > > > > > be > > >> > > > > > > > > > > > > > > > > > reached. > > >> > > > > > > > > > > > > > > > > > > > That > > >> > > > > > > > > > > > > > > > > > > > > > means we kind of relay on > > >> heap > > >> > > > memory > > >> > > > > to > > >> > > > > > > > > trigger > > >> > > > > > > > > > > GC > > >> > > > > > > > > > > > > and > > >> > > > > > > > > > > > > > > > > release > > >> > > > > > > > > > > > > > > > > > > > direct > > >> > > > > > > > > > > > > > > > > > > > > > memory. That could be a > > >> problem > > >> > in > > >> > > > > cases > > >> > > > > > > > where > > >> > > > > > > > > > we > > >> > > > > > > > > > > > have > > >> > > > > > > > > > > > > > > more > > >> > > > > > > > > > > > > > > > > > direct > > >> > > > > > > > > > > > > > > > > > > > > > memory > > >> > > > > > > > > > > > > > > > > > > > > > usage but not enough heap > > >> > activity > > >> > > > to > > >> > > > > > > > trigger > > >> > > > > > > > > > the > > >> > > > > > > > > > > > GC. > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > Maybe you can share your > > reasons > > >> > for > > >> > > > > > > preferring > > >> > > > > > > > > > > > setting a > > >> > > > > > > > > > > > > > > very > > >> > > > > > > > > > > > > > > > > > large > > >> > > > > > > > > > > > > > > > > > > > > value, > > >> > > > > > > > > > > > > > > > > > > > > > if there are anything else I > > >> > > > overlooked. 
> > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > > >> > > > > > > > > > > > > > > > > > > > > > If there is any conflict > > between > > >> > > > multiple > > >> > > > > > > > > > > configuration > > >> > > > > > > > > > > > > > that > > >> > > > > > > > > > > > > > > > user > > >> > > > > > > > > > > > > > > > > > > > > > explicitly specified, I > think > > we > > >> > > should > > >> > > > > > throw > > >> > > > > > > > an > > >> > > > > > > > > > > error. > > >> > > > > > > > > > > > > > > > > > > > > > I think doing checking on > the > > >> > client > > >> > > > side > > >> > > > > > is > > >> > > > > > > a > > >> > > > > > > > > good > > >> > > > > > > > > > > > idea, > > >> > > > > > > > > > > > > > so > > >> > > > > > > > > > > > > > > > that > > >> > > > > > > > > > > > > > > > > > on > > >> > > > > > > > > > > > > > > > > > > > > Yarn / > > >> > > > > > > > > > > > > > > > > > > > > > K8s we can discover the > > problem > > >> > > before > > >> > > > > > > > submitting > > >> > > > > > > > > > the > > >> > > > > > > > > > > > > Flink > > >> > > > > > > > > > > > > > > > > > cluster, > > >> > > > > > > > > > > > > > > > > > > > > which > > >> > > > > > > > > > > > > > > > > > > > > > is always a good thing. > > >> > > > > > > > > > > > > > > > > > > > > > But we can not only rely on > > the > > >> > > client > > >> > > > > side > > >> > > > > > > > > > checking, > > >> > > > > > > > > > > > > > because > > >> > > > > > > > > > > > > > > > for > > >> > > > > > > > > > > > > > > > > > > > > > standalone cluster > > TaskManagers > > >> on > > >> > > > > > different > > >> > > > > > > > > > machines > > >> > > > > > > > > > > > may > > >> > > > > > > > > > > > > > > have > > >> > > > > > > > > > > > > > > > > > > > different > > >> > > > > > > > > > > > > > > > > > > > > > configurations and the > client > > >> does > > >> > > see > > >> > > > > > that. > > >> > > > > > > > > > > > > > > > > > > > > > What do you think? > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 > PM > > >> Yang > > >> > > > Wang > > >> > > > > < > > >> > > > > > > > > > > > > > > > [hidden email]> > > >> > > > > > > > > > > > > > > > > > > > wrote: > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed > > >> > proposal. > > >> > > > > After > > >> > > > > > > all > > >> > > > > > > > > the > > >> > > > > > > > > > > > memory > > >> > > > > > > > > > > > > > > > > > > configuration > > >> > > > > > > > > > > > > > > > > > > > > are > > >> > > > > > > > > > > > > > > > > > > > > > > introduced, it will be > more > > >> > > powerful > > >> > > > to > > >> > > > > > > > control > > >> > > > > > > > > > the > > >> > > > > > > > > > > > > flink > > >> > > > > > > > > > > > > > > > > memory > > >> > > > > > > > > > > > > > > > > > > > > usage. 
I > > >> > > > > > > > > > > > > > > > > > > > > > > just have few questions > > about > > >> it. > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > - Native and Direct > > Memory > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > We do not differentiate > user > > >> > direct > > >> > > > > > memory > > >> > > > > > > > and > > >> > > > > > > > > > > native > > >> > > > > > > > > > > > > > > memory. > > >> > > > > > > > > > > > > > > > > > They > > >> > > > > > > > > > > > > > > > > > > > are > > >> > > > > > > > > > > > > > > > > > > > > > all > > >> > > > > > > > > > > > > > > > > > > > > > > included in task off-heap > > >> memory. > > >> > > > > Right? > > >> > > > > > > So i > > >> > > > > > > > > > don’t > > >> > > > > > > > > > > > > think > > >> > > > > > > > > > > > > > > we > > >> > > > > > > > > > > > > > > > > > could > > >> > > > > > > > > > > > > > > > > > > > not > > >> > > > > > > > > > > > > > > > > > > > > > set > > >> > > > > > > > > > > > > > > > > > > > > > > the > -XX:MaxDirectMemorySize > > >> > > > properly. I > > >> > > > > > > > prefer > > >> > > > > > > > > > > > leaving > > >> > > > > > > > > > > > > > it a > > >> > > > > > > > > > > > > > > > > very > > >> > > > > > > > > > > > > > > > > > > > large > > >> > > > > > > > > > > > > > > > > > > > > > > value. > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > If the sum of and > > fine-grained > > >> > > > > > > memory(network > > >> > > > > > > > > > > memory, > > >> > > > > > > > > > > > > > > managed > > >> > > > > > > > > > > > > > > > > > > memory, > > >> > > > > > > > > > > > > > > > > > > > > > etc.) > > >> > > > > > > > > > > > > > > > > > > > > > > is larger than total > process > > >> > > memory, > > >> > > > > how > > >> > > > > > do > > >> > > > > > > > we > > >> > > > > > > > > > deal > > >> > > > > > > > > > > > > with > > >> > > > > > > > > > > > > > > this > > >> > > > > > > > > > > > > > > > > > > > > situation? > > >> > > > > > > > > > > > > > > > > > > > > > Do > > >> > > > > > > > > > > > > > > > > > > > > > > we need to check the > memory > > >> > > > > configuration > > >> > > > > > > in > > >> > > > > > > > > > > client? 
> > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > > >> > > [hidden email]> > > >> > > > > > > > > > 于2019年8月7日周三 > > >> > > > > > > > > > > > > > > 下午10:14写道: > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > We would like to start a > > >> > > discussion > > >> > > > > > > thread > > >> > > > > > > > on > > >> > > > > > > > > > > > > "FLIP-49: > > >> > > > > > > > > > > > > > > > > Unified > > >> > > > > > > > > > > > > > > > > > > > > Memory > > >> > > > > > > > > > > > > > > > > > > > > > > > Configuration for > > >> > > > TaskExecutors"[1], > > >> > > > > > > where > > >> > > > > > > > we > > >> > > > > > > > > > > > > describe > > >> > > > > > > > > > > > > > > how > > >> > > > > > > > > > > > > > > > to > > >> > > > > > > > > > > > > > > > > > > > improve > > >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory > > >> > > configurations. > > >> > > > > The > > >> > > > > > > > FLIP > > >> > > > > > > > > > > > document > > >> > > > > > > > > > > > > > is > > >> > > > > > > > > > > > > > > > > mostly > > >> > > > > > > > > > > > > > > > > > > > based > > >> > > > > > > > > > > > > > > > > > > > > > on > > >> > > > > > > > > > > > > > > > > > > > > > > an > > >> > > > > > > > > > > > > > > > > > > > > > > > early design "Memory > > >> Management > > >> > > and > > >> > > > > > > > > > Configuration > > >> > > > > > > > > > > > > > > > > Reloaded"[2] > > >> > > > > > > > > > > > > > > > > > by > > >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > > >> > > > > > > > > > > > > > > > > > > > > > > > with updates from > > follow-up > > >> > > > > discussions > > >> > > > > > > > both > > >> > > > > > > > > > > online > > >> > > > > > > > > > > > > and > > >> > > > > > > > > > > > > > > > > > offline. > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses > > several > > >> > > > > > shortcomings > > >> > > > > > > of > > >> > > > > > > > > > > current > > >> > > > > > > > > > > > > > > (Flink > > >> > > > > > > > > > > > > > > > > 1.9) > > >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory > > >> > > configuration. > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > - Different > > configuration > > >> > for > > >> > > > > > > Streaming > > >> > > > > > > > > and > > >> > > > > > > > > > > > Batch. > > >> > > > > > > > > > > > > > > > > > > > > > > > - Complex and > difficult > > >> > > > > > configuration > > >> > > > > > > of > > >> > > > > > > > > > > RocksDB > > >> > > > > > > > > > > > > in > > >> > > > > > > > > > > > > > > > > > Streaming. > > >> > > > > > > > > > > > > > > > > > > > > > > > - Complicated, > > uncertain > > >> and > > >> > > > hard > > >> > > > > to > > >> > > > > > > > > > > understand. > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the > > >> > problems > > >> > > > can > > >> > > > > > be > > >> > > > > > > > > > > summarized > > >> > > > > > > > > > > > > as > > >> > > > > > > > > > > > > > > > > follows. 
> > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > - Extend memory > manager > > >> to > > >> > > also > > >> > > > > > > account > > >> > > > > > > > > for > > >> > > > > > > > > > > > memory > > >> > > > > > > > > > > > > > > usage > > >> > > > > > > > > > > > > > > > > by > > >> > > > > > > > > > > > > > > > > > > > state > > >> > > > > > > > > > > > > > > > > > > > > > > > backends. > > >> > > > > > > > > > > > > > > > > > > > > > > > - Modify how > > TaskExecutor > > >> > > memory > > >> > > > > is > > >> > > > > > > > > > > partitioned > > >> > > > > > > > > > > > > > > > accounted > > >> > > > > > > > > > > > > > > > > > > > > individual > > >> > > > > > > > > > > > > > > > > > > > > > > > memory reservations > and > > >> > pools. > > >> > > > > > > > > > > > > > > > > > > > > > > > - Simplify memory > > >> > > configuration > > >> > > > > > > options > > >> > > > > > > > > and > > >> > > > > > > > > > > > > > > calculations > > >> > > > > > > > > > > > > > > > > > > logics. > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > Please find more details > > in > > >> the > > >> > > > FLIP > > >> > > > > > wiki > > >> > > > > > > > > > > document > > >> > > > > > > > > > > > > [1]. > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > (Please note that the > > early > > >> > > design > > >> > > > > doc > > >> > > > > > > [2] > > >> > > > > > > > is > > >> > > > > > > > > > out > > >> > > > > > > > > > > > of > > >> > > > > > > > > > > > > > > sync, > > >> > > > > > > > > > > > > > > > > and > > >> > > > > > > > > > > > > > > > > > it > > >> > > > > > > > > > > > > > > > > > > > is > > >> > > > > > > > > > > > > > > > > > > > > > > > appreciated to have the > > >> > > discussion > > >> > > > in > > >> > > > > > > this > > >> > > > > > > > > > > mailing > > >> > > > > > > > > > > > > list > > >> > > > > > > > > > > > > > > > > > thread.) > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your > > >> > > feedbacks. 
> > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > [1] > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > [2] > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > > > > |
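The client-side checking discussed in the messages above (rejecting configurations whose explicitly specified fine-grained pools cannot fit into the total process memory) could look roughly like the following sketch. The holder class and its field names are made up for illustration; they are not Flink's actual configuration keys or classes.

```java
/**
 * Illustrative sketch of a client-side sanity check as discussed above.
 * The MemoryBudget type and field names are invented for the example.
 */
public class MemoryConfigValidator {

    /** Simple holder for explicitly configured sizes, in bytes. */
    public static final class MemoryBudget {
        long totalProcessMemory;
        long networkMemory;
        long managedMemory;
        long taskHeapMemory;
        long taskOffHeapMemory;
        long metaspace;
        long jvmOverhead;
    }

    /** Fails fast (before cluster submission) if the fine-grained pools cannot fit. */
    public static void validate(MemoryBudget b) {
        long sumOfPools = b.networkMemory + b.managedMemory + b.taskHeapMemory
                + b.taskOffHeapMemory + b.metaspace + b.jvmOverhead;
        if (sumOfPools > b.totalProcessMemory) {
            throw new IllegalArgumentException(
                    "Configured memory pools (" + sumOfPools + " bytes) exceed the "
                            + "total process memory (" + b.totalProcessMemory + " bytes).");
        }
    }
}
```

Running such a check in the client covers Yarn / K8s submissions, while standalone TaskManagers would still need to run the same validation locally, since the client does not see their individual configurations.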
In reply to this post by Xintong Song
I just updated the FLIP wiki page [1], with the following changes:
- Network memory uses JVM direct memory, and is accounted for when setting the JVM max direct memory size parameter.
- Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
- Remove 'supporting memory reservation' from the scope of this FLIP.

@till @stephan, please take another look and see if there are any other concerns.

Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
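To picture the updated startup flow, the following is a rough, non-authoritative Java sketch: a utility computes the memory sizes once, outside the TaskExecutor JVM, and the launcher hands them over as dynamic `-Dkey=value` options plus JVM flags, with network memory counted into -XX:MaxDirectMemorySize. The configuration keys used below are placeholders, not necessarily Flink's real option names.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Rough sketch of the flow described above: sizes are computed once, outside
 * the TaskExecutor JVM, and handed over as dynamic -Dkey=value options plus
 * JVM flags. The keys here are placeholders, not Flink's real options.
 */
public class TaskExecutorLaunchSketch {

    public static void main(String[] args) {
        long mb = 1L << 20;

        // Values a calculation utility might derive from the user configuration.
        long taskHeap = 512 * mb;
        long networkMemory = 128 * mb;   // uses JVM direct memory
        long taskOffHeap = 64 * mb;
        long jvmOverhead = 128 * mb;
        long managedMemory = 256 * mb;   // allocated natively, outside the direct limit

        List<String> jvmArgs = new ArrayList<>();
        jvmArgs.add("-Xmx" + taskHeap);
        // Network memory is counted into the max direct memory size,
        // together with task off-heap memory and JVM overhead.
        jvmArgs.add("-XX:MaxDirectMemorySize=" + (networkMemory + taskOffHeap + jvmOverhead));

        // Calculated sizes are passed as dynamic configuration values instead
        // of ENV variables or a rewritten flink-conf.yaml (placeholder keys).
        jvmArgs.add("-Dtaskmanager.memory.network.size=" + networkMemory);
        jvmArgs.add("-Dtaskmanager.memory.managed.size=" + managedMemory);

        System.out.println(String.join(" ", jvmArgs));
    }
}
```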
Assuming that these values don't differ >> >> from >> >> > the values with which the JVM is started, it should be possible to >> >> > recompute them in the Flink process in order to set the values. >> >> > >> >> > >> >> > >> >> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> >> wrote: >> >> > >> >> > > When computing the values in the JVM process after it started, how >> >> would >> >> > > you deal with values like Max Direct Memory, Metaspace size. native >> >> > memory >> >> > > reservation (reduce heap size), etc? All the values that are >> >> parameters >> >> > to >> >> > > the JVM process and that need to be supplied at process startup? >> >> > > >> >> > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann < >> [hidden email]> >> >> > > wrote: >> >> > > >> >> > > > Thanks for the clarification. I have some more comments: >> >> > > > >> >> > > > - I would actually split the logic to compute the process memory >> >> > > > requirements and storing the values into two things. E.g. one >> could >> >> > name >> >> > > > the former TaskExecutorProcessUtility and the latter >> >> > > > TaskExecutorProcessMemory. But we can discuss this on the PR >> since >> >> it's >> >> > > > just a naming detail. >> >> > > > >> >> > > > - Generally, I'm not opposed to making configuration values >> >> overridable >> >> > > by >> >> > > > ENV variables. I think this is a very good idea and makes the >> >> > > > configurability of Flink processes easier. However, I think that >> >> adding >> >> > > > this functionality should not be part of this FLIP because it >> would >> >> > > simply >> >> > > > widen the scope unnecessarily. >> >> > > > >> >> > > > The reasons why I believe it is unnecessary are the following: >> For >> >> Yarn >> >> > > we >> >> > > > already create write a flink-conf.yaml which could be populated >> with >> >> > the >> >> > > > memory settings. For the other processes it should not make a >> >> > difference >> >> > > > whether the loaded Configuration is populated with the memory >> >> settings >> >> > > from >> >> > > > ENV variables or by using TaskExecutorProcessUtility to compute >> the >> >> > > missing >> >> > > > values from the loaded configuration. If the latter would not be >> >> > possible >> >> > > > (wrong or missing configuration values), then we should not have >> >> been >> >> > > able >> >> > > > to actually start the process in the first place. >> >> > > > >> >> > > > - Concerning the memory reservation: I agree with you that we >> need >> >> the >> >> > > > memory reservation functionality to make streaming jobs work with >> >> > > "managed" >> >> > > > memory. However, w/o this functionality the whole Flip would >> already >> >> > > bring >> >> > > > a good amount of improvements to our users when running batch >> jobs. >> >> > > > Moreover, by keeping the scope smaller we can complete the FLIP >> >> faster. >> >> > > > Hence, I would propose to address the memory reservation >> >> functionality >> >> > > as a >> >> > > > follow up FLIP (which Yu is working on if I'm not mistaken). >> >> > > > >> >> > > > Cheers, >> >> > > > Till >> >> > > > >> >> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang < >> [hidden email]> >> >> > > wrote: >> >> > > > >> >> > > > > Just add my 2 cents. >> >> > > > > >> >> > > > > Using environment variables to override the configuration for >> >> > different >> >> > > > > taskmanagers is better. >> >> > > > > We do not need to generate dedicated flink-conf.yaml for all >> >> > > > taskmanagers. 
>> >> > > > > A common flink-conf.yam and different environment variables are >> >> > enough. >> >> > > > > By reducing the distributed cached files, it could make >> launching >> >> a >> >> > > > > taskmanager faster. >> >> > > > > >> >> > > > > Stephan gives a good suggestion that we could move the logic >> into >> >> > > > > "GlobalConfiguration.loadConfig()" method. >> >> > > > > Maybe the client could also benefit from this. Different users >> do >> >> not >> >> > > > have >> >> > > > > to export FLINK_CONF_DIR to update few config options. >> >> > > > > >> >> > > > > >> >> > > > > Best, >> >> > > > > Yang >> >> > > > > >> >> > > > > Stephan Ewen <[hidden email]> 于2019年8月28日周三 上午1:21写道: >> >> > > > > >> >> > > > > > One note on the Environment Variables and Configuration >> >> discussion. >> >> > > > > > >> >> > > > > > My understanding is that passed ENV variables are added to >> the >> >> > > > > > configuration in the "GlobalConfiguration.loadConfig()" >> method >> >> (or >> >> > > > > > similar). >> >> > > > > > For all the code inside Flink, it looks like the data was in >> the >> >> > > config >> >> > > > > to >> >> > > > > > start with, just that the scripts that compute the variables >> can >> >> > pass >> >> > > > the >> >> > > > > > values to the process without actually needing to write a >> file. >> >> > > > > > >> >> > > > > > For example the "GlobalConfiguration.loadConfig()" method >> would >> >> > take >> >> > > > any >> >> > > > > > ENV variable prefixed with "flink" and add it as a config >> key. >> >> > > > > > "flink_taskmanager_memory_size=2g" would become >> >> > > > "taskmanager.memory.size: >> >> > > > > > 2g". >> >> > > > > > >> >> > > > > > >> >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song < >> >> > [hidden email]> >> >> > > > > > wrote: >> >> > > > > > >> >> > > > > > > Thanks for the comments, Till. >> >> > > > > > > >> >> > > > > > > I've also seen your comments on the wiki page, but let's >> keep >> >> the >> >> > > > > > > discussion here. >> >> > > > > > > >> >> > > > > > > - Regarding 'TaskExecutorSpecifics', how do you think about >> >> > naming >> >> > > it >> >> > > > > > > 'TaskExecutorResourceSpecifics'. >> >> > > > > > > - Regarding passing memory configurations into task >> executors, >> >> > I'm >> >> > > in >> >> > > > > > favor >> >> > > > > > > of do it via environment variables rather than >> configurations, >> >> > with >> >> > > > the >> >> > > > > > > following two reasons. >> >> > > > > > > - It is easier to keep the memory options once calculate >> >> not to >> >> > > be >> >> > > > > > > changed with environment variables rather than >> configurations. >> >> > > > > > > - I'm not sure whether we should write the configuration >> in >> >> > > startup >> >> > > > > > > scripts. Writing changes into the configuration files when >> >> > running >> >> > > > the >> >> > > > > > > startup scripts does not sounds right to me. Or we could >> make >> >> a >> >> > > copy >> >> > > > of >> >> > > > > > > configuration files per flink cluster, and make the task >> >> executor >> >> > > to >> >> > > > > load >> >> > > > > > > from the copy, and clean up the copy after the cluster is >> >> > shutdown, >> >> > > > > which >> >> > > > > > > is complicated. (I think this is also what Stephan means in >> >> his >> >> > > > comment >> >> > > > > > on >> >> > > > > > > the wiki page?) >> >> > > > > > > - Regarding reserving memory, I think this change should be >> >> > > included >> >> > > > in >> >> > > > > > > this FLIP. 
I think a big part of motivations of this FLIP >> is >> >> to >> >> > > unify >> >> > > > > > > memory configuration for streaming / batch and make it easy >> >> for >> >> > > > > > configuring >> >> > > > > > > rocksdb memory. If we don't support memory reservation, >> then >> >> > > > streaming >> >> > > > > > jobs >> >> > > > > > > cannot use managed memory (neither on-heap or off-heap), >> which >> >> > > makes >> >> > > > > this >> >> > > > > > > FLIP incomplete. >> >> > > > > > > - Regarding network memory, I think you are right. I think >> we >> >> > > > probably >> >> > > > > > > don't need to change network stack from using direct >> memory to >> >> > > using >> >> > > > > > unsafe >> >> > > > > > > native memory. Network memory size is deterministic, >> cannot be >> >> > > > reserved >> >> > > > > > as >> >> > > > > > > managed memory does, and cannot be overused. I think it >> also >> >> > works >> >> > > if >> >> > > > > we >> >> > > > > > > simply keep using direct memory for network and include it >> in >> >> jvm >> >> > > max >> >> > > > > > > direct memory size. >> >> > > > > > > >> >> > > > > > > Thank you~ >> >> > > > > > > >> >> > > > > > > Xintong Song >> >> > > > > > > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann < >> >> > > [hidden email]> >> >> > > > > > > wrote: >> >> > > > > > > >> >> > > > > > > > Hi Xintong, >> >> > > > > > > > >> >> > > > > > > > thanks for addressing the comments and adding a more >> >> detailed >> >> > > > > > > > implementation plan. I have a couple of comments >> concerning >> >> the >> >> > > > > > > > implementation plan: >> >> > > > > > > > >> >> > > > > > > > - The name `TaskExecutorSpecifics` is not really >> >> descriptive. >> >> > > > > Choosing >> >> > > > > > a >> >> > > > > > > > different name could help here. >> >> > > > > > > > - I'm not sure whether I would pass the memory >> >> configuration to >> >> > > the >> >> > > > > > > > TaskExecutor via environment variables. I think it would >> be >> >> > > better >> >> > > > to >> >> > > > > > > write >> >> > > > > > > > it into the configuration one uses to start the TM >> process. >> >> > > > > > > > - If possible, I would exclude the memory reservation >> from >> >> this >> >> > > > FLIP >> >> > > > > > and >> >> > > > > > > > add this as part of a dedicated FLIP. >> >> > > > > > > > - If possible, then I would exclude changes to the >> network >> >> > stack >> >> > > > from >> >> > > > > > > this >> >> > > > > > > > FLIP. Maybe we can simply say that the direct memory >> needed >> >> by >> >> > > the >> >> > > > > > > network >> >> > > > > > > > stack is the framework direct memory requirement. >> Changing >> >> how >> >> > > the >> >> > > > > > memory >> >> > > > > > > > is allocated can happen in a second step. This would keep >> >> the >> >> > > scope >> >> > > > > of >> >> > > > > > > this >> >> > > > > > > > FLIP smaller. >> >> > > > > > > > >> >> > > > > > > > Cheers, >> >> > > > > > > > Till >> >> > > > > > > > >> >> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < >> >> > > > [hidden email]> >> >> > > > > > > > wrote: >> >> > > > > > > > >> >> > > > > > > > > Hi everyone, >> >> > > > > > > > > >> >> > > > > > > > > I just updated the FLIP document on wiki [1], with the >> >> > > following >> >> > > > > > > changes. >> >> > > > > > > > > >> >> > > > > > > > > - Removed open question regarding MemorySegment >> >> > allocation. 
As discussed, we exclude this topic from the scope of this FLIP.
- Updated content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

@Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.

@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643

@Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB. That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).
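As a hedged illustration of the behaviour Stephan points to (this snippet is not from the thread): when direct buffers are allocated close to the -XX:MaxDirectMemorySize limit, the JDK's java.nio.Bits.reserveMemory() first tries to reclaim already released buffers, may invoke System.gc(), and only throws an OutOfMemoryError if that still does not free enough space. A tiny experiment, run for example with -XX:MaxDirectMemorySize=64m, makes this visible:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    public class DirectMemoryLimitDemo {

        public static void main(String[] args) {
            List<ByteBuffer> buffers = new ArrayList<>();
            try {
                // Keep allocating 8 MB direct buffers until the configured limit is hit.
                while (true) {
                    buffers.add(ByteBuffer.allocateDirect(8 * 1024 * 1024));
                    System.out.println("Allocated " + buffers.size() * 8 + " MB of direct memory");
                }
            } catch (OutOfMemoryError e) {
                // Thrown only after Bits.reserveMemory() failed to reclaim enough released buffers;
                // buffers that are still referenced (or not yet collected) cannot be reclaimed.
                System.out.println("Direct memory limit reached: " + e.getMessage());
            }
        }
    }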
On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:

Thanks for the inputs, Jingsong. Let me try to summarize your points. Please correct me if I'm wrong.

- Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
- The JVM does not wait for GC when allocating a direct memory buffer. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory slower than the direct memory is allocated.

Am I understanding this correctly?

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email].invalid> wrote:

Hi Stephan,

About option 2:
If additional threads are not cleanly shut down before we exit the task: in the current case of memory reuse, the task has already freed the memory it uses. If this memory is handed to other tasks while asynchronous threads of the exited task may still be writing to it, there will be concurrency safety problems, and even errors in user computing results.

So I think this is a serious and intolerable bug. No matter which option we choose, it should be avoided.

About direct memory cleaned by GC:
I don't think it is a good idea. I've encountered many situations where the GC came too late and caused a DirectMemory OOM. Releasing and allocating DirectMemory depends on the type of user job, which is often beyond our control.
Best,
Jingsong Lee

------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Send Time: Monday, Aug 19, 2019, 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

My main concern with option 2 (manually release memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.

The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already; that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarizing it this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep the memory manager segfault safe, as long as we always de-allocate a memory segment only when it is released by the memory consumers. Only if a memory consumer keeps using the buffer of a memory segment after releasing it would we run into trouble, and in that case we do want the job to fail so we detect the memory leak early.

For option 1.2, I don't think this is a good idea.
Not only may the assumption (that regular GC is enough to clean up direct buffers) not always hold, it also makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If a library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if the JVM default max direct memory limit is never reached, we do not get a direct memory OOM, and it becomes very hard to understand which part of the configuration needs to be updated.

Option 1.1 has a similar problem as 1.2, if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
- The "-XX:MaxDirectMemorySize" limit (option 1.1) is one way to do this.
- Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed the container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.
My first guess is that the options will be easiest to do in the following order:

- Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

- Option 2. It is actually quite simple to implement; we could try how segfault safe we are at the moment.

- Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan
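For readers following along, a minimal sketch of what option 2 (manual allocation and explicit release) could look like; the class, its fields and the cleanup rule are assumptions for illustration, not the FLIP's actual design:

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    /** Sketch of an off-heap segment whose memory is released explicitly, not by the GC. */
    final class UnsafeOffHeapSegment {

        private static final Unsafe UNSAFE = getUnsafe();

        private final long size;
        private long address; // 0 once released

        UnsafeOffHeapSegment(long size) {
            this.size = size;
            // Native memory, not counted against -XX:MaxDirectMemorySize.
            this.address = UNSAFE.allocateMemory(size);
        }

        void putLong(long offset, long value) {
            if (address == 0L || offset < 0 || offset + 8 > size) {
                // Accessing a released segment must fail fast instead of writing into re-used memory.
                throw new IllegalStateException("segment released or offset out of bounds");
            }
            UNSAFE.putLong(address + offset, value);
        }

        /** Must only be called from the task's cleanup path, after all helper threads have shut down. */
        void free() {
            if (address != 0L) {
                UNSAFE.freeMemory(address);
                address = 0L;
            }
        }

        private static Unsafe getUnsafe() {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                return (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new Error(e);
            }
        }
    }

The guard in putLong() and the single-owner free() call are what "segfault safe" boils down to in this option: a stale consumer should hit a Java exception, never a freed native address.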
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, what this FLIP proposes is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which are exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification, Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory.
Even worse, she then tries to decrease memory budgets which are not fully used and which hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead can potentially exceed 200MB, then

- Alternative 2 suffers from frequent OOMs.
To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this reduces the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
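Restating the arithmetic of this example as a tiny sketch (the variable names and the 250MB bump are illustrative only, not Flink options):

    public class MemoryBudgetExample {

        public static void main(String[] args) {
            long totalProcessMb = 1024;  // total process memory: 1 GB
            long jvmDirectMb = 200;      // task off-heap + JVM overhead (alternative 2's -XX:MaxDirectMemorySize)
            long otherMb = totalProcessMb - jvmDirectMb; // heap, metaspace, off-heap managed, network = 800 MB

            // Alternative 2: bumping the direct memory budget shrinks everything else.
            long increasedDirectMb = 250;
            long otherAfterIncreaseMb = totalProcessMb - increasedDirectMb; // 750 MB

            System.out.printf("alt 2: direct=%dMB, other=%dMB -> after increase: direct=%dMB, other=%dMB%n",
                    jvmDirectMb, otherMb, increasedDirectMb, otherAfterIncreaseMb);
            // Alternative 3 would instead set -XX:MaxDirectMemorySize far above 200 MB (e.g. 1 TB),
            // leaving the 800 MB split unchanged while relying on unused head room in the other pools.
        }
    }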
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.
Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory
My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

> Memory Calculation
I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 10:07 PM Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP.
This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking over-use of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.
Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP, Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
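(As an aside, a rough sketch of what such a reservation-only interaction might look like from a streaming consumer's point of view; the interface and method names below are assumed from the discussion, not the final API:)

    /** Hypothetical interface capturing only the reservation calls discussed in this FLIP. */
    interface ManagedMemoryReservations {
        void reserveMemory(Object owner, long sizeInBytes) throws Exception;
        void releaseMemory(Object owner, long sizeInBytes);
    }

    /** Sketch of a streaming consumer (e.g. a state backend) using reservations instead of segments. */
    final class ReservingStateBackend implements AutoCloseable {

        private final ManagedMemoryReservations memoryManager;
        private final long reservedBytes;

        ReservingStateBackend(ManagedMemoryReservations memoryManager, long reservedBytes) throws Exception {
            this.memoryManager = memoryManager;
            this.reservedBytes = reservedBytes;
            // Account for native memory (e.g. a RocksDB block cache) without handing out MemorySegments.
            memoryManager.reserveMemory(this, reservedBytes);
        }

        @Override
        public void close() {
            // Give the budget back when the task is disposed.
            memoryManager.releaseMemory(this, reservedBytes);
        }
    }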
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested in why Yang Wang thinks leaving it open would be best. My concern would be that we end up in a similar situation as we are now in with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM.

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?
Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached.
That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring a very large value, in case there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing.
But we cannot rely on the client-side checking alone, because for a standalone cluster TaskManagers on different machines may have different configurations, and the client does not see that. What do you think?

Thank you~

Xintong Song
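A minimal sketch of such a conflict check (the parameter names, pool split and exception type are placeholders for illustration, not the FLIP's actual configuration keys):

    public final class MemoryConfigCheck {

        /** Throws if the explicitly configured fine-grained pools cannot fit into the total process memory. */
        static void checkTotalProcessMemory(long totalProcessMb, long heapMb, long managedMb,
                                            long networkMb, long taskOffHeapMb, long jvmOverheadMb) {
            long sum = heapMb + managedMb + networkMb + taskOffHeapMb + jvmOverheadMb;
            if (sum > totalProcessMb) {
                throw new IllegalArgumentException(String.format(
                        "Configured memory pools (%d MB) exceed the total process memory (%d MB). "
                                + "Please decrease one of the explicitly configured pools or increase the total.",
                        sum, totalProcessMb));
            }
        }

        public static void main(String[] args) {
            // Illustrative numbers only: 1 GB total with pools that just fit.
            checkTotalProcessMemory(1024, 512, 192, 96, 128, 96);
            System.out.println("configuration is consistent");
        }
    }

Running the same check in both the client and the TaskManager startup path would cover the standalone case Xintong mentions, where each machine may carry its own configuration.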
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory
We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation
If the sum of the fine-grained memory pools (network memory, managed memory, etc.)
is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

On Wed, Aug 7, 2019 at 10:14 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.
>> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the >> >> > problems >> >> > > > can >> >> > > > > > be >> >> > > > > > > > > > > summarized >> >> > > > > > > > > > > > > as >> >> > > > > > > > > > > > > > > > > follows. >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Extend memory >> manager >> >> to >> >> > > also >> >> > > > > > > account >> >> > > > > > > > > for >> >> > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > > usage >> >> > > > > > > > > > > > > > > > > by >> >> > > > > > > > > > > > > > > > > > > > state >> >> > > > > > > > > > > > > > > > > > > > > > > > backends. >> >> > > > > > > > > > > > > > > > > > > > > > > > - Modify how >> TaskExecutor >> >> > > memory >> >> > > > > is >> >> > > > > > > > > > > partitioned >> >> > > > > > > > > > > > > > > > accounted >> >> > > > > > > > > > > > > > > > > > > > > individual >> >> > > > > > > > > > > > > > > > > > > > > > > > memory reservations >> and >> >> > pools. >> >> > > > > > > > > > > > > > > > > > > > > > > > - Simplify memory >> >> > > configuration >> >> > > > > > > options >> >> > > > > > > > > and >> >> > > > > > > > > > > > > > > calculations >> >> > > > > > > > > > > > > > > > > > > logics. >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please find more details >> in >> >> the >> >> > > > FLIP >> >> > > > > > wiki >> >> > > > > > > > > > > document >> >> > > > > > > > > > > > > [1]. >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please note that the >> early >> >> > > design >> >> > > > > doc >> >> > > > > > > [2] >> >> > > > > > > > is >> >> > > > > > > > > > out >> >> > > > > > > > > > > > of >> >> > > > > > > > > > > > > > > sync, >> >> > > > > > > > > > > > > > > > > and >> >> > > > > > > > > > > > > > > > > > it >> >> > > > > > > > > > > > > > > > > > > > is >> >> > > > > > > > > > > > > > > > > > > > > > > > appreciated to have the >> >> > > discussion >> >> > > > in >> >> > > > > > > this >> >> > > > > > > > > > > mailing >> >> > > > > > > > > > > > > list >> >> > > > > > > > > > > > > > > > > > thread.) >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your >> >> > > feedbacks. 
>> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> >> https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong Song >> < >> >> > > > > > > > > > [hidden email]> >> >> > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thanks for sharing your opinion Till. >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was >> >> > wondering >> >> > > > > > whether >> >> > > > > > > > we >> >> > > > > > > > > > can >> >> > > > > > > > > > > > > avoid >> >> > > > > > > > > > > > > > > using Unsafe.allocate() for off-heap >> managed >> >> > memory >> >> > > > and >> >> > > > > > > > network >> >> > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > with >> >> > > > > > > > > > > > > > > alternative 3. 
> On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:
>
> Thanks for sharing your opinion Till.
>
> I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.
>
> Hi Yang,
>
> Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which are exactly what alternative 2 suggests to set the JVM max direct memory to.
>
> Thank you~
> Xintong Song
>
> On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:
>
> Thanks for the clarification Xintong. I understand the two alternatives now.
>
> I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.
>
> Cheers,
> Till
>
> On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:
>
> Let me explain this with a concrete example Till.
>
> Let's say we have the following scenario.
>
> Total Process Memory: 1GB
> JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
> Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB
>
> For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
> For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then
>
> - Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
> - For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.
>
> Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
>
> Thank you~
> Xintong Song
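To make the arithmetic in the quoted example concrete, here is a minimal, self-contained Java sketch. It is not part of the FLIP; the class, the variable names, and the 1 TB placeholder for "a very large value" are invented purely for illustration of how the -XX:MaxDirectMemorySize flag would differ between the two alternatives under a 1 GB process budget.

import java.util.Arrays;

public class MaxDirectMemoryExample {

    private static final long MB = 1024 * 1024;

    public static void main(String[] args) {
        // Figures taken from the example above.
        long totalProcessMemory = 1024 * MB;                   // 1 GB total budget for the container
        long directBudget = 200 * MB;                          // Task Off-Heap Memory + JVM Overhead
        long otherMemory = totalProcessMemory - directBudget;  // 800 MB: heap, metaspace, managed, network

        // Alternative 2: cap direct allocations at exactly the budgeted 200 MB.
        String alternative2 = "-XX:MaxDirectMemorySize=" + (directBudget / MB) + "m";

        // Alternative 3: use a deliberately huge cap (1 TB here), i.e. effectively unlimited.
        String alternative3 = "-XX:MaxDirectMemorySize=1024g";

        for (String flag : Arrays.asList(alternative2, alternative3)) {
            System.out.println(flag);
        }

        // Under either alternative the remaining 800 MB budget is unaffected by this flag;
        // managed and network memory allocated natively (e.g. via Unsafe) would not be
        // counted against MaxDirectMemorySize either.
        System.out.println("Other memory pools: " + (otherMemory / MB) + " MB");
    }
}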
> On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:
>
> I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization Xintong.
>
> - Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
> - Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.
>
> How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.
>
> Cheers,
> Till
>
> On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:
>
> Hi xintong, till
>
> > Native and Direct Memory
>
> My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.
>
> > Memory Calculation
>
> I agree with xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and failing in the Flink master.
>
> Best,
> Yang
>
> Xintong Song <[hidden email]> 于2019年8月13日周二 22:07写道:
>
> Thanks for replying, Till.
>
> About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.
>
> About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of the risk of over-using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to config. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not config the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.
>
> Thank you~
> Xintong Song
>
> On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:
>
> Thanks for proposing this FLIP Xintong.
>
> All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
>
> Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM.
>
> @Xintong, how would alternative 2 lead to lower memory utilization than alternative 3 where we set the direct memory to a higher value?
>
> Cheers,
> Till
>
> On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:
>
> Thanks for the feedback, Yang.
>
> Regarding your comments:
>
> *Native and Direct Memory*
> I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some down sides of doing this.
>
> - One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
> - Another down side is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
>
> Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.
>
> *Memory Calculation*
> If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error.
> I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing.
> But we cannot rely only on the client side checking, because for standalone clusters TaskManagers on different machines may have different configurations and the client does not see them.
> What do you think?
>
> Thank you~
> Xintong Song
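As an illustration of the client-side check discussed in the quoted "Memory Calculation" point, the following is a hypothetical Java sketch. The helper, the pool names, and the error wording are invented for this example and are not taken from the FLIP; it only shows the shape of a validation that fails fast when explicitly configured pools cannot fit into the total process memory.

import java.util.Map;

public final class MemoryConfigSanityCheck {

    /**
     * Fails fast when the explicitly configured fine-grained pools cannot fit into
     * the configured total process memory.
     */
    static void check(long totalProcessMemoryBytes, Map<String, Long> explicitPools) {
        long sum = explicitPools.values().stream().mapToLong(Long::longValue).sum();
        if (sum > totalProcessMemoryBytes) {
            throw new IllegalArgumentException(
                    "Sum of explicitly configured memory pools (" + sum + " bytes) exceeds the "
                            + "total process memory (" + totalProcessMemoryBytes + " bytes): " + explicitPools);
        }
    }

    public static void main(String[] args) {
        try {
            // 1 GiB total, but network (400 MiB) + managed (800 MiB) ask for more -> error.
            check(1L << 30, Map.of("network", 400L << 20, "managed", 800L << 20));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}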
I also agree that all the configuration should be calculated outside of the TaskManager, so that a full configuration is generated before the TaskManager is started. Overriding the calculated configurations through -D now seems better.

Best,
Yang

Xintong Song <[hidden email]> 于2019年9月2日周一 上午11:39写道:

> I just updated the FLIP wiki page [1], with the following changes:
>
> - Network memory uses JVM direct memory, and is accounted for when setting the JVM max direct memory size parameter.
> - Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
> - Remove 'supporting memory reservation' from the scope of this FLIP.
>
> @till @stephan, please take another look and see if there are any other concerns.
>
> Thank you~
> Xintong Song
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
>
> On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <[hidden email]> wrote:
>
> > Sorry for the late response.
> >
> > - Regarding the `TaskExecutorSpecifics` naming, let's discuss the details in the PR.
> > - Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed to have a general framework for overwriting configurations with ENV variables.
> > - Regarding memory reservation, I double checked with Yu and he will take care of it.
> >
> > Thank you~
> > Xintong Song
> >
> > On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:
> >
> >> What I forgot to add is that we could tackle specifying the configuration fully in an incremental way and that the full specification should be the desired end state.
> >>
> >> On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:
> >>
> >> > I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step to rather validate existing values and calculate missing ones, these two proposals shouldn't even conflict (given determinism).
> >> >
> >> > Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.
> >> >
> >> > One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of an env variable vs. a dynamic configuration value specified via -D?
> >> >
> >> > Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.
> >> >
> >> > Cheers,
> >> > Till
> >> >
> >> > On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:
> >> >
> >> >> I see. Under the assumption of strict determinism that should work.
> >> >>
> >> >> The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.
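As a rough Java sketch of the "calculate once, then start the process with a fully specified configuration" approach agreed on above: all figures, config keys, and the way the -D properties are attached to the command line are placeholders for illustration; the FLIP itself defines the actual option names, and whether such properties end up as JVM system properties or as arguments parsed by the entrypoint is deliberately left open here.

import java.util.ArrayList;
import java.util.List;

public class TaskExecutorLaunchSketch {

    public static void main(String[] args) {
        // Values computed once by the startup-side utility (figures are arbitrary).
        long heapMb = 512;
        long directMb = 200;
        long metaspaceMb = 96;
        long networkMb = 128;
        long managedMb = 256;

        List<String> command = new ArrayList<>();
        command.add("java");
        // Everything that must be known at process start goes in as a JVM flag ...
        command.add("-Xmx" + heapMb + "m");
        command.add("-XX:MaxDirectMemorySize=" + directMb + "m");
        command.add("-XX:MaxMetaspaceSize=" + metaspaceMb + "m");
        // ... while the remaining calculated sizes are handed over as -D properties,
        // so the pre-computed configuration never has to be written back to flink-conf.yaml.
        command.add("-Dtaskmanager.memory.network.size=" + networkMb + "m");
        command.add("-Dtaskmanager.memory.managed.size=" + managedMb + "m");
        command.add("org.apache.flink.runtime.taskexecutor.TaskManagerRunner");

        System.out.println(String.join(" ", command));
    }
}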
> >> >> On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:
> >> >>
> >> >> > My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.
> >> >> >
> >> >> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:
> >> >> >
> >> >> > > When computing the values in the JVM process after it started, how would you deal with values like Max Direct Memory, Metaspace size, native memory reservation (reduced heap size), etc? All the values that are parameters to the JVM process and that need to be supplied at process startup?
> >> >> > >
> >> >> > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:
> >> >> > >
> >> >> > > > Thanks for the clarification. I have some more comments:
> >> >> > > >
> >> >> > > > - I would actually split the logic to compute the process memory requirements and the storing of the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.
> >> >> > > >
> >> >> > > > - Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.
> >> >> > > >
> >> >> > > > The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter would not be possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.
> >> >> > > >
> >> >> > > > - Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster.
> >> >> > > > Hence, I would propose to address the memory reservation > >> >> functionality > >> >> > > as a > >> >> > > > follow up FLIP (which Yu is working on if I'm not mistaken). > >> >> > > > > >> >> > > > Cheers, > >> >> > > > Till > >> >> > > > > >> >> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang < > >> [hidden email]> > >> >> > > wrote: > >> >> > > > > >> >> > > > > Just add my 2 cents. > >> >> > > > > > >> >> > > > > Using environment variables to override the configuration for > >> >> > different > >> >> > > > > taskmanagers is better. > >> >> > > > > We do not need to generate dedicated flink-conf.yaml for all > >> >> > > > taskmanagers. > >> >> > > > > A common flink-conf.yam and different environment variables > are > >> >> > enough. > >> >> > > > > By reducing the distributed cached files, it could make > >> launching > >> >> a > >> >> > > > > taskmanager faster. > >> >> > > > > > >> >> > > > > Stephan gives a good suggestion that we could move the logic > >> into > >> >> > > > > "GlobalConfiguration.loadConfig()" method. > >> >> > > > > Maybe the client could also benefit from this. Different > users > >> do > >> >> not > >> >> > > > have > >> >> > > > > to export FLINK_CONF_DIR to update few config options. > >> >> > > > > > >> >> > > > > > >> >> > > > > Best, > >> >> > > > > Yang > >> >> > > > > > >> >> > > > > Stephan Ewen <[hidden email]> 于2019年8月28日周三 上午1:21写道: > >> >> > > > > > >> >> > > > > > One note on the Environment Variables and Configuration > >> >> discussion. > >> >> > > > > > > >> >> > > > > > My understanding is that passed ENV variables are added to > >> the > >> >> > > > > > configuration in the "GlobalConfiguration.loadConfig()" > >> method > >> >> (or > >> >> > > > > > similar). > >> >> > > > > > For all the code inside Flink, it looks like the data was > in > >> the > >> >> > > config > >> >> > > > > to > >> >> > > > > > start with, just that the scripts that compute the > variables > >> can > >> >> > pass > >> >> > > > the > >> >> > > > > > values to the process without actually needing to write a > >> file. > >> >> > > > > > > >> >> > > > > > For example the "GlobalConfiguration.loadConfig()" method > >> would > >> >> > take > >> >> > > > any > >> >> > > > > > ENV variable prefixed with "flink" and add it as a config > >> key. > >> >> > > > > > "flink_taskmanager_memory_size=2g" would become > >> >> > > > "taskmanager.memory.size: > >> >> > > > > > 2g". > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song < > >> >> > [hidden email]> > >> >> > > > > > wrote: > >> >> > > > > > > >> >> > > > > > > Thanks for the comments, Till. > >> >> > > > > > > > >> >> > > > > > > I've also seen your comments on the wiki page, but let's > >> keep > >> >> the > >> >> > > > > > > discussion here. > >> >> > > > > > > > >> >> > > > > > > - Regarding 'TaskExecutorSpecifics', how do you think > about > >> >> > naming > >> >> > > it > >> >> > > > > > > 'TaskExecutorResourceSpecifics'. > >> >> > > > > > > - Regarding passing memory configurations into task > >> executors, > >> >> > I'm > >> >> > > in > >> >> > > > > > favor > >> >> > > > > > > of do it via environment variables rather than > >> configurations, > >> >> > with > >> >> > > > the > >> >> > > > > > > following two reasons. > >> >> > > > > > > - It is easier to keep the memory options once > calculate > >> >> not to > >> >> > > be > >> >> > > > > > > changed with environment variables rather than > >> configurations. 
> >> >> > > > > > > - I'm not sure whether we should write the > configuration > >> in > >> >> > > startup > >> >> > > > > > > scripts. Writing changes into the configuration files > when > >> >> > running > >> >> > > > the > >> >> > > > > > > startup scripts does not sounds right to me. Or we could > >> make > >> >> a > >> >> > > copy > >> >> > > > of > >> >> > > > > > > configuration files per flink cluster, and make the task > >> >> executor > >> >> > > to > >> >> > > > > load > >> >> > > > > > > from the copy, and clean up the copy after the cluster is > >> >> > shutdown, > >> >> > > > > which > >> >> > > > > > > is complicated. (I think this is also what Stephan means > in > >> >> his > >> >> > > > comment > >> >> > > > > > on > >> >> > > > > > > the wiki page?) > >> >> > > > > > > - Regarding reserving memory, I think this change should > be > >> >> > > included > >> >> > > > in > >> >> > > > > > > this FLIP. I think a big part of motivations of this FLIP > >> is > >> >> to > >> >> > > unify > >> >> > > > > > > memory configuration for streaming / batch and make it > easy > >> >> for > >> >> > > > > > configuring > >> >> > > > > > > rocksdb memory. If we don't support memory reservation, > >> then > >> >> > > > streaming > >> >> > > > > > jobs > >> >> > > > > > > cannot use managed memory (neither on-heap or off-heap), > >> which > >> >> > > makes > >> >> > > > > this > >> >> > > > > > > FLIP incomplete. > >> >> > > > > > > - Regarding network memory, I think you are right. I > think > >> we > >> >> > > > probably > >> >> > > > > > > don't need to change network stack from using direct > >> memory to > >> >> > > using > >> >> > > > > > unsafe > >> >> > > > > > > native memory. Network memory size is deterministic, > >> cannot be > >> >> > > > reserved > >> >> > > > > > as > >> >> > > > > > > managed memory does, and cannot be overused. I think it > >> also > >> >> > works > >> >> > > if > >> >> > > > > we > >> >> > > > > > > simply keep using direct memory for network and include > it > >> in > >> >> jvm > >> >> > > max > >> >> > > > > > > direct memory size. > >> >> > > > > > > > >> >> > > > > > > Thank you~ > >> >> > > > > > > > >> >> > > > > > > Xintong Song > >> >> > > > > > > > >> >> > > > > > > > >> >> > > > > > > > >> >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann < > >> >> > > [hidden email]> > >> >> > > > > > > wrote: > >> >> > > > > > > > >> >> > > > > > > > Hi Xintong, > >> >> > > > > > > > > >> >> > > > > > > > thanks for addressing the comments and adding a more > >> >> detailed > >> >> > > > > > > > implementation plan. I have a couple of comments > >> concerning > >> >> the > >> >> > > > > > > > implementation plan: > >> >> > > > > > > > > >> >> > > > > > > > - The name `TaskExecutorSpecifics` is not really > >> >> descriptive. > >> >> > > > > Choosing > >> >> > > > > > a > >> >> > > > > > > > different name could help here. > >> >> > > > > > > > - I'm not sure whether I would pass the memory > >> >> configuration to > >> >> > > the > >> >> > > > > > > > TaskExecutor via environment variables. I think it > would > >> be > >> >> > > better > >> >> > > > to > >> >> > > > > > > write > >> >> > > > > > > > it into the configuration one uses to start the TM > >> process. > >> >> > > > > > > > - If possible, I would exclude the memory reservation > >> from > >> >> this > >> >> > > > FLIP > >> >> > > > > > and > >> >> > > > > > > > add this as part of a dedicated FLIP. 
> >> >> > > > > > > > - If possible, then I would exclude changes to the > >> network > >> >> > stack > >> >> > > > from > >> >> > > > > > > this > >> >> > > > > > > > FLIP. Maybe we can simply say that the direct memory > >> needed > >> >> by > >> >> > > the > >> >> > > > > > > network > >> >> > > > > > > > stack is the framework direct memory requirement. > >> Changing > >> >> how > >> >> > > the > >> >> > > > > > memory > >> >> > > > > > > > is allocated can happen in a second step. This would > keep > >> >> the > >> >> > > scope > >> >> > > > > of > >> >> > > > > > > this > >> >> > > > > > > > FLIP smaller. > >> >> > > > > > > > > >> >> > > > > > > > Cheers, > >> >> > > > > > > > Till > >> >> > > > > > > > > >> >> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < > >> >> > > > [hidden email]> > >> >> > > > > > > > wrote: > >> >> > > > > > > > > >> >> > > > > > > > > Hi everyone, > >> >> > > > > > > > > > >> >> > > > > > > > > I just updated the FLIP document on wiki [1], with > the > >> >> > > following > >> >> > > > > > > changes. > >> >> > > > > > > > > > >> >> > > > > > > > > - Removed open question regarding MemorySegment > >> >> > allocation. > >> >> > > As > >> >> > > > > > > > > discussed, we exclude this topic from the scope of > >> this > >> >> > > FLIP. > >> >> > > > > > > > > - Updated content about JVM direct memory > parameter > >> >> > > according > >> >> > > > to > >> >> > > > > > > > recent > >> >> > > > > > > > > discussions, and moved the other options to > >> "Rejected > >> >> > > > > > Alternatives" > >> >> > > > > > > > for > >> >> > > > > > > > > the > >> >> > > > > > > > > moment. > >> >> > > > > > > > > - Added implementation steps. > >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > Thank you~ > >> >> > > > > > > > > > >> >> > > > > > > > > Xintong Song > >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > [1] > >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > >> >> > > > > > > > > > >> >> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen < > >> >> > [hidden email] > >> >> > > > > >> >> > > > > > wrote: > >> >> > > > > > > > > > >> >> > > > > > > > > > @Xintong: Concerning "wait for memory users before > >> task > >> >> > > dispose > >> >> > > > > and > >> >> > > > > > > > > memory > >> >> > > > > > > > > > release": I agree, that's how it should be. Let's > >> try it > >> >> > out. > >> >> > > > > > > > > > > >> >> > > > > > > > > > @Xintong @Jingsong: Concerning " JVM does not wait > >> for > >> >> GC > >> >> > > when > >> >> > > > > > > > allocating > >> >> > > > > > > > > > direct memory buffer": There seems to be pretty > >> >> elaborate > >> >> > > logic > >> >> > > > > to > >> >> > > > > > > free > >> >> > > > > > > > > > buffers when allocating new ones. See > >> >> > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > >> >> > > > > > > > > > > >> >> > > > > > > > > > @Till: Maybe. 
If we assume that the JVM default > works > >> >> (like > >> >> > > > going > >> >> > > > > > > with > >> >> > > > > > > > > > option 2 and not setting "-XX:MaxDirectMemorySize" > at > >> >> all), > >> >> > > > then > >> >> > > > > I > >> >> > > > > > > > think > >> >> > > > > > > > > it > >> >> > > > > > > > > > should be okay to set "-XX:MaxDirectMemorySize" to > >> >> > > > > > > > > > "off_heap_managed_memory + direct_memory" even if > we > >> use > >> >> > > > RocksDB. > >> >> > > > > > > That > >> >> > > > > > > > > is a > >> >> > > > > > > > > > big if, though, I honestly have no idea :D Would be > >> >> good to > >> >> > > > > > > understand > >> >> > > > > > > > > > this, though, because this would affect option (2) > >> and > >> >> > option > >> >> > > > > > (1.2). > >> >> > > > > > > > > > > >> >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song < > >> >> > > > > > [hidden email]> > >> >> > > > > > > > > > wrote: > >> >> > > > > > > > > > > >> >> > > > > > > > > > > Thanks for the inputs, Jingsong. > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Let me try to summarize your points. Please > correct > >> >> me if > >> >> > > I'm > >> >> > > > > > > wrong. > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > - Memory consumers should always avoid > returning > >> >> > memory > >> >> > > > > > segments > >> >> > > > > > > > to > >> >> > > > > > > > > > > memory manager while there are still > un-cleaned > >> >> > > > structures / > >> >> > > > > > > > threads > >> >> > > > > > > > > > > that > >> >> > > > > > > > > > > may use the memory. Otherwise, it would cause > >> >> serious > >> >> > > > > problems > >> >> > > > > > > by > >> >> > > > > > > > > > having > >> >> > > > > > > > > > > multiple consumers trying to use the same > memory > >> >> > > segment. > >> >> > > > > > > > > > > - JVM does not wait for GC when allocating > >> direct > >> >> > memory > >> >> > > > > > buffer. > >> >> > > > > > > > > > > Therefore even we set proper max direct memory > >> size > >> >> > > limit, > >> >> > > > > we > >> >> > > > > > > may > >> >> > > > > > > > > > still > >> >> > > > > > > > > > > encounter direct memory oom if the GC cleaning > >> >> memory > >> >> > > > slower > >> >> > > > > > > than > >> >> > > > > > > > > the > >> >> > > > > > > > > > > direct memory allocation. > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Am I understanding this correctly? > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee < > >> >> > > > > > > [hidden email] > >> >> > > > > > > > > > > .invalid> > >> >> > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > Hi stephan: > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > About option 2: > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > if additional threads not cleanly shut down > >> before > >> >> we > >> >> > can > >> >> > > > > exit > >> >> > > > > > > the > >> >> > > > > > > > > > task: > >> >> > > > > > > > > > > > In the current case of memory reuse, it has > >> freed up > >> >> > the > >> >> > > > > memory > >> >> > > > > > > it > >> >> > > > > > > > > > > > uses. 
If this memory is used by other tasks > and > >> >> > > > asynchronous > >> >> > > > > > > > threads > >> >> > > > > > > > > > > > of exited task may still be writing, there > will > >> be > >> >> > > > > concurrent > >> >> > > > > > > > > security > >> >> > > > > > > > > > > > problems, and even lead to errors in user > >> computing > >> >> > > > results. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > So I think this is a serious and intolerable > >> bug, No > >> >> > > matter > >> >> > > > > > what > >> >> > > > > > > > the > >> >> > > > > > > > > > > > option is, it should be avoided. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > About direct memory cleaned by GC: > >> >> > > > > > > > > > > > I don't think it is a good idea, I've > >> encountered so > >> >> > many > >> >> > > > > > > > situations > >> >> > > > > > > > > > > > that it's too late for GC to cause > DirectMemory > >> >> OOM. > >> >> > > > Release > >> >> > > > > > and > >> >> > > > > > > > > > > > allocate DirectMemory depend on the type of > user > >> >> job, > >> >> > > > which > >> >> > > > > is > >> >> > > > > > > > > > > > often beyond our control. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > Best, > >> >> > > > > > > > > > > > Jingsong Lee > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > >> >> > ------------------------------------------------------------------ > >> >> > > > > > > > > > > > From:Stephan Ewen <[hidden email]> > >> >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 > >> >> > > > > > > > > > > > To:dev <[hidden email]> > >> >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified Memory > >> >> > > Configuration > >> >> > > > > for > >> >> > > > > > > > > > > > TaskExecutors > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > My main concern with option 2 (manually release > >> >> memory) > >> >> > > is > >> >> > > > > that > >> >> > > > > > > > > > segfaults > >> >> > > > > > > > > > > > in the JVM send off all sorts of alarms on user > >> >> ends. > >> >> > So > >> >> > > we > >> >> > > > > > need > >> >> > > > > > > to > >> >> > > > > > > > > > > > guarantee that this never happens. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > The trickyness is in tasks that uses data > >> >> structures / > >> >> > > > > > algorithms > >> >> > > > > > > > > with > >> >> > > > > > > > > > > > additional threads, like hash table spill/read > >> and > >> >> > > sorting > >> >> > > > > > > threads. > >> >> > > > > > > > > We > >> >> > > > > > > > > > > need > >> >> > > > > > > > > > > > to ensure that these cleanly shut down before > we > >> can > >> >> > exit > >> >> > > > the > >> >> > > > > > > task. > >> >> > > > > > > > > > > > I am not sure that we have that guaranteed > >> already, > >> >> > > that's > >> >> > > > > why > >> >> > > > > > > > option > >> >> > > > > > > > > > 1.1 > >> >> > > > > > > > > > > > seemed simpler to me. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM Xintong Song < > >> >> > > > > > > > [hidden email]> > >> >> > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > Thanks for the comments, Stephan. Summarized > in > >> >> this > >> >> > > way > >> >> > > > > > really > >> >> > > > > > > > > makes > >> >> > > > > > > > > > > > > things easier to understand. 
> >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > I'm in favor of option 2, at least for the > >> >> moment. I > >> >> > > > think > >> >> > > > > it > >> >> > > > > > > is > >> >> > > > > > > > > not > >> >> > > > > > > > > > > that > >> >> > > > > > > > > > > > > difficult to keep it segfault safe for memory > >> >> > manager, > >> >> > > as > >> >> > > > > > long > >> >> > > > > > > as > >> >> > > > > > > > > we > >> >> > > > > > > > > > > > always > >> >> > > > > > > > > > > > > de-allocate the memory segment when it is > >> released > >> >> > from > >> >> > > > the > >> >> > > > > > > > memory > >> >> > > > > > > > > > > > > consumers. Only if the memory consumer > continue > >> >> using > >> >> > > the > >> >> > > > > > > buffer > >> >> > > > > > > > of > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > segment after releasing it, in which case we > do > >> >> want > >> >> > > the > >> >> > > > > job > >> >> > > > > > to > >> >> > > > > > > > > fail > >> >> > > > > > > > > > so > >> >> > > > > > > > > > > > we > >> >> > > > > > > > > > > > > detect the memory leak early. > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > For option 1.2, I don't think this is a good > >> idea. > >> >> > Not > >> >> > > > only > >> >> > > > > > > > because > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > assumption (regular GC is enough to clean > >> direct > >> >> > > buffers) > >> >> > > > > may > >> >> > > > > > > not > >> >> > > > > > > > > > > always > >> >> > > > > > > > > > > > be > >> >> > > > > > > > > > > > > true, but also it makes harder for finding > >> >> problems > >> >> > in > >> >> > > > > cases > >> >> > > > > > of > >> >> > > > > > > > > > memory > >> >> > > > > > > > > > > > > overuse. E.g., user configured some direct > >> memory > >> >> for > >> >> > > the > >> >> > > > > > user > >> >> > > > > > > > > > > libraries. > >> >> > > > > > > > > > > > > If the library actually use more direct > memory > >> >> then > >> >> > > > > > configured, > >> >> > > > > > > > > which > >> >> > > > > > > > > > > > > cannot be cleaned by GC because they are > still > >> in > >> >> > use, > >> >> > > > may > >> >> > > > > > lead > >> >> > > > > > > > to > >> >> > > > > > > > > > > > overuse > >> >> > > > > > > > > > > > > of the total container memory. In that case, > >> if it > >> >> > > didn't > >> >> > > > > > touch > >> >> > > > > > > > the > >> >> > > > > > > > > > JVM > >> >> > > > > > > > > > > > > default max direct memory limit, we cannot > get > >> a > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > OOM > >> >> > > > > > > > > > and > >> >> > > > > > > > > > > it > >> >> > > > > > > > > > > > > will become super hard to understand which > >> part of > >> >> > the > >> >> > > > > > > > > configuration > >> >> > > > > > > > > > > need > >> >> > > > > > > > > > > > > to be updated. > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > For option 1.1, it has the similar problem as > >> >> 1.2, if > >> >> > > the > >> >> > > > > > > > exceeded > >> >> > > > > > > > > > > direct > >> >> > > > > > > > > > > > > memory does not reach the max direct memory > >> limit > >> >> > > > specified > >> >> > > > > > by > >> >> > > > > > > > the > >> >> > > > > > > > > > > > > dedicated parameter. I think it is slightly > >> better > >> >> > than > >> >> > > > > 1.2, > >> >> > > > > > > only > >> >> > > > > > > > > > > because > >> >> > > > > > > > > > > > > we can tune the parameter. 
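
As a rough illustration of what "option 2 (manually release memory)" means in code, here is a hypothetical, simplified segment that allocates native memory with sun.misc.Unsafe and frees it explicitly on release. It is not Flink's MemorySegment; the fail-fast check only turns use-after-release into a Java exception for callers that observe the cleared address, and the spill/sort helper threads Stephan mentions are exactly the case such a check cannot fully protect against.

import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical sketch of "option 2": explicitly allocated and freed native
// memory, not managed by GC and not counted against -XX:MaxDirectMemorySize.
public final class ManualOffHeapSegment {

    private static final Unsafe UNSAFE = loadUnsafe();

    private final long size;
    private long address; // becomes 0 once released

    ManualOffHeapSegment(long size) {
        this.size = size;
        this.address = UNSAFE.allocateMemory(size);
    }

    void putLong(long offset, long value) {
        if (address == 0L) {
            // Surface use-after-release as an exception instead of a segfault.
            throw new IllegalStateException("segment already released");
        }
        if (offset < 0 || offset + 8 > size) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        UNSAFE.putLong(address + offset, value);
    }

    // Must only be called once all consumers, including any spill/sort
    // helper threads, have finished using the segment.
    void release() {
        if (address != 0L) {
            UNSAFE.freeMemory(address);
            address = 0L;
        }
    }

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new Error("cannot access sun.misc.Unsafe", e);
        }
    }
}
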
> >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen > < > >> >> > > > > > [hidden email] > >> >> > > > > > > > > >> >> > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > About the "-XX:MaxDirectMemorySize" > >> discussion, > >> >> > maybe > >> >> > > > let > >> >> > > > > > me > >> >> > > > > > > > > > > summarize > >> >> > > > > > > > > > > > > it a > >> >> > > > > > > > > > > > > > bit differently: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > We have the following two options: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > (1) We let MemorySegments be de-allocated > by > >> the > >> >> > GC. > >> >> > > > That > >> >> > > > > > > makes > >> >> > > > > > > > > it > >> >> > > > > > > > > > > > > segfault > >> >> > > > > > > > > > > > > > safe. But then we need a way to trigger GC > in > >> >> case > >> >> > > > > > > > de-allocation > >> >> > > > > > > > > > and > >> >> > > > > > > > > > > > > > re-allocation of a bunch of segments > happens > >> >> > quickly, > >> >> > > > > which > >> >> > > > > > > is > >> >> > > > > > > > > > often > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > case during batch scheduling or task > restart. > >> >> > > > > > > > > > > > > > - The "-XX:MaxDirectMemorySize" (option > >> 1.1) > >> >> is > >> >> > one > >> >> > > > way > >> >> > > > > > to > >> >> > > > > > > do > >> >> > > > > > > > > > this > >> >> > > > > > > > > > > > > > - Another way could be to have a > dedicated > >> >> > > > bookkeeping > >> >> > > > > in > >> >> > > > > > > the > >> >> > > > > > > > > > > > > > MemoryManager (option 1.2), so that this > is a > >> >> > number > >> >> > > > > > > > independent > >> >> > > > > > > > > of > >> >> > > > > > > > > > > the > >> >> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" parameter. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > (2) We manually allocate and de-allocate > the > >> >> memory > >> >> > > for > >> >> > > > > the > >> >> > > > > > > > > > > > > MemorySegments > >> >> > > > > > > > > > > > > > (option 2). That way we need not worry > about > >> >> > > triggering > >> >> > > > > GC > >> >> > > > > > by > >> >> > > > > > > > > some > >> >> > > > > > > > > > > > > > threshold or bookkeeping, but it is harder > to > >> >> > prevent > >> >> > > > > > > > segfaults. > >> >> > > > > > > > > We > >> >> > > > > > > > > > > > need > >> >> > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > be very careful about when we release the > >> memory > >> >> > > > segments > >> >> > > > > > > (only > >> >> > > > > > > > > in > >> >> > > > > > > > > > > the > >> >> > > > > > > > > > > > > > cleanup phase of the main thread). > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > If we go with option 1.1, we probably need > to > >> >> set > >> >> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" to > >> >> > > "off_heap_managed_memory + > >> >> > > > > > > > > > > direct_memory" > >> >> > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > have "direct_memory" as a separate reserved > >> >> memory > >> >> > > > pool. 
> >> >> > > > > > > > Because > >> >> > > > > > > > > if > >> >> > > > > > > > > > > we > >> >> > > > > > > > > > > > > just > >> >> > > > > > > > > > > > > > set "-XX:MaxDirectMemorySize" to > >> >> > > > > "off_heap_managed_memory + > >> >> > > > > > > > > > > > > jvm_overhead", > >> >> > > > > > > > > > > > > > then there will be times when that entire > >> >> memory is > >> >> > > > > > allocated > >> >> > > > > > > > by > >> >> > > > > > > > > > > direct > >> >> > > > > > > > > > > > > > buffers and we have nothing left for the > JVM > >> >> > > overhead. > >> >> > > > So > >> >> > > > > > we > >> >> > > > > > > > > either > >> >> > > > > > > > > > > > need > >> >> > > > > > > > > > > > > a > >> >> > > > > > > > > > > > > > way to compensate for that (again some > safety > >> >> > margin > >> >> > > > > cutoff > >> >> > > > > > > > > value) > >> >> > > > > > > > > > or > >> >> > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > will exceed container memory. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > If we go with option 1.2, we need to be > aware > >> >> that > >> >> > it > >> >> > > > > takes > >> >> > > > > > > > > > elaborate > >> >> > > > > > > > > > > > > logic > >> >> > > > > > > > > > > > > > to push recycling of direct buffers without > >> >> always > >> >> > > > > > > triggering a > >> >> > > > > > > > > > full > >> >> > > > > > > > > > > > GC. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > My first guess is that the options will be > >> >> easiest > >> >> > to > >> >> > > > do > >> >> > > > > in > >> >> > > > > > > the > >> >> > > > > > > > > > > > following > >> >> > > > > > > > > > > > > > order: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 1.1 with a dedicated > direct_memory > >> >> > > > parameter, > >> >> > > > > as > >> >> > > > > > > > > > discussed > >> >> > > > > > > > > > > > > > above. We would need to find a way to set > the > >> >> > > > > direct_memory > >> >> > > > > > > > > > parameter > >> >> > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > default. We could start with 64 MB and see > >> how > >> >> it > >> >> > > goes > >> >> > > > in > >> >> > > > > > > > > practice. > >> >> > > > > > > > > > > One > >> >> > > > > > > > > > > > > > danger I see is that setting this loo low > can > >> >> > cause a > >> >> > > > > bunch > >> >> > > > > > > of > >> >> > > > > > > > > > > > additional > >> >> > > > > > > > > > > > > > GCs compared to before (we need to watch > this > >> >> > > > carefully). > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 2. It is actually quite simple > to > >> >> > > implement, > >> >> > > > > we > >> >> > > > > > > > could > >> >> > > > > > > > > > try > >> >> > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > segfault safe we are at the moment. 
> >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 1.2: We would not touch the > >> >> > > > > > > > "-XX:MaxDirectMemorySize" > >> >> > > > > > > > > > > > > parameter > >> >> > > > > > > > > > > > > > at all and assume that all the direct > memory > >> >> > > > allocations > >> >> > > > > > that > >> >> > > > > > > > the > >> >> > > > > > > > > > JVM > >> >> > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > Netty do are infrequent enough to be > cleaned > >> up > >> >> > fast > >> >> > > > > enough > >> >> > > > > > > > > through > >> >> > > > > > > > > > > > > regular > >> >> > > > > > > > > > > > > > GC. I am not sure if that is a valid > >> assumption, > >> >> > > > though. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > Best, > >> >> > > > > > > > > > > > > > Stephan > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong > Song > >> < > >> >> > > > > > > > > > [hidden email]> > >> >> > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thanks for sharing your opinion Till. > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was > >> >> > wondering > >> >> > > > > > whether > >> >> > > > > > > > we > >> >> > > > > > > > > > can > >> >> > > > > > > > > > > > > avoid > >> >> > > > > > > > > > > > > > > using Unsafe.allocate() for off-heap > >> managed > >> >> > memory > >> >> > > > and > >> >> > > > > > > > network > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > with > >> >> > > > > > > > > > > > > > > alternative 3. But after giving it a > second > >> >> > > thought, > >> >> > > > I > >> >> > > > > > > think > >> >> > > > > > > > > even > >> >> > > > > > > > > > > for > >> >> > > > > > > > > > > > > > > alternative 3 using direct memory for > >> off-heap > >> >> > > > managed > >> >> > > > > > > memory > >> >> > > > > > > > > > could > >> >> > > > > > > > > > > > > cause > >> >> > > > > > > > > > > > > > > problems. > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Hi Yang, > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Regarding your concern, I think what > >> proposed > >> >> in > >> >> > > this > >> >> > > > > > FLIP > >> >> > > > > > > it > >> >> > > > > > > > > to > >> >> > > > > > > > > > > have > >> >> > > > > > > > > > > > > > both > >> >> > > > > > > > > > > > > > > off-heap managed memory and network > memory > >> >> > > allocated > >> >> > > > > > > through > >> >> > > > > > > > > > > > > > > Unsafe.allocate(), which means they are > >> >> > practically > >> >> > > > > > native > >> >> > > > > > > > > memory > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > limited by JVM max direct memory. The > only > >> >> parts > >> >> > of > >> >> > > > > > memory > >> >> > > > > > > > > > limited > >> >> > > > > > > > > > > by > >> >> > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > max direct memory are task off-heap > memory > >> and > >> >> > JVM > >> >> > > > > > > overhead, > >> >> > > > > > > > > > which > >> >> > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > exactly alternative 2 suggests to set the > >> JVM > >> >> max > >> >> > > > > direct > >> >> > > > > > > > memory > >> >> > > > > > > > > > to. 
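
For illustration, a small sketch of the arithmetic described above for alternative 2: -XX:MaxDirectMemorySize covers only task off-heap memory plus JVM overhead, while off-heap managed memory and network memory are Unsafe-allocated and therefore excluded from the limit. The concrete sizes below are made-up example values, not proposed defaults.

// Illustrative arithmetic only; sizes are example values.
public class TaskExecutorJvmArgsSketch {
    public static void main(String[] args) {
        final long mb = 1L << 20;

        long heap = 400 * mb;          // JVM heap
        long taskOffHeap = 150 * mb;   // user direct memory
        long jvmOverhead = 50 * mb;    // framework / library direct allocations
        long managed = 300 * mb;       // Unsafe-allocated, excluded from the limit
        long network = 100 * mb;       // Unsafe-allocated, excluded from the limit

        // Alternative 2: limit direct memory to exactly the parts that may use it.
        long maxDirect = taskOffHeap + jvmOverhead;

        System.out.printf("-Xms%d -Xmx%d -XX:MaxDirectMemorySize=%d%n",
                heap, heap, maxDirect);

        // Simplified total (metaspace etc. omitted for brevity).
        long totalProcess = heap + taskOffHeap + jvmOverhead + managed + network;
        System.out.println("total process memory budget: " + totalProcess + " bytes");
    }
}
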
> >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till > >> Rohrmann > >> >> < > >> >> > > > > > > > > > > [hidden email]> > >> >> > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I > >> >> > > understand > >> >> > > > > the > >> >> > > > > > > two > >> >> > > > > > > > > > > > > alternatives > >> >> > > > > > > > > > > > > > > > now. > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > I would be in favour of option 2 > because > >> it > >> >> > makes > >> >> > > > > > things > >> >> > > > > > > > > > > explicit. > >> >> > > > > > > > > > > > If > >> >> > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > don't limit the direct memory, I fear > >> that > >> >> we > >> >> > > might > >> >> > > > > end > >> >> > > > > > > up > >> >> > > > > > > > > in a > >> >> > > > > > > > > > > > > similar > >> >> > > > > > > > > > > > > > > > situation as we are currently in: The > >> user > >> >> > might > >> >> > > > see > >> >> > > > > > that > >> >> > > > > > > > her > >> >> > > > > > > > > > > > process > >> >> > > > > > > > > > > > > > > gets > >> >> > > > > > > > > > > > > > > > killed by the OS and does not know why > >> this > >> >> is > >> >> > > the > >> >> > > > > > case. > >> >> > > > > > > > > > > > > Consequently, > >> >> > > > > > > > > > > > > > > she > >> >> > > > > > > > > > > > > > > > tries to decrease the process memory > size > >> >> > > (similar > >> >> > > > to > >> >> > > > > > > > > > increasing > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > cutoff > >> >> > > > > > > > > > > > > > > > ratio) in order to accommodate for the > >> extra > >> >> > > direct > >> >> > > > > > > memory. > >> >> > > > > > > > > > Even > >> >> > > > > > > > > > > > > worse, > >> >> > > > > > > > > > > > > > > she > >> >> > > > > > > > > > > > > > > > tries to decrease memory budgets which > >> are > >> >> not > >> >> > > > fully > >> >> > > > > > used > >> >> > > > > > > > and > >> >> > > > > > > > > > > hence > >> >> > > > > > > > > > > > > > won't > >> >> > > > > > > > > > > > > > > > change the overall memory consumption. > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > Cheers, > >> >> > > > > > > > > > > > > > > > Till > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM > Xintong > >> >> Song < > >> >> > > > > > > > > > > > [hidden email] > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Let me explain this with a concrete > >> >> example > >> >> > > Till. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Let's say we have the following > >> scenario. 
> >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Total Process Memory: 1GB > >> >> > > > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap > >> Memory + > >> >> JVM > >> >> > > > > > > Overhead): > >> >> > > > > > > > > > 200MB > >> >> > > > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM > >> >> Metaspace, > >> >> > > > > > Off-Heap > >> >> > > > > > > > > > Managed > >> >> > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > Network Memory): 800MB > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > For alternative 2, we set > >> >> > > -XX:MaxDirectMemorySize > >> >> > > > > to > >> >> > > > > > > > 200MB. > >> >> > > > > > > > > > > > > > > > > For alternative 3, we set > >> >> > > -XX:MaxDirectMemorySize > >> >> > > > > to > >> >> > > > > > a > >> >> > > > > > > > very > >> >> > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > value, > >> >> > > > > > > > > > > > > > > > > let's say 1TB. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage of > >> Task > >> >> > > > Off-Heap > >> >> > > > > > > Memory > >> >> > > > > > > > > and > >> >> > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > Overhead > >> >> > > > > > > > > > > > > > > > > do not exceed 200MB, then > alternative 2 > >> >> and > >> >> > > > > > > alternative 3 > >> >> > > > > > > > > > > should > >> >> > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > same utility. Setting larger > >> >> > > > > -XX:MaxDirectMemorySize > >> >> > > > > > > will > >> >> > > > > > > > > not > >> >> > > > > > > > > > > > > reduce > >> >> > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > sizes of the other memory pools. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage of > >> Task > >> >> > > > Off-Heap > >> >> > > > > > > Memory > >> >> > > > > > > > > and > >> >> > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > Overhead potentially exceed 200MB, > then > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > - Alternative 2 suffers from > >> frequent > >> >> OOM. > >> >> > > To > >> >> > > > > > avoid > >> >> > > > > > > > > that, > >> >> > > > > > > > > > > the > >> >> > > > > > > > > > > > > only > >> >> > > > > > > > > > > > > > > > thing > >> >> > > > > > > > > > > > > > > > > user can do is to modify the > >> >> configuration > >> >> > > and > >> >> > > > > > > > increase > >> >> > > > > > > > > > JVM > >> >> > > > > > > > > > > > > Direct > >> >> > > > > > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > > > (Task Off-Heap Memory + JVM > >> Overhead). > >> >> > Let's > >> >> > > > say > >> >> > > > > > > that > >> >> > > > > > > > > user > >> >> > > > > > > > > > > > > > increases > >> >> > > > > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > Direct Memory to 250MB, this will > >> >> reduce > >> >> > the > >> >> > > > > total > >> >> > > > > > > > size > >> >> > > > > > > > > of > >> >> > > > > > > > > > > > other > >> >> > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > pools to 750MB, given the total > >> process > >> >> > > memory > >> >> > > > > > > remains > >> >> > > > > > > > > > 1GB. 
> >> >> > > > > > > > > > > > > > > > > - For alternative 3, there is no > >> >> chance of > >> >> > > > > direct > >> >> > > > > > > OOM. > >> >> > > > > > > > > > There > >> >> > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > chances > >> >> > > > > > > > > > > > > > > > > of exceeding the total process > >> memory > >> >> > limit, > >> >> > > > but > >> >> > > > > > > given > >> >> > > > > > > > > > that > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > process > >> >> > > > > > > > > > > > > > > > > may > >> >> > > > > > > > > > > > > > > > > not use up all the reserved native > >> >> memory > >> >> > > > > > (Off-Heap > >> >> > > > > > > > > > Managed > >> >> > > > > > > > > > > > > > Memory, > >> >> > > > > > > > > > > > > > > > > Network > >> >> > > > > > > > > > > > > > > > > Memory, JVM Metaspace), if the > >> actual > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > > usage > >> >> > > > > > > > > > is > >> >> > > > > > > > > > > > > > > slightly > >> >> > > > > > > > > > > > > > > > > above > >> >> > > > > > > > > > > > > > > > > yet very close to 200MB, user > >> probably > >> >> do > >> >> > > not > >> >> > > > > need > >> >> > > > > > > to > >> >> > > > > > > > > > change > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > configurations. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Therefore, I think from the user's > >> >> > > perspective, a > >> >> > > > > > > > feasible > >> >> > > > > > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower > >> >> resource > >> >> > > > > > > utilization > >> >> > > > > > > > > > > compared > >> >> > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > alternative 3. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till > >> >> > Rohrmann > >> >> > > < > >> >> > > > > > > > > > > > > [hidden email] > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > I guess you have to help me > >> understand > >> >> the > >> >> > > > > > difference > >> >> > > > > > > > > > between > >> >> > > > > > > > > > > > > > > > > alternative 2 > >> >> > > > > > > > > > > > > > > > > > and 3 wrt to memory under > utilization > >> >> > > Xintong. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > - Alternative 2: set > >> >> XX:MaxDirectMemorySize > >> >> > > to > >> >> > > > > Task > >> >> > > > > > > > > > Off-Heap > >> >> > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > > Overhead. Then there is the risk > that > >> >> this > >> >> > > size > >> >> > > > > is > >> >> > > > > > > too > >> >> > > > > > > > > low > >> >> > > > > > > > > > > > > > resulting > >> >> > > > > > > > > > > > > > > > in a > >> >> > > > > > > > > > > > > > > > > > lot of garbage collection and > >> >> potentially > >> >> > an > >> >> > > > OOM. 
> >> >> > > > > > > > > > > > > > > > > > - Alternative 3: set > >> >> XX:MaxDirectMemorySize > >> >> > > to > >> >> > > > > > > > something > >> >> > > > > > > > > > > larger > >> >> > > > > > > > > > > > > > than > >> >> > > > > > > > > > > > > > > > > > alternative 2. This would of course > >> >> reduce > >> >> > > the > >> >> > > > > > sizes > >> >> > > > > > > of > >> >> > > > > > > > > the > >> >> > > > > > > > > > > > other > >> >> > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > types. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > How would alternative 2 now result > >> in an > >> >> > > under > >> >> > > > > > > > > utilization > >> >> > > > > > > > > > of > >> >> > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > compared to alternative 3? If > >> >> alternative 3 > >> >> > > > > > strictly > >> >> > > > > > > > > sets a > >> >> > > > > > > > > > > > > higher > >> >> > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > direct memory size and we use only > >> >> little, > >> >> > > > then I > >> >> > > > > > > would > >> >> > > > > > > > > > > expect > >> >> > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > alternative 3 results in memory > under > >> >> > > > > utilization. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > Cheers, > >> >> > > > > > > > > > > > > > > > > > Till > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM > Yang > >> >> Wang < > >> >> > > > > > > > > > > > [hidden email] > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Hi xintong,till > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Native and Direct Memory > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > My point is setting a very large > >> max > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > size > >> >> > > > > > > > > > > when > >> >> > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > do > >> >> > > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > > > differentiate direct and native > >> >> memory. > >> >> > If > >> >> > > > the > >> >> > > > > > > direct > >> >> > > > > > > > > > > > > > > > memory,including > >> >> > > > > > > > > > > > > > > > > > user > >> >> > > > > > > > > > > > > > > > > > > direct memory and framework > direct > >> >> > > > memory,could > >> >> > > > > > be > >> >> > > > > > > > > > > calculated > >> >> > > > > > > > > > > > > > > > > > > correctly,then > >> >> > > > > > > > > > > > > > > > > > > i am in favor of setting direct > >> memory > >> >> > with > >> >> > > > > fixed > >> >> > > > > > > > > value. > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Memory Calculation > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > I agree with xintong. 
For Yarn > and > >> >> k8s,we > >> >> > > > need > >> >> > > > > to > >> >> > > > > > > > check > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > configurations in client to avoid > >> >> > > submitting > >> >> > > > > > > > > successfully > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > failing > >> >> > > > > > > > > > > > > > > > > in > >> >> > > > > > > > > > > > > > > > > > > the flink master. > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Best, > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Yang > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Xintong Song < > >> [hidden email] > >> >> > > > > >于2019年8月13日 > >> >> > > > > > > > > > 周二22:07写道: > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > About MemorySegment, I think > you > >> are > >> >> > > right > >> >> > > > > that > >> >> > > > > > > we > >> >> > > > > > > > > > should > >> >> > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > include > >> >> > > > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > issue in the scope of this > FLIP. > >> >> This > >> >> > > FLIP > >> >> > > > > > should > >> >> > > > > > > > > > > > concentrate > >> >> > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > > configure memory pools for > >> >> > TaskExecutors, > >> >> > > > > with > >> >> > > > > > > > > minimum > >> >> > > > > > > > > > > > > > > involvement > >> >> > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > > > > > memory consumers use it. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > About direct memory, I think > >> >> > alternative > >> >> > > 3 > >> >> > > > > may > >> >> > > > > > > not > >> >> > > > > > > > > > having > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > same > >> >> > > > > > > > > > > > > > > > > over > >> >> > > > > > > > > > > > > > > > > > > > reservation issue that > >> alternative 2 > >> >> > > does, > >> >> > > > > but > >> >> > > > > > at > >> >> > > > > > > > the > >> >> > > > > > > > > > > cost > >> >> > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > risk > >> >> > > > > > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > > > > > over > >> >> > > > > > > > > > > > > > > > > > > > using memory at the container > >> level, > >> >> > > which > >> >> > > > is > >> >> > > > > > not > >> >> > > > > > > > > good. > >> >> > > > > > > > > > > My > >> >> > > > > > > > > > > > > > point > >> >> > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and > >> "JVM > >> >> > > > > Overhead" > >> >> > > > > > > are > >> >> > > > > > > > > not > >> >> > > > > > > > > > > easy > >> >> > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > config. 
> >> >> > > > > > > > > > > > > > > > > > > For > >> >> > > > > > > > > > > > > > > > > > > > alternative 2, users might > >> configure > >> >> > them > >> >> > > > > > higher > >> >> > > > > > > > than > >> >> > > > > > > > > > > what > >> >> > > > > > > > > > > > > > > actually > >> >> > > > > > > > > > > > > > > > > > > needed, > >> >> > > > > > > > > > > > > > > > > > > > just to avoid getting a direct > >> OOM. > >> >> For > >> >> > > > > > > alternative > >> >> > > > > > > > > 3, > >> >> > > > > > > > > > > > users > >> >> > > > > > > > > > > > > do > >> >> > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > get > >> >> > > > > > > > > > > > > > > > > > > > direct OOM, so they may not > >> config > >> >> the > >> >> > > two > >> >> > > > > > > options > >> >> > > > > > > > > > > > > aggressively > >> >> > > > > > > > > > > > > > > > high. > >> >> > > > > > > > > > > > > > > > > > But > >> >> > > > > > > > > > > > > > > > > > > > the consequences are risks of > >> >> overall > >> >> > > > > container > >> >> > > > > > > > > memory > >> >> > > > > > > > > > > > usage > >> >> > > > > > > > > > > > > > > > exceeds > >> >> > > > > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > > > > budget. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM > >> Till > >> >> > > > > Rohrmann < > >> >> > > > > > > > > > > > > > > > [hidden email]> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Thanks for proposing this > FLIP > >> >> > Xintong. > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > All in all I think it already > >> >> looks > >> >> > > quite > >> >> > > > > > good. > >> >> > > > > > > > > > > > Concerning > >> >> > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > first > >> >> > > > > > > > > > > > > > > > > > > open > >> >> > > > > > > > > > > > > > > > > > > > > question about allocating > >> memory > >> >> > > > segments, > >> >> > > > > I > >> >> > > > > > > was > >> >> > > > > > > > > > > > wondering > >> >> > > > > > > > > > > > > > > > whether > >> >> > > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > > > > strictly necessary to do in > the > >> >> > context > >> >> > > > of > >> >> > > > > > this > >> >> > > > > > > > > FLIP > >> >> > > > > > > > > > or > >> >> > > > > > > > > > > > > > whether > >> >> > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > could > >> >> > > > > > > > > > > > > > > > > > > > > be done as a follow up? 
> Without > >> >> > knowing > >> >> > > > all > >> >> > > > > > > > > details, > >> >> > > > > > > > > > I > >> >> > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > > concerned > >> >> > > > > > > > > > > > > > > > > > > > > that we would widen the scope > >> of > >> >> this > >> >> > > > FLIP > >> >> > > > > > too > >> >> > > > > > > > much > >> >> > > > > > > > > > > > because > >> >> > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > > > > > > > to touch all the existing > call > >> >> sites > >> >> > of > >> >> > > > the > >> >> > > > > > > > > > > MemoryManager > >> >> > > > > > > > > > > > > > where > >> >> > > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > > > allocate > >> >> > > > > > > > > > > > > > > > > > > > > memory segments (this should > >> >> mainly > >> >> > be > >> >> > > > > batch > >> >> > > > > > > > > > > operators). > >> >> > > > > > > > > > > > > The > >> >> > > > > > > > > > > > > > > > > addition > >> >> > > > > > > > > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > > > > > > > the memory reservation call > to > >> the > >> >> > > > > > > MemoryManager > >> >> > > > > > > > > > should > >> >> > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > affected > >> >> > > > > > > > > > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > > > > > > > > this and I would hope that > >> this is > >> >> > the > >> >> > > > only > >> >> > > > > > > point > >> >> > > > > > > > > of > >> >> > > > > > > > > > > > > > > interaction > >> >> > > > > > > > > > > > > > > > a > >> >> > > > > > > > > > > > > > > > > > > > > streaming job would have with > >> the > >> >> > > > > > > MemoryManager. > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Concerning the second open > >> >> question > >> >> > > about > >> >> > > > > > > setting > >> >> > > > > > > > > or > >> >> > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > setting > >> >> > > > > > > > > > > > > > > > a > >> >> > > > > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > > > > direct memory limit, I would > >> also > >> >> be > >> >> > > > > > interested > >> >> > > > > > > > why > >> >> > > > > > > > > > > Yang > >> >> > > > > > > > > > > > > Wang > >> >> > > > > > > > > > > > > > > > > thinks > >> >> > > > > > > > > > > > > > > > > > > > > leaving it open would be > best. > >> My > >> >> > > concern > >> >> > > > > > about > >> >> > > > > > > > > this > >> >> > > > > > > > > > > > would > >> >> > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > > > > > > > > be in a similar situation as > we > >> >> are > >> >> > now > >> >> > > > > with > >> >> > > > > > > the > >> >> > > > > > > > > > > > > > > > > RocksDBStateBackend. 
If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff ratio. So why not set a sane default value for the max direct memory and give the user an option to increase it if they run into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides to doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we essentially rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring a very large value, if there is anything else I have overlooked.

*Memory Calculation*
If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error.
Doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely on client-side checking alone, because for standalone clusters the TaskManagers on different machines may have different configurations that the client does not see.
What do you think?

Thank you~

Xintong Song
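To make the distinction above concrete, here is a minimal, illustrative Java sketch (assuming Java 8, where sun.misc.Unsafe can be obtained via reflection): buffers from ByteBuffer.allocateDirect() are counted against -XX:MaxDirectMemorySize and are only reclaimed when a GC collects the owning buffer objects, while memory from Unsafe.allocateMemory() is plain native memory that neither the limit nor the GC tracks.

import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import sun.misc.Unsafe;

// Run with e.g.: java -XX:MaxDirectMemorySize=128m DirectVsNativeMemory
public class DirectVsNativeMemory {

    public static void main(String[] args) throws Exception {
        // Counted against -XX:MaxDirectMemorySize. When the limit would be
        // exceeded, the JDK first triggers a GC to free unreferenced direct
        // buffers and only then fails with OutOfMemoryError("Direct buffer memory").
        ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        System.out.println("direct buffer capacity: " + direct.capacity());

        // Raw native memory: not limited by -XX:MaxDirectMemorySize and never
        // released by the GC; the owner has to free it explicitly.
        Field field = Unsafe.class.getDeclaredField("theUnsafe");
        field.setAccessible(true);
        Unsafe unsafe = (Unsafe) field.get(null);
        long address = unsafe.allocateMemory(64L * 1024 * 1024);
        try {
            unsafe.putLong(address, 42L); // use the memory
        } finally {
            unsafe.freeMemory(address);
        }
    }
}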
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

On Wed, Aug 7, 2019 at 10:14 PM, Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.
This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted, into individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
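As an aside on the "Memory Calculation" question raised above (whether the client should reject inconsistent settings before submission), a minimal, hypothetical Java sketch of such a check could look like the following; the method and class names are illustrative only and are not part of the FLIP:

// Hypothetical client-side consistency check: the explicitly configured
// fine-grained pools must fit into the configured total process memory.
public final class MemoryConfigCheck {

    static void checkConsistency(long totalProcessMb,
                                 long networkMb,
                                 long managedMb,
                                 long taskOffHeapMb,
                                 long jvmOverheadMb,
                                 long heapMb) {
        long sum = networkMb + managedMb + taskOffHeapMb + jvmOverheadMb + heapMb;
        if (sum > totalProcessMb) {
            throw new IllegalArgumentException(
                "Configured memory pools (" + sum + " MB) exceed the total process memory ("
                    + totalProcessMb + " MB). Please adjust the configuration before submitting.");
        }
    }

    public static void main(String[] args) {
        // Passes: 64 + 300 + 100 + 128 + 400 = 992 MB <= 1024 MB.
        checkConsistency(1024, 64, 300, 100, 128, 400);
        // Fails fast on the client instead of failing later on the Flink master.
        checkConsistency(1024, 512, 512, 100, 128, 400);
    }
}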
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, what this FLIP proposes is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.
I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a situation similar to the one we are currently in: the user might see that her process gets killed by the OS and not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead can potentially exceed 200MB, then:

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this reduces the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
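To spell out the arithmetic of the example above, here is a small, purely illustrative Java sketch (the 1GB total is treated as 1000MB to match the round numbers in the example; the names are not Flink APIs). It prints the resulting -XX:MaxDirectMemorySize and the size left for the other pools under both alternatives once the direct memory budget has to grow from 200MB to 250MB:

public final class MemoryBudgetExample {

    public static void main(String[] args) {
        long totalProcessMb = 1000; // "1GB" in the example, using round numbers
        long directBudgetMb = 250;  // Task Off-Heap + JVM Overhead after raising it from 200MB

        // Alternative 2: the JVM limit follows the configured direct memory budget,
        // so raising the budget shrinks everything else.
        long alt2MaxDirect = directBudgetMb;
        long alt2OtherPools = totalProcessMb - directBudgetMb; // 750MB

        // Alternative 3: the JVM limit is set to an effectively unbounded value; the
        // other pools keep their original 800MB, but the process may now exceed the
        // container limit if the extra direct memory is actually used.
        long alt3MaxDirect = 1024L * 1024; // "1TB" expressed in MB
        long alt3OtherPools = totalProcessMb - 200; // 800MB, unchanged

        System.out.println("alt2: -XX:MaxDirectMemorySize=" + alt2MaxDirect
            + "m, other pools " + alt2OtherPools + "MB");
        System.out.println("alt3: -XX:MaxDirectMemorySize=" + alt3MaxDirect
            + "m, other pools " + alt3OtherPools + "MB");
    }
}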
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 w.r.t. memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

- Native and Direct Memory

My point is to set a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

- Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 22:07, Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking over-use of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure.
For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all the details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be the batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would end up in a similar situation as we are now with the RocksDBStateBackend.
If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff ratio. So why not set a sane default value for the max direct memory and give the user an option to increase it if they run into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till
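For readers following the MemoryManager discussion above, a hypothetical sketch of the two interaction styles being contrasted (allocating concrete memory segments versus merely reserving part of the budget) might look as follows; the class and method names are illustrative only and are not the actual Flink API:

import java.util.ArrayList;
import java.util.List;

// Illustrative only; not the actual Flink MemoryManager API.
public final class ManagedMemoryPoolSketch {

    private final long capacityBytes;
    private long usedBytes;

    ManagedMemoryPoolSketch(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    // Segment-style allocation: hands out concrete chunks (a stand-in for
    // MemorySegments), the way batch operators obtain managed memory.
    List<byte[]> allocatePages(int numberOfPages, int pageSize) {
        reserveMemory((long) numberOfPages * pageSize);
        List<byte[]> pages = new ArrayList<>();
        for (int i = 0; i < numberOfPages; i++) {
            pages.add(new byte[pageSize]);
        }
        return pages;
    }

    // Reservation-style interaction: only bookkeeping against the budget, so a
    // consumer such as a RocksDB state backend can allocate the memory natively itself.
    void reserveMemory(long sizeInBytes) {
        if (usedBytes + sizeInBytes > capacityBytes) {
            throw new IllegalStateException("Managed memory budget exhausted");
        }
        usedBytes += sizeInBytes;
    }

    void releaseMemory(long sizeInBytes) {
        usedBytes = Math.max(0, usedBytes - sizeInBytes);
    }

    public static void main(String[] args) {
        ManagedMemoryPoolSketch pool = new ManagedMemoryPoolSketch(64 * 1024 * 1024);
        List<byte[]> pages = pool.allocatePages(4, 32 * 1024); // batch-operator style
        pool.reserveMemory(16 * 1024 * 1024);                  // state-backend style
        System.out.println("pages handed out: " + pages.size());
    }
}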
> >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > >> >> > > > > > > > > > > > > > > > > > > > > > I think setting a very > large > >> max > >> >> > > direct > >> >> > > > > > > memory > >> >> > > > > > > > > size > >> >> > > > > > > > > > > > > > > definitely > >> >> > > > > > > > > > > > > > > > > has > >> >> > > > > > > > > > > > > > > > > > > some > >> >> > > > > > > > > > > > > > > > > > > > > > good sides. E.g., we do not > >> >> worry > >> >> > > about > >> >> > > > > > > direct > >> >> > > > > > > > > OOM, > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > don't > >> >> > > > > > > > > > > > > > > > > > even > >> >> > > > > > > > > > > > > > > > > > > > > need > >> >> > > > > > > > > > > > > > > > > > > > > > to allocate managed / > network > >> >> > memory > >> >> > > > with > >> >> > > > > > > > > > > > > > Unsafe.allocate() . > >> >> > > > > > > > > > > > > > > > > > > > > > However, there are also > some > >> >> down > >> >> > > sides > >> >> > > > > of > >> >> > > > > > > > doing > >> >> > > > > > > > > > > this. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > - One thing I can think > >> of is > >> >> > that > >> >> > > > if > >> >> > > > > a > >> >> > > > > > > task > >> >> > > > > > > > > > > > executor > >> >> > > > > > > > > > > > > > > > > container > >> >> > > > > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > > > > > killed due to overusing > >> >> memory, > >> >> > it > >> >> > > > > could > >> >> > > > > > > be > >> >> > > > > > > > > hard > >> >> > > > > > > > > > > for > >> >> > > > > > > > > > > > > use > >> >> > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > know > >> >> > > > > > > > > > > > > > > > > > > > which > >> >> > > > > > > > > > > > > > > > > > > > > > part > >> >> > > > > > > > > > > > > > > > > > > > > > of the memory is > overused. > >> >> > > > > > > > > > > > > > > > > > > > > > - Another down side is > >> that > >> >> the > >> >> > > JVM > >> >> > > > > > never > >> >> > > > > > > > > > trigger > >> >> > > > > > > > > > > GC > >> >> > > > > > > > > > > > > due > >> >> > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > reaching > >> >> > > > > > > > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > > > > > direct memory limit, > >> because > >> >> the > >> >> > > > limit > >> >> > > > > > is > >> >> > > > > > > > too > >> >> > > > > > > > > > high > >> >> > > > > > > > > > > > to > >> >> > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > reached. > >> >> > > > > > > > > > > > > > > > > > > > That > >> >> > > > > > > > > > > > > > > > > > > > > > means we kind of relay > on > >> >> heap > >> >> > > > memory > >> >> > > > > to > >> >> > > > > > > > > trigger > >> >> > > > > > > > > > > GC > >> >> > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > release > >> >> > > > > > > > > > > > > > > > > > > > direct > >> >> > > > > > > > > > > > > > > > > > > > > > memory. 
That could be a > >> >> problem > >> >> > in > >> >> > > > > cases > >> >> > > > > > > > where > >> >> > > > > > > > > > we > >> >> > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > more > >> >> > > > > > > > > > > > > > > > > > direct > >> >> > > > > > > > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > > > > usage but not enough > heap > >> >> > activity > >> >> > > > to > >> >> > > > > > > > trigger > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > GC. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Maybe you can share your > >> reasons > >> >> > for > >> >> > > > > > > preferring > >> >> > > > > > > > > > > > setting a > >> >> > > > > > > > > > > > > > > very > >> >> > > > > > > > > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > > > > > > > value, > >> >> > > > > > > > > > > > > > > > > > > > > > if there are anything else > I > >> >> > > > overlooked. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > >> >> > > > > > > > > > > > > > > > > > > > > > If there is any conflict > >> between > >> >> > > > multiple > >> >> > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > user > >> >> > > > > > > > > > > > > > > > > > > > > > explicitly specified, I > >> think we > >> >> > > should > >> >> > > > > > throw > >> >> > > > > > > > an > >> >> > > > > > > > > > > error. > >> >> > > > > > > > > > > > > > > > > > > > > > I think doing checking on > the > >> >> > client > >> >> > > > side > >> >> > > > > > is > >> >> > > > > > > a > >> >> > > > > > > > > good > >> >> > > > > > > > > > > > idea, > >> >> > > > > > > > > > > > > > so > >> >> > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > > > > Yarn / > >> >> > > > > > > > > > > > > > > > > > > > > > K8s we can discover the > >> problem > >> >> > > before > >> >> > > > > > > > submitting > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > Flink > >> >> > > > > > > > > > > > > > > > > > cluster, > >> >> > > > > > > > > > > > > > > > > > > > > which > >> >> > > > > > > > > > > > > > > > > > > > > > is always a good thing. > >> >> > > > > > > > > > > > > > > > > > > > > > But we can not only rely on > >> the > >> >> > > client > >> >> > > > > side > >> >> > > > > > > > > > checking, > >> >> > > > > > > > > > > > > > because > >> >> > > > > > > > > > > > > > > > for > >> >> > > > > > > > > > > > > > > > > > > > > > standalone cluster > >> TaskManagers > >> >> on > >> >> > > > > > different > >> >> > > > > > > > > > machines > >> >> > > > > > > > > > > > may > >> >> > > > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > > > > > > different > >> >> > > > > > > > > > > > > > > > > > > > > > configurations and the > client > >> >> does > >> >> > > see > >> >> > > > > > that. > >> >> > > > > > > > > > > > > > > > > > > > > > What do you think? 
> >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 > >> PM > >> >> Yang > >> >> > > > Wang > >> >> > > > > < > >> >> > > > > > > > > > > > > > > > [hidden email]> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed > >> >> > proposal. > >> >> > > > > After > >> >> > > > > > > all > >> >> > > > > > > > > the > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it will be > more > >> >> > > powerful > >> >> > > > to > >> >> > > > > > > > control > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > flink > >> >> > > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > >> >> > > > > > > > > > > > > > > > > > > > > > > just have few questions > >> about > >> >> it. > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native and Direct > >> Memory > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not differentiate > >> user > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > and > >> >> > > > > > > > > > > native > >> >> > > > > > > > > > > > > > > memory. > >> >> > > > > > > > > > > > > > > > > > They > >> >> > > > > > > > > > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > > > > > > > all > >> >> > > > > > > > > > > > > > > > > > > > > > > included in task off-heap > >> >> memory. > >> >> > > > > Right? > >> >> > > > > > > So i > >> >> > > > > > > > > > don’t > >> >> > > > > > > > > > > > > think > >> >> > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > could > >> >> > > > > > > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > > > > > > set > >> >> > > > > > > > > > > > > > > > > > > > > > > the > -XX:MaxDirectMemorySize > >> >> > > > properly. I > >> >> > > > > > > > prefer > >> >> > > > > > > > > > > > leaving > >> >> > > > > > > > > > > > > > it a > >> >> > > > > > > > > > > > > > > > > very > >> >> > > > > > > > > > > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > > > > > > > > > value. 
> >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum of and > >> fine-grained > >> >> > > > > > > memory(network > >> >> > > > > > > > > > > memory, > >> >> > > > > > > > > > > > > > > managed > >> >> > > > > > > > > > > > > > > > > > > memory, > >> >> > > > > > > > > > > > > > > > > > > > > > etc.) > >> >> > > > > > > > > > > > > > > > > > > > > > > is larger than total > >> process > >> >> > > memory, > >> >> > > > > how > >> >> > > > > > do > >> >> > > > > > > > we > >> >> > > > > > > > > > deal > >> >> > > > > > > > > > > > > with > >> >> > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > > situation? > >> >> > > > > > > > > > > > > > > > > > > > > > Do > >> >> > > > > > > > > > > > > > > > > > > > > > > we need to check the > memory > >> >> > > > > configuration > >> >> > > > > > > in > >> >> > > > > > > > > > > client? > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > >> >> > > [hidden email]> > >> >> > > > > > > > > > 于2019年8月7日周三 > >> >> > > > > > > > > > > > > > > 下午10:14写道: > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would like to start > a > >> >> > > discussion > >> >> > > > > > > thread > >> >> > > > > > > > on > >> >> > > > > > > > > > > > > "FLIP-49: > >> >> > > > > > > > > > > > > > > > > Unified > >> >> > > > > > > > > > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > > > > > > > > > > Configuration for > >> >> > > > TaskExecutors"[1], > >> >> > > > > > > where > >> >> > > > > > > > we > >> >> > > > > > > > > > > > > describe > >> >> > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > > improve > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory > >> >> > > configurations. > >> >> > > > > The > >> >> > > > > > > > FLIP > >> >> > > > > > > > > > > > document > >> >> > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > mostly > >> >> > > > > > > > > > > > > > > > > > > > based > >> >> > > > > > > > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > > > > > > an > >> >> > > > > > > > > > > > > > > > > > > > > > > > early design "Memory > >> >> Management > >> >> > > and > >> >> > > > > > > > > > Configuration > >> >> > > > > > > > > > > > > > > > > Reloaded"[2] > >> >> > > > > > > > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > >> >> > > > > > > > > > > > > > > > > > > > > > > > with updates from > >> follow-up > >> >> > > > > discussions > >> >> > > > > > > > both > >> >> > > > > > > > > > > online > >> >> > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > > offline. 
This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned, accounting for individual memory reservations and pools.
- Simplify memory configuration options and calculation logics.

Please find more details in the FLIP wiki document [1].
(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedbacks.

Thank you~
Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

[2]
https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
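As an aside on the "Memory Calculation" question above (what should happen if the sum of the fine-grained pools exceeds the total process memory), a minimal sketch of the kind of client-side check being asked for. The pool names and the shape of the check are assumptions for illustration only, not classes from the FLIP:

    import java.util.Map;

    // Illustrative only: reject a configuration whose explicitly configured pools
    // already exceed the configured total process memory.
    public class MemoryConfigCheck {

        static void checkFitsIntoTotal(long totalProcessBytes, Map<String, Long> poolBytes) {
            long sum = poolBytes.values().stream().mapToLong(Long::longValue).sum();
            if (sum > totalProcessBytes) {
                throw new IllegalArgumentException(
                    "Configured pools (" + sum + " bytes) exceed the total process memory ("
                        + totalProcessBytes + " bytes): " + poolBytes);
            }
        }

        public static void main(String[] args) {
            // Example: 1 GiB total, with network, managed and metaspace configured explicitly.
            checkFitsIntoTotal(1L << 30, Map.of(
                "network", 128L << 20,
                "managed", 512L << 20,
                "metaspace", 96L << 20));
            System.out.println("Configuration fits into the total process memory.");
        }
    }

Whether such a check should fail hard or shrink one of the pools is exactly the open question raised in the thread.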
Hi All,
@Xintong thanks a lot for driving the discussion. I also reviewed the FLIP and it looks quite good to me. Here are some comments:

- One thing I wanted to discuss is the backwards-compatibility with the previous user setups. We could list which options we plan to deprecate. From the first glance it looks possible to provide the same/similar behaviour for the setups relying on the deprecated options. E.g. setting taskmanager.memory.preallocate to true could override the new taskmanager.memory.managed.offheap-fraction to 1, etc. At the moment the FLIP just states that in some cases it may require re-configuring the cluster if migrated from prior versions. My suggestion is that we try to keep it backwards-compatible unless there is a good reason, like some major complication for the implementation.

Also a couple of smaller things:

- I suggest we remove TaskExecutorSpecifics from the FLIP and leave some general wording atm, like 'data structure to store' or 'utility classes'. When the classes are implemented, we put in the concrete class names. This way we can avoid confusion and stale documents.

- As I understand, if a user task uses native memory (not direct memory, but e.g. unsafe.allocate or memory from an external lib), there will be no explicit guard against exceeding the 'task off-heap memory'. Then the user should still explicitly make sure that her/his direct buffer allocations plus any other memory usage do not exceed the value announced as 'task off-heap'. I guess there is not so much that can be done about it except mentioning it in the docs, similar to controlling the heap state backend.

Thanks,
Andrey

On Mon, Sep 2, 2019 at 10:07 AM Yang Wang <[hidden email]> wrote:

I also agree that all the configuration should be calculated outside of the TaskManager. So a full configuration should be generated before the TaskManager is started. Overriding the calculated configurations through -D now seems better.

Best,
Yang

Xintong Song <[hidden email]> 于2019年9月2日周一 上午11:39写道:

I just updated the FLIP wiki page [1], with the following changes:

- Network memory uses JVM direct memory, and is accounted for when setting the JVM max direct memory size parameter.
- Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
- Remove 'supporting memory reservation' from the scope of this FLIP.

@till @stephan, please take another look and see if there are any other concerns.

Thank you~
Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <[hidden email]> wrote:

Sorry for the late response.

- Regarding the `TaskExecutorSpecifics` naming, let's discuss the detail in the PR.
- Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed to have a general framework for overwriting configurations with ENV variables.
- Regarding memory reservation, I double checked with Yu and he will take care of it.

Thank you~
Xintong Song
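To make Andrey's backwards-compatibility suggestion above a bit more concrete, a rough sketch of translating a deprecated option into a new one during config loading. The key names and the fraction value are taken from his example; everything else (the method shape, giving an explicit new value precedence) is a made-up assumption:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch: map a deprecated option onto its replacement before the config is used.
    public class LegacyOptionMapping {

        static Map<String, String> applyLegacyMappings(Map<String, String> conf) {
            Map<String, String> result = new HashMap<>(conf);
            // If the old pre-allocation switch is set and the new fraction is not configured
            // explicitly, derive the new value so that old setups keep a similar behaviour.
            if ("true".equalsIgnoreCase(conf.get("taskmanager.memory.preallocate"))
                    && !conf.containsKey("taskmanager.memory.managed.offheap-fraction")) {
                result.put("taskmanager.memory.managed.offheap-fraction", "1.0");
            }
            return result;
        }

        public static void main(String[] args) {
            System.out.println(applyLegacyMappings(Map.of("taskmanager.memory.preallocate", "true")));
        }
    }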
On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:

What I forgot to add is that we could tackle specifying the configuration fully in an incremental way and that the full specification should be the desired end state.

On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:

I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step to rather validate existing values and calculate missing ones, these two proposals shouldn't even conflict (given determinism).

Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.

One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of an env variable vs. a dynamic configuration value specified via -D?

Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.

Cheers,
Till

On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:

I see. Under the assumption of strict determinism that should work.

The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.

On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:

My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.

On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

When computing the values in the JVM process after it started, how would you deal with values like Max Direct Memory, Metaspace size, native memory reservation (reduce heap size), etc? All the values that are parameters to the JVM process and that need to be supplied at process startup?
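Purely to illustrate the flow discussed here (an external utility derives the JVM parameters and the remaining memory options before the process starts, and passes the latter as `-Dkey=value` dynamic properties so no flink-conf.yaml has to be rewritten), a sketch with made-up sizes and option keys:

    import java.util.ArrayList;
    import java.util.List;

    // Sketch only: assemble the TaskExecutor launch command from pre-computed memory sizes.
    public class TaskExecutorLaunchSketch {

        public static void main(String[] args) {
            // Sizes a startup utility might have derived from the configured total process memory.
            long heapBytes = 512L << 20;
            long maxDirectBytes = 256L << 20;
            long metaspaceBytes = 96L << 20;
            long managedBytes = 512L << 20;
            long networkBytes = 128L << 20;

            List<String> cmd = new ArrayList<>();
            cmd.add("java");
            // Parameters that must be known at process startup (Stephan's point above).
            cmd.add("-Xmx" + heapBytes);
            cmd.add("-XX:MaxDirectMemorySize=" + maxDirectBytes);
            cmd.add("-XX:MaxMetaspaceSize=" + metaspaceBytes);
            cmd.add("<task-executor-main-class>");
            // The remaining sizes go in as dynamic properties (hypothetical keys).
            cmd.add("-Dtaskmanager.memory.managed.size=" + managedBytes);
            cmd.add("-Dtaskmanager.memory.network.size=" + networkBytes);

            System.out.println(String.join(" ", cmd));
        }
    }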
On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification. I have some more comments:

- I would actually split the logic for computing the process memory requirements and for storing the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.

- Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.

The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.

- Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on if I'm not mistaken).

Cheers,
Till
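A tiny sketch of the split Till suggests above, just to make the naming concrete: one plain holder for the derived sizes and one stateless utility that computes them. The shapes are hypothetical; only the two names come from the discussion:

    // Hypothetical sketch of separating "compute the sizes" from "hold the sizes".
    public class ProcessMemorySplitSketch {

        // Plain value holder for the derived sizes (the suggested TaskExecutorProcessMemory).
        static final class TaskExecutorProcessMemory {
            final long heapBytes;
            final long directBytes;

            TaskExecutorProcessMemory(long heapBytes, long directBytes) {
                this.heapBytes = heapBytes;
                this.directBytes = directBytes;
            }
        }

        // Stateless computation and validation (the suggested TaskExecutorProcessUtility).
        static final class TaskExecutorProcessUtility {
            static TaskExecutorProcessMemory fromTotal(long totalProcessBytes, double directFraction) {
                long direct = (long) (totalProcessBytes * directFraction);
                return new TaskExecutorProcessMemory(totalProcessBytes - direct, direct);
            }
        }

        public static void main(String[] args) {
            TaskExecutorProcessMemory mem = TaskExecutorProcessUtility.fromTotal(1L << 30, 0.25);
            System.out.println("heap=" + mem.heapBytes + " direct=" + mem.directBytes);
        }
    }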
On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

Just to add my 2 cents.

Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for all taskmanagers. A common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.

Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users do not have to export FLINK_CONF_DIR to update a few config options.

Best,
Yang
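For the idea of folding the environment-variable handling into configuration loading, a rough sketch of what mapping prefixed ENV variables onto config keys could look like. The FLINK_ prefix, the underscore-to-dot rule and the precedence are exactly the open questions Till lists above, so everything here is an assumption:

    import java.util.HashMap;
    import java.util.Map;

    // Sketch only: overlay FLINK_-prefixed environment variables onto a loaded configuration.
    public class EnvOverrideSketch {

        static Map<String, String> withEnvOverrides(Map<String, String> loadedConf, Map<String, String> env) {
            Map<String, String> result = new HashMap<>(loadedConf);
            for (Map.Entry<String, String> e : env.entrySet()) {
                if (e.getKey().startsWith("FLINK_")) {
                    // One possible normalization: strip the prefix, lower-case, underscores become dots.
                    String key = e.getKey().substring("FLINK_".length()).toLowerCase().replace('_', '.');
                    result.put(key, e.getValue()); // assumed precedence: ENV wins over the file
                }
            }
            return result;
        }

        public static void main(String[] args) {
            Map<String, String> conf = Map.of("jobmanager.rpc.port", "6123");
            Map<String, String> env = Map.of("FLINK_JOBMANAGER_RPC_PORT", "6124");
            System.out.println(withEnvOverrides(conf, env)); // {jobmanager.rpc.port=6124}
        }
    }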
On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

I just updated the FLIP document on wiki [1], with the following changes.

- Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
- Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.

Thank you~
Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

@Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.

@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643

@Till: Maybe.
If we assume that the JVM default > > works > > >> >> (like > > >> >> > > > going > > >> >> > > > > > > with > > >> >> > > > > > > > > > option 2 and not setting > "-XX:MaxDirectMemorySize" > > at > > >> >> all), > > >> >> > > > then > > >> >> > > > > I > > >> >> > > > > > > > think > > >> >> > > > > > > > > it > > >> >> > > > > > > > > > should be okay to set "-XX:MaxDirectMemorySize" > to > > >> >> > > > > > > > > > "off_heap_managed_memory + direct_memory" even if > > we > > >> use > > >> >> > > > RocksDB. > > >> >> > > > > > > That > > >> >> > > > > > > > > is a > > >> >> > > > > > > > > > big if, though, I honestly have no idea :D Would > be > > >> >> good to > > >> >> > > > > > > understand > > >> >> > > > > > > > > > this, though, because this would affect option > (2) > > >> and > > >> >> > option > > >> >> > > > > > (1.2). > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song < > > >> >> > > > > > [hidden email]> > > >> >> > > > > > > > > > wrote: > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Thanks for the inputs, Jingsong. > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > Let me try to summarize your points. Please > > correct > > >> >> me if > > >> >> > > I'm > > >> >> > > > > > > wrong. > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > - Memory consumers should always avoid > > returning > > >> >> > memory > > >> >> > > > > > segments > > >> >> > > > > > > > to > > >> >> > > > > > > > > > > memory manager while there are still > > un-cleaned > > >> >> > > > structures / > > >> >> > > > > > > > threads > > >> >> > > > > > > > > > > that > > >> >> > > > > > > > > > > may use the memory. Otherwise, it would > cause > > >> >> serious > > >> >> > > > > problems > > >> >> > > > > > > by > > >> >> > > > > > > > > > having > > >> >> > > > > > > > > > > multiple consumers trying to use the same > > memory > > >> >> > > segment. > > >> >> > > > > > > > > > > - JVM does not wait for GC when allocating > > >> direct > > >> >> > memory > > >> >> > > > > > buffer. > > >> >> > > > > > > > > > > Therefore even we set proper max direct > memory > > >> size > > >> >> > > limit, > > >> >> > > > > we > > >> >> > > > > > > may > > >> >> > > > > > > > > > still > > >> >> > > > > > > > > > > encounter direct memory oom if the GC > cleaning > > >> >> memory > > >> >> > > > slower > > >> >> > > > > > > than > > >> >> > > > > > > > > the > > >> >> > > > > > > > > > > direct memory allocation. > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > Am I understanding this correctly? 
Am I understanding this correctly?

Thank you~
Xintong Song

On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email].invalid> wrote:

Hi stephan:

About option 2:
If additional threads are not cleanly shut down before we can exit the task: in the current case of memory reuse, the exited task has freed up the memory it uses. If this memory is used by other tasks while asynchronous threads of the exited task may still be writing, there will be concurrency safety problems, and even errors in user computing results. So I think this is a serious and intolerable bug; no matter what the option is, it should be avoided.

About direct memory cleaned by GC:
I don't think it is a good idea. I've encountered many situations where GC was too late, causing a DirectMemory OOM. Releasing and allocating DirectMemory depends on the type of user job, which is often beyond our control.

Best,
Jingsong Lee

------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Send Time: 2019年8月19日(星期一) 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

My main concern with option 2 (manually release memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.
The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already, that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarizing it this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep it segfault safe for the memory manager, as long as we always de-allocate the memory segment when it is released from the memory consumers. Only if the memory consumer continues using the buffer of the memory segment after releasing it could this go wrong, in which case we do want the job to fail so we detect the memory leak early.

For option 1.2, I don't think this is a good idea. Not only because the assumption (regular GC is enough to clean direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries.
If the library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if it didn't reach the JVM default max direct memory limit, we cannot get a direct memory OOM and it will become super hard to understand which part of the configuration needs to be updated.

Option 1.1 has a similar problem as 1.2, if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~
Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
- The "-XX:MaxDirectMemorySize" (option 1.1) is one way to do this.
- Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.
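A small arithmetic sketch of the sizing described above for option 1.1, with made-up numbers: a dedicated direct_memory budget is added to the off-heap managed memory to form the limit, whereas folding jvm_overhead into the limit would let direct buffers consume the headroom the overhead needs.

    // Sketch with made-up numbers for the option 1.1 sizing discussed above.
    public class MaxDirectMemorySizingSketch {

        public static void main(String[] args) {
            long offHeapManagedBytes = 512L << 20; // managed memory kept off-heap
            long directMemoryBytes = 64L << 20;    // dedicated budget for direct buffers
            long jvmOverheadBytes = 192L << 20;    // native overhead that must stay available

            // Option 1.1: the limit covers managed memory plus the dedicated direct budget.
            System.out.println("-XX:MaxDirectMemorySize=" + (offHeapManagedBytes + directMemoryBytes));

            // The variant rejected above: folding the overhead into the limit means direct buffers
            // could use it all up, leaving nothing for the actual JVM overhead in the container.
            System.out.println("(problematic) -XX:MaxDirectMemorySize=" + (offHeapManagedBytes + jvmOverheadBytes));
        }
    }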
My first guess is that the options will be easiest to do in the following order:

- Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

- Option 2. It is actually quite simple to implement, we could try how segfault safe we are at the moment.

- Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3.
But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~
Xintong Song
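To illustrate the distinction Xintong draws above, a minimal JDK 8-style sketch: memory obtained through sun.misc.Unsafe is plain native memory and is not tracked against -XX:MaxDirectMemorySize, while direct ByteBuffers are. (Obtaining Unsafe via reflection is only for the sketch.)

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;

    // Sketch: what counts against -XX:MaxDirectMemorySize and what does not.
    public class NativeVsDirectSketch {

        public static void main(String[] args) throws Exception {
            // Counted against -XX:MaxDirectMemorySize (e.g. task off-heap usage via direct buffers).
            ByteBuffer direct = ByteBuffer.allocateDirect(16 << 20);

            // Not counted against the limit: raw native memory, as proposed here for
            // off-heap managed memory and network memory.
            Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
            long address = unsafe.allocateMemory(16 << 20);

            // Native memory is never reclaimed by GC; it must be freed explicitly.
            unsafe.freeMemory(address);
            System.out.println("direct buffer capacity: " + direct.capacity());
        }
    }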
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
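Spelled out with the numbers from the example above (the flag values are just for illustration):

    // The example scenario above: 1 GB total process memory, 200 MB budgeted for
    // task off-heap memory plus JVM overhead, 800 MB for everything else.
    public class AlternativeComparisonSketch {

        public static void main(String[] args) {
            long totalProcessMb = 1024;
            long jvmDirectMb = 200;                       // Task Off-Heap Memory + JVM Overhead
            long otherMb = totalProcessMb - jvmDirectMb;  // heap, metaspace, managed, network

            // Alternative 2: cap direct allocations exactly at the 200 MB budget.
            System.out.println("alternative 2: -XX:MaxDirectMemorySize=" + jvmDirectMb + "m");

            // Alternative 3: an effectively unlimited cap, relying on the 800 MB of
            // reserved-but-not-necessarily-used memory as headroom at the container level.
            System.out.println("alternative 3: -XX:MaxDirectMemorySize=" + (1024L * 1024) + "m");
            System.out.println("other pools: " + otherMb + "m");
        }
    }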
If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM.
There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~
Xintong Song

On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory plus JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till
On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations on the client side, to avoid submitting successfully and then failing in the Flink master.

Best,
Yang
On Tue, Aug 13, 2019 at 10:07 PM Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimal involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking memory over-use at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~
Xintong Song
On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP, Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all the details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested in why Yang Wang thinks leaving it open would be best. My concern is that we would end up in a similar situation as we are in now with the RocksDBStateBackend: if the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for the max direct memory and give the user an option to increase it if they run into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till
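For readers less familiar with the MemoryManager, the distinction drawn above is roughly the following; the interface below is a simplified sketch with made-up signatures, not the actual Flink API:

    import java.util.List;

    // Simplified sketch of the two interaction styles discussed; names and signatures are
    // illustrative only, not the real org.apache.flink.runtime.memory.MemoryManager API.
    interface MemoryManagerSketch {

        // Existing style: hand out fixed-size memory segments (used mainly by batch operators,
        // which is why changing it would touch many call sites).
        List<byte[]> allocatePages(Object owner, int numberOfPages);

        // Proposed addition: reserve a plain amount of managed memory without handing out
        // segments, e.g. so that a RocksDB state backend can be accounted against managed memory.
        void reserveMemory(Object owner, long sizeInBytes);

        void releaseMemory(Object owner, long sizeInBytes);
    }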
On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides to doing this.

- One thing I can think of is that if a task executor container is killed due to over-using memory, it could be hard for us to know which part of the memory was over-used.
- Another downside is that the JVM never triggers a GC due to reaching the max direct memory limit, because the limit is too high to ever be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have a lot of direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring a very large value, in case there is anything else I have overlooked.
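As a side note on the GC point above: the JDK behavior in question can be observed with a few lines of plain Java (no Flink involved; this assumes explicit GC has not been disabled with -XX:+DisableExplicitGC):

    import java.nio.ByteBuffer;

    // Run with a small limit, e.g. -XX:MaxDirectMemorySize=64m: hitting the limit makes the JDK
    // trigger a GC that frees the unreachable buffers, so the loop keeps running. With a very
    // large limit that threshold is never reached, and freeing the native memory behind these
    // buffers depends entirely on ordinary heap-driven GC activity.
    public class DirectMemoryGcDemo {
        public static void main(String[] args) {
            for (int i = 0; i < 1000; i++) {
                ByteBuffer.allocateDirect(16 * 1024 * 1024); // 16MB, immediately unreachable
            }
            System.out.println("allocated ~16GB of direct buffers in total without a direct OOM");
        }
    }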
*Memory Calculation*
If there is any conflict between multiple configuration options that the user has explicitly specified, I think we should throw an error.

I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely on the client-side checking alone, because for standalone clusters the TaskManagers on different machines may have different configurations, and the client does not see those.

What do you think?

Thank you~
Xintong Song
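The kind of check being discussed could look roughly like the sketch below (the class name, the exception and the parameter list are made up for illustration; this is not the FLIP's actual code):

    // Hypothetical fail-fast check: reject a configuration whose explicitly specified
    // fine-grained pools cannot fit into the configured total process memory.
    final class MemoryConfigurationCheck {

        static void validate(long totalProcessBytes, long... explicitlyConfiguredPoolBytes) {
            long sum = 0;
            for (long poolBytes : explicitlyConfiguredPoolBytes) {
                sum += poolBytes;
            }
            if (sum > totalProcessBytes) {
                throw new IllegalArgumentException(
                        "Sum of configured memory pools (" + sum + " bytes) exceeds the total process"
                                + " memory (" + totalProcessBytes + " bytes); please fix the configuration.");
            }
        }

        public static void main(String[] args) {
            // Example: 1000MB total with pools of 300MB + 200MB + 400MB -> passes.
            validate(1000L << 20, 300L << 20, 200L << 20, 400L << 20);
        }
    }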
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful for controlling Flink's memory usage. I just have a few questions about it.

- Native and Direct Memory
We do not differentiate user direct memory and native memory; they are all included in task off-heap memory, right? So I don't think we could set -XX:MaxDirectMemorySize properly. I prefer leaving it at a very large value.

- Memory Calculation
If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?
On Wed, Aug 7, 2019 at 10:14 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve the TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration:

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve these problems can be summarized as follows:

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted, into individual memory reservations and pools.
- Simplify the memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync; it would be appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~
Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think that even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~
Xintong Song
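To illustrate the distinction between the two kinds of allocation (a minimal sketch, not Flink code; what the thread calls Unsafe.allocate() is presumably sun.misc.Unsafe#allocateMemory, accessed here via reflection since it is an internal API):

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;

    // Memory obtained from Unsafe.allocateMemory() is plain native memory and is NOT counted
    // against -XX:MaxDirectMemorySize; ByteBuffer.allocateDirect() IS counted against it.
    public class NativeVersusDirect {
        public static void main(String[] args) throws Exception {
            Field field = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            sun.misc.Unsafe unsafe = (sun.misc.Unsafe) field.get(null);

            long address = unsafe.allocateMemory(64L << 20); // 64MB, ignores the direct memory limit
            unsafe.freeMemory(address);                      // native memory must be freed manually

            ByteBuffer.allocateDirect(64 << 20);             // 64MB, limited by -XX:MaxDirectMemorySize
        }
    }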
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification, Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and which hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till. Let's say we have the following scenario:

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB. For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
> > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage > of > > >> Task > > >> >> > > > Off-Heap > > >> >> > > > > > > Memory > > >> >> > > > > > > > > and > > >> >> > > > > > > > > > > JVM > > >> >> > > > > > > > > > > > > > > > Overhead > > >> >> > > > > > > > > > > > > > > > > do not exceed 200MB, then > > alternative 2 > > >> >> and > > >> >> > > > > > > alternative 3 > > >> >> > > > > > > > > > > should > > >> >> > > > > > > > > > > > > have > > >> >> > > > > > > > > > > > > > > the > > >> >> > > > > > > > > > > > > > > > > same utility. Setting larger > > >> >> > > > > -XX:MaxDirectMemorySize > > >> >> > > > > > > will > > >> >> > > > > > > > > not > > >> >> > > > > > > > > > > > > reduce > > >> >> > > > > > > > > > > > > > > the > > >> >> > > > > > > > > > > > > > > > > sizes of the other memory pools. > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage > of > > >> Task > > >> >> > > > Off-Heap > > >> >> > > > > > > Memory > > >> >> > > > > > > > > and > > >> >> > > > > > > > > > > JVM > > >> >> > > > > > > > > > > > > > > > > Overhead potentially exceed 200MB, > > then > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > - Alternative 2 suffers from > > >> frequent > > >> >> OOM. > > >> >> > > To > > >> >> > > > > > avoid > > >> >> > > > > > > > > that, > > >> >> > > > > > > > > > > the > > >> >> > > > > > > > > > > > > only > > >> >> > > > > > > > > > > > > > > > thing > > >> >> > > > > > > > > > > > > > > > > user can do is to modify the > > >> >> configuration > > >> >> > > and > > >> >> > > > > > > > increase > > >> >> > > > > > > > > > JVM > > >> >> > > > > > > > > > > > > Direct > > >> >> > > > > > > > > > > > > > > > > Memory > > >> >> > > > > > > > > > > > > > > > > (Task Off-Heap Memory + JVM > > >> Overhead). > > >> >> > Let's > > >> >> > > > say > > >> >> > > > > > > that > > >> >> > > > > > > > > user > > >> >> > > > > > > > > > > > > > increases > > >> >> > > > > > > > > > > > > > > > JVM > > >> >> > > > > > > > > > > > > > > > > Direct Memory to 250MB, this > will > > >> >> reduce > > >> >> > the > > >> >> > > > > total > > >> >> > > > > > > > size > > >> >> > > > > > > > > of > > >> >> > > > > > > > > > > > other > > >> >> > > > > > > > > > > > > > > > memory > > >> >> > > > > > > > > > > > > > > > > pools to 750MB, given the total > > >> process > > >> >> > > memory > > >> >> > > > > > > remains > > >> >> > > > > > > > > > 1GB. > > >> >> > > > > > > > > > > > > > > > > - For alternative 3, there is no > > >> >> chance of > > >> >> > > > > direct > > >> >> > > > > > > OOM. 
> > >> >> > > > > > > > > > There > > >> >> > > > > > > > > > > > are > > >> >> > > > > > > > > > > > > > > > chances > > >> >> > > > > > > > > > > > > > > > > of exceeding the total process > > >> memory > > >> >> > limit, > > >> >> > > > but > > >> >> > > > > > > given > > >> >> > > > > > > > > > that > > >> >> > > > > > > > > > > > the > > >> >> > > > > > > > > > > > > > > > process > > >> >> > > > > > > > > > > > > > > > > may > > >> >> > > > > > > > > > > > > > > > > not use up all the reserved > native > > >> >> memory > > >> >> > > > > > (Off-Heap > > >> >> > > > > > > > > > Managed > > >> >> > > > > > > > > > > > > > Memory, > > >> >> > > > > > > > > > > > > > > > > Network > > >> >> > > > > > > > > > > > > > > > > Memory, JVM Metaspace), if the > > >> actual > > >> >> > direct > > >> >> > > > > > memory > > >> >> > > > > > > > > usage > > >> >> > > > > > > > > > is > > >> >> > > > > > > > > > > > > > > slightly > > >> >> > > > > > > > > > > > > > > > > above > > >> >> > > > > > > > > > > > > > > > > yet very close to 200MB, user > > >> probably > > >> >> do > > >> >> > > not > > >> >> > > > > need > > >> >> > > > > > > to > > >> >> > > > > > > > > > change > > >> >> > > > > > > > > > > > the > > >> >> > > > > > > > > > > > > > > > > configurations. > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Therefore, I think from the user's > > >> >> > > perspective, a > > >> >> > > > > > > > feasible > > >> >> > > > > > > > > > > > > > > configuration > > >> >> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower > > >> >> resource > > >> >> > > > > > > utilization > > >> >> > > > > > > > > > > compared > > >> >> > > > > > > > > > > > > to > > >> >> > > > > > > > > > > > > > > > > alternative 3. > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Thank you~ > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Xintong Song > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM > Till > > >> >> > Rohrmann > > >> >> > > < > > >> >> > > > > > > > > > > > > [hidden email] > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > wrote: > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > I guess you have to help me > > >> understand > > >> >> the > > >> >> > > > > > difference > > >> >> > > > > > > > > > between > > >> >> > > > > > > > > > > > > > > > > alternative 2 > > >> >> > > > > > > > > > > > > > > > > > and 3 wrt to memory under > > utilization > > >> >> > > Xintong. > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > - Alternative 2: set > > >> >> XX:MaxDirectMemorySize > > >> >> > > to > > >> >> > > > > Task > > >> >> > > > > > > > > > Off-Heap > > >> >> > > > > > > > > > > > > Memory > > >> >> > > > > > > > > > > > > > > and > > >> >> > > > > > > > > > > > > > > > > JVM > > >> >> > > > > > > > > > > > > > > > > > Overhead. Then there is the risk > > that > > >> >> this > > >> >> > > size > > >> >> > > > > is > > >> >> > > > > > > too > > >> >> > > > > > > > > low > > >> >> > > > > > > > > > > > > > resulting > > >> >> > > > > > > > > > > > > > > > in a > > >> >> > > > > > > > > > > > > > > > > > lot of garbage collection and > > >> >> potentially > > >> >> > an > > >> >> > > > OOM. 
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

- Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.

- Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

Xintong Song <[hidden email]> 于2019年8月13日 周二22:07写道:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking memory overuse at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow-up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
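For illustration, a reservation-style interaction with the MemoryManager could look roughly like the sketch below. The method names, the owner parameter and the exception are assumptions made for this example only, not the FLIP's final API.

    // Hypothetical sketch of a reservation-style MemoryManager interaction for a
    // streaming consumer (e.g. a budget handed to a state backend). Names are illustrative.
    public final class ReservationExample {

        interface SimpleMemoryManager {
            // Reserve 'size' bytes of managed memory for 'owner'; fails if the budget is exceeded.
            void reserveMemory(Object owner, long size) throws MemoryReservationException;

            // Return previously reserved bytes to the pool.
            void releaseMemory(Object owner, long size);
        }

        static class MemoryReservationException extends Exception {
            MemoryReservationException(String message) { super(message); }
        }

        static void useManagedMemory(SimpleMemoryManager memoryManager, Object owner) throws Exception {
            long reservation = 64 * 1024 * 1024; // 64 MB
            memoryManager.reserveMemory(owner, reservation);
            try {
                // ... hand the budget to the consumer, e.g. as a block-cache capacity ...
            } finally {
                memoryManager.releaseMemory(owner, reservation);
            }
        }
    }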
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate().
However, there are also some downsides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client-side checking, because for standalone clusters TaskManagers on different machines may have different configurations and the client does not see them. What do you think?

Thank you~

Xintong Song

On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configurations are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation

If the sum of the fine-grained memory (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?
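To make the calculation question concrete, a client-side (or startup-script) sanity check could look roughly like the sketch below. The set of pools, their names and the way -XX:MaxDirectMemorySize is derived are assumptions for illustration only, not the FLIP's final model.

    // Illustrative-only check: the fine-grained pools must fit into the configured total
    // process memory, and the direct-memory pools determine -XX:MaxDirectMemorySize.
    public final class MemoryBudgetCheck {

        public static void main(String[] args) {
            long totalProcessMemory = parseMb("4096m");
            long frameworkHeap      = parseMb("128m");
            long taskHeap           = parseMb("1024m");
            long taskOffHeap        = parseMb("128m");
            long networkMemory      = parseMb("512m");
            long managedMemory      = parseMb("1536m");
            long jvmMetaspace       = parseMb("256m");
            long jvmOverhead        = parseMb("512m");

            long sum = frameworkHeap + taskHeap + taskOffHeap + networkMemory
                    + managedMemory + jvmMetaspace + jvmOverhead;

            if (sum > totalProcessMemory) {
                // Fail fast on the client, before the container is ever requested.
                throw new IllegalArgumentException("Sum of configured memory pools (" + sum
                        + " MB) exceeds the total process memory (" + totalProcessMemory + " MB).");
            }

            // One possible derivation of the JVM parameter if direct memory is capped:
            // task off-heap plus network memory in this sketch.
            long maxDirect = taskOffHeap + networkMemory;
            System.out.println("-XX:MaxDirectMemorySize=" + maxDirect + "m");
        }

        private static long parseMb(String value) {
            return Long.parseLong(value.substring(0, value.length() - 1));
        }
    }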
Xintong Song <[hidden email]> 于2019年8月7日周三 下午10:14写道:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned into individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2]
https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
EDIT: sorry for the confusion, I meant taskmanager.memory.off-heap instead of taskmanager.memory.preallocate.

On Mon, Sep 2, 2019 at 11:29 AM Andrey Zagrebin <[hidden email]> wrote:

Hi All,

@Xintong thanks a lot for driving the discussion.

I also reviewed the FLIP and it looks quite good to me. Here are some comments:

- One thing I wanted to discuss is the backwards-compatibility with previous user setups. We could list which options we plan to deprecate. From the first glance it looks possible to provide the same/similar behaviour for setups relying on the deprecated options. E.g. setting taskmanager.memory.preallocate to true could override the new taskmanager.memory.managed.offheap-fraction to 1, etc. At the moment the FLIP just states that in some cases it may require re-configuring the cluster if migrated from prior versions. My suggestion is that we try to keep it backwards-compatible unless there is a good reason, like some major complication for the implementation.

Also a couple of smaller things:

- I suggest we remove TaskExecutorSpecifics from the FLIP and leave some general wording for now, like 'data structure to store' or 'utility classes'. When the classes are implemented, we put in the concrete class names. This way we can avoid confusion and stale documents.

- As I understand, if a user task uses native memory (not direct memory, but e.g. unsafe.allocate or from an external lib), there will be no explicit guard against exceeding 'task off-heap memory'. Then the user should still explicitly make sure that her/his direct buffer allocation plus any other memory usages do not exceed the value announced as 'task off-heap'. I guess there is not so much that can be done about it except mentioning it in the docs, similar to controlling the heap state backend.

Thanks,
Andrey

On Mon, Sep 2, 2019 at 10:07 AM Yang Wang <[hidden email]> wrote:

I also agree that all the configuration should be calculated outside of the TaskManager.

So a full configuration should be generated before the TaskManager is started.

Overriding the calculated configurations through -D now seems better.

Best,

Yang

Xintong Song <[hidden email]> 于2019年9月2日周一 上午11:39写道:

I just updated the FLIP wiki page [1], with the following changes:

- Network memory uses JVM direct memory, and is accounted for when setting the JVM max direct memory size parameter.
- Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
- Remove 'supporting memory reservation' from the scope of this FLIP.

@till @stephan, please take another look and see if there are any other concerns.

Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <[hidden email]> wrote:

Sorry for the late response.

- Regarding the `TaskExecutorSpecifics` naming, let's discuss the details in the PR.
- Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed to have a general framework for overwriting configurations with ENV variables.
- Regarding memory reservation, I double checked with Yu and he will take care of it.

Thank you~

Xintong Song
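Going back to Andrey's backwards-compatibility point above, one way the deprecated keys could be interpreted is sketched below. The key names are taken from this thread, but the concrete mapping rule is only an assumption for illustration.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative-only sketch of mapping deprecated options onto new ones when the
    // new keys are absent. The actual deprecation rules are up to the FLIP/PR.
    public final class LegacyMemoryOptions {

        static void applyLegacyDefaults(Map<String, String> conf) {
            boolean legacyOffHeap =
                    Boolean.parseBoolean(conf.getOrDefault("taskmanager.memory.off-heap", "false"));
            if (legacyOffHeap && !conf.containsKey("taskmanager.memory.managed.offheap-fraction")) {
                // A legacy off-heap managed memory setup corresponds to a fully off-heap managed pool.
                conf.put("taskmanager.memory.managed.offheap-fraction", "1.0");
            }
        }

        public static void main(String[] args) {
            Map<String, String> conf = new HashMap<>();
            conf.put("taskmanager.memory.off-heap", "true");
            applyLegacyDefaults(conf);
            System.out.println(conf);
            // {taskmanager.memory.off-heap=true, taskmanager.memory.managed.offheap-fraction=1.0}
        }
    }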
On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:

What I forgot to add is that we could tackle specifying the configuration fully in an incremental way and that the full specification should be the desired end state.

On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:

I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step to rather validate existing values and calculate missing ones, these two proposals shouldn't even conflict (given determinism).

Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.

One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of env variables vs. dynamic configuration values specified via -D?

Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.

Cheers,
Till

On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:

I see. Under the assumption of strict determinism that should work.

The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.

On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:

My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.
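To make the "calculate once outside, then start the JVM with a fully specified configuration" idea concrete, such a launch utility could look roughly like the sketch below. The option keys, sizes and the entrypoint name are placeholders, not Flink's actual option names or scripts.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Illustrative-only: derive the memory sizes once, then start the TaskManager with the
    // results both as JVM parameters and as -D dynamic configuration values, so the process
    // never has to re-derive (or write) anything itself.
    public final class TaskExecutorLaunchSketch {

        public static void main(String[] args) {
            // Placeholder keys and sizes (in MB); a real utility would compute these from the config.
            Map<String, Long> derivedMb = new LinkedHashMap<>();
            derivedMb.put("taskmanager.memory.heap.size", 1024L);
            derivedMb.put("taskmanager.memory.network.size", 512L);
            derivedMb.put("taskmanager.memory.managed.size", 1536L);

            StringBuilder cmd = new StringBuilder("java");
            cmd.append(" -Xmx").append(derivedMb.get("taskmanager.memory.heap.size")).append('m');
            cmd.append(" -XX:MaxDirectMemorySize=")
               .append(derivedMb.get("taskmanager.memory.network.size") + 128).append('m');
            for (Map.Entry<String, Long> e : derivedMb.entrySet()) {
                cmd.append(" -D").append(e.getKey()).append('=').append(e.getValue()).append('m');
            }
            cmd.append(" <taskmanager-entrypoint>");

            System.out.println(cmd);
        }
    }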
On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

When computing the values in the JVM process after it has started, how would you deal with values like max direct memory, Metaspace size, native memory reservation (reduced heap size), etc.? All the values that are parameters to the JVM process and that need to be supplied at process startup?

On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification. I have some more comments:

- I would actually split the logic to compute the process memory requirements and to store the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.

- Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.

The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.

- Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on, if I'm not mistaken).

Cheers,
Till

On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

Just to add my 2 cents.

Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for all taskmanagers. A common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.

Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users would not have to export FLINK_CONF_DIR to update a few config options.

Best,
Yang
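A rough sketch of what such an env-variable overlay in the configuration-loading step could look like follows. The FLINK_ prefix, the underscore-to-dot normalization and the precedence are exactly the open points raised above, so everything here is an assumption for illustration only.

    import java.util.HashMap;
    import java.util.Locale;
    import java.util.Map;

    // Illustrative-only: overlay environment variables onto the loaded configuration.
    // Prefix, separator handling and precedence are the open questions from the thread;
    // this sketch just picks one possible answer to make the idea concrete.
    public final class EnvOverrideSketch {

        static Map<String, String> applyEnvOverrides(Map<String, String> loadedConf, Map<String, String> env) {
            Map<String, String> result = new HashMap<>(loadedConf);
            for (Map.Entry<String, String> e : env.entrySet()) {
                String name = e.getKey();
                if (name.toUpperCase(Locale.ROOT).startsWith("FLINK_")) {
                    // FLINK_JOBMANAGER_RPC_ADDRESS -> jobmanager.rpc.address
                    // (note: ambiguous for option keys that themselves contain underscores)
                    String key = name.substring("FLINK_".length())
                            .toLowerCase(Locale.ROOT)
                            .replace('_', '.');
                    result.put(key, e.getValue());
                }
            }
            return result;
        }

        public static void main(String[] args) {
            Map<String, String> env = new HashMap<>();
            env.put("FLINK_JOBMANAGER_RPC_ADDRESS", "localhost");
            System.out.println(applyEnvOverrides(new HashMap<>(), env));
            // {jobmanager.rpc.address=localhost}
        }
    }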
On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

I just updated the FLIP document on wiki [1], with the following changes.

- Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
- Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.

Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

@Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.

@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643

@Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB. That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).

On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:

Thanks for the inputs, Jingsong.

Let me try to summarize your points. Please correct me if I'm wrong.

- Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
- The JVM does not wait for GC when allocating a direct memory buffer. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory slower than direct memory is allocated.

Am I understanding this correctly?

Thank you~

Xintong Song
>> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > Thank you~ >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > Xintong Song >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee < >> > >> >> > > > > > > [hidden email] >> > >> >> > > > > > > > > > > .invalid> >> > >> >> > > > > > > > > > > wrote: >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > > Hi stephan: >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > About option 2: >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > if additional threads not cleanly shut down >> > >> before >> > >> >> we >> > >> >> > can >> > >> >> > > > > exit >> > >> >> > > > > > > the >> > >> >> > > > > > > > > > task: >> > >> >> > > > > > > > > > > > In the current case of memory reuse, it has >> > >> freed up >> > >> >> > the >> > >> >> > > > > memory >> > >> >> > > > > > > it >> > >> >> > > > > > > > > > > > uses. If this memory is used by other tasks >> > and >> > >> >> > > > asynchronous >> > >> >> > > > > > > > threads >> > >> >> > > > > > > > > > > > of exited task may still be writing, there >> > will >> > >> be >> > >> >> > > > > concurrent >> > >> >> > > > > > > > > security >> > >> >> > > > > > > > > > > > problems, and even lead to errors in user >> > >> computing >> > >> >> > > > results. >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > So I think this is a serious and intolerable >> > >> bug, No >> > >> >> > > matter >> > >> >> > > > > > what >> > >> >> > > > > > > > the >> > >> >> > > > > > > > > > > > option is, it should be avoided. >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > About direct memory cleaned by GC: >> > >> >> > > > > > > > > > > > I don't think it is a good idea, I've >> > >> encountered so >> > >> >> > many >> > >> >> > > > > > > > situations >> > >> >> > > > > > > > > > > > that it's too late for GC to cause >> > DirectMemory >> > >> >> OOM. >> > >> >> > > > Release >> > >> >> > > > > > and >> > >> >> > > > > > > > > > > > allocate DirectMemory depend on the type of >> > user >> > >> >> job, >> > >> >> > > > which >> > >> >> > > > > is >> > >> >> > > > > > > > > > > > often beyond our control. >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > Best, >> > >> >> > > > > > > > > > > > Jingsong Lee >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > >> > >> >> > >> ------------------------------------------------------------------ >> > >> >> > > > > > > > > > > > From:Stephan Ewen <[hidden email]> >> > >> >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 >> > >> >> > > > > > > > > > > > To:dev <[hidden email]> >> > >> >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified >> Memory >> > >> >> > > Configuration >> > >> >> > > > > for >> > >> >> > > > > > > > > > > > TaskExecutors >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > My main concern with option 2 (manually >> release >> > >> >> memory) >> > >> >> > > is >> > >> >> > > > > that >> > >> >> > > > > > > > > > segfaults >> > >> >> > > > > > > > > > > > in the JVM send off all sorts of alarms on >> user >> > >> >> ends. >> > >> >> > So >> > >> >> > > we >> > >> >> > > > > > need >> > >> >> > > > > > > to >> > >> >> > > > > > > > > > > > guarantee that this never happens. 
The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task.
I am not sure that we have that guaranteed already, that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarizing it this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep the memory manager segfault safe, as long as we always de-allocate the memory segment when it is released from the memory consumers. A problem only arises if the memory consumer continues using the buffer of a memory segment after releasing it, in which case we do want the job to fail so we detect the memory leak early.

For option 1.2, I don't think this is a good idea. Not only because the assumption (regular GC is enough to clean direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries.
If the library actually uses more direct memory than configured, which cannot be cleaned up by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if it doesn't hit the JVM default max direct memory limit, we cannot get a direct memory OOM and it becomes super hard to understand which part of the configuration needs to be updated.

For option 1.1, it has a similar problem to 1.2 if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
  - The "-XX:MaxDirectMemorySize" (option 1.1) is one way to do this.
  - Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).
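As a rough sketch of what the manual route of option (2) means in practice (plain Java, not Flink's actual MemorySegment code; the reflective access to sun.misc.Unsafe is a common but unofficial idiom): memory obtained this way is not tracked by "-XX:MaxDirectMemorySize" and has to be freed explicitly, and touching the address after freeing it is exactly the kind of use-after-free that can segfault the JVM.

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class ManualOffHeapAllocation {
    public static void main(String[] args) throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long size = 32L * 1024 * 1024; // 32 MB of native memory, invisible to -XX:MaxDirectMemorySize
        long address = unsafe.allocateMemory(size);
        try {
            unsafe.setMemory(address, size, (byte) 0); // use the memory
        } finally {
            // Must be released explicitly; any access to 'address' after this
            // call is undefined behaviour and can crash the whole JVM.
            unsafe.freeMemory(address);
        }
    }
}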
If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

  - Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

  - Option 2. It is actually quite simple to implement, we could try how segfault safe we are at the moment.

  - Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.
I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit.
If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: The user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

  - Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
  - For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
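Restating the arithmetic of this example as a tiny sketch (round numbers as used in the thread; this is not Flink's actual calculation logic):

public class AlternativeComparison {
    // "Other" pools are whatever is left of the total process memory after the direct budget.
    static long otherPoolsMb(long totalProcessMb, long directBudgetMb) {
        return totalProcessMb - directBudgetMb;
    }

    public static void main(String[] args) {
        long totalMb = 1000; // "1GB" total process memory, treated as 1000 MB for round numbers

        // Alternative 2: the 200 MB budget is enforced via the JVM flag.
        System.out.println("Alt 2: -XX:MaxDirectMemorySize=200m, other pools = "
                + otherPoolsMb(totalMb, 200) + "m");
        // If actual direct usage approaches 250 MB, the user must raise the budget,
        // shrinking the other pools from 800 MB to 750 MB.
        System.out.println("Alt 2 (raised): -XX:MaxDirectMemorySize=250m, other pools = "
                + otherPoolsMb(totalMb, 250) + "m");

        // Alternative 3: same planned 200 MB budget, but the JVM flag is set to an
        // effectively unlimited value and never enforces it.
        System.out.println("Alt 3: -XX:MaxDirectMemorySize=1t, planned direct budget = 200m, other pools = "
                + otherPoolsMb(totalMb, 200) + "m");
    }
}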
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization Xintong.

  - Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
  - Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory.
If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations on the client side to avoid submitting successfully and then failing in the Flink master.

Best,

Yang

Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019, 22:07:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use it.
About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking over-use of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.
Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
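For illustration, the reservation-style interaction mentioned here could look roughly like the sketch below; the names and signatures are hypothetical and do not reflect Flink's actual MemoryManager API.

/** Hypothetical reservation-style view on the MemoryManager (illustrative only). */
interface ReservationMemoryManager {
    /** Reserve the given number of bytes of managed memory for the owner, failing if the budget is exhausted. */
    void reserveMemory(Object owner, long bytes) throws MemoryReservationException;

    /** Return a previously made reservation to the pool. */
    void releaseMemory(Object owner, long bytes);
}

class MemoryReservationException extends Exception {
    MemoryReservationException(String message) {
        super(message);
    }
}

/** Example consumer, e.g. a state backend that wants a share of managed memory without MemorySegments. */
class ReservingConsumer {
    void open(ReservationMemoryManager memoryManager, long cacheBytes) throws MemoryReservationException {
        memoryManager.reserveMemory(this, cacheBytes);
    }

    void close(ReservationMemoryManager memoryManager, long cacheBytes) {
        memoryManager.releaseMemory(this, cacheBytes);
    }
}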
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory.
This could then easily lead to a similar situation to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM.

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3 where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*

I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate().
However, there are also some downsides of doing this.

  - One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
  - Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.
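To make the second downside concrete, here is a small standalone sketch (plain Java, not Flink code) of that failure mode: with an effectively unlimited "-XX:MaxDirectMemorySize", unreachable direct buffers are only released when a GC happens to run and their cleaners execute, so with little heap activity the native footprint can grow well past the intended budget without ever producing a direct memory OOM.

import java.nio.ByteBuffer;

// Run with e.g.: java -XX:MaxDirectMemorySize=1t DirectMemoryWithoutGcPressure
public class DirectMemoryWithoutGcPressure {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            // The buffer becomes unreachable immediately, but its native memory
            // is only freed once a GC collects the wrapper object and runs its
            // cleaner. With almost no heap allocation there is little GC
            // pressure, so native memory can pile up in the meantime.
            ByteBuffer.allocateDirect(16 * 1024 * 1024);
            Thread.sleep(10);
        }
    }
}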
*Memory Calculation*

If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing.
But we cannot rely only on the client-side checking, because for a standalone cluster the TaskManagers on different machines may have different configurations and the client does not see those.
What do you think?

Thank you~

Xintong Song

On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal.
After all the memory configurations are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

  - Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it at a very large value.

  - Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?
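A minimal sketch of what such a client-side check could look like (the pool names are illustrative and do not map one-to-one to Flink's actual configuration options):

/** Hypothetical client-side sanity check for the fine-grained memory pools. */
public final class MemoryConfigCheck {

    static void checkPoolsFitIntoProcessMemory(
            long totalProcessBytes,
            long heapBytes,
            long taskOffHeapBytes,
            long networkBytes,
            long managedBytes,
            long metaspaceBytes,
            long jvmOverheadBytes) {

        long sum = heapBytes + taskOffHeapBytes + networkBytes
                + managedBytes + metaspaceBytes + jvmOverheadBytes;

        if (sum > totalProcessBytes) {
            // Fail fast on the client, before the cluster is deployed on Yarn / K8s.
            throw new IllegalArgumentException(
                    "Sum of configured memory pools (" + sum + " bytes) exceeds the "
                            + "configured total process memory (" + totalProcessBytes + " bytes).");
        }
    }
}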
Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019, 22:14:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

  - Different configuration for Streaming and Batch.
  - Complex and difficult configuration of RocksDB in Streaming.
  - Complicated, uncertain and hard to understand.
>> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve >> > the >> > >> >> > problems >> > >> >> > > > can >> > >> >> > > > > > be >> > >> >> > > > > > > > > > > summarized >> > >> >> > > > > > > > > > > > > as >> > >> >> > > > > > > > > > > > > > > > > follows. >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Extend memory >> > >> manager >> > >> >> to >> > >> >> > > also >> > >> >> > > > > > > account >> > >> >> > > > > > > > > for >> > >> >> > > > > > > > > > > > memory >> > >> >> > > > > > > > > > > > > > > usage >> > >> >> > > > > > > > > > > > > > > > > by >> > >> >> > > > > > > > > > > > > > > > > > > > state >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > backends. >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Modify how >> > >> TaskExecutor >> > >> >> > > memory >> > >> >> > > > > is >> > >> >> > > > > > > > > > > partitioned >> > >> >> > > > > > > > > > > > > > > > accounted >> > >> >> > > > > > > > > > > > > > > > > > > > > individual >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > memory >> reservations >> > >> and >> > >> >> > pools. >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Simplify memory >> > >> >> > > configuration >> > >> >> > > > > > > options >> > >> >> > > > > > > > > and >> > >> >> > > > > > > > > > > > > > > calculations >> > >> >> > > > > > > > > > > > > > > > > > > logics. >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please find more >> > details >> > >> in >> > >> >> the >> > >> >> > > > FLIP >> > >> >> > > > > > wiki >> > >> >> > > > > > > > > > > document >> > >> >> > > > > > > > > > > > > [1]. >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please note that >> the >> > >> early >> > >> >> > > design >> > >> >> > > > > doc >> > >> >> > > > > > > [2] >> > >> >> > > > > > > > is >> > >> >> > > > > > > > > > out >> > >> >> > > > > > > > > > > > of >> > >> >> > > > > > > > > > > > > > > sync, >> > >> >> > > > > > > > > > > > > > > > > and >> > >> >> > > > > > > > > > > > > > > > > > it >> > >> >> > > > > > > > > > > > > > > > > > > > is >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > appreciated to have >> the >> > >> >> > > discussion >> > >> >> > > > in >> > >> >> > > > > > > this >> > >> >> > > > > > > > > > > mailing >> > >> >> > > > > > > > > > > > > list >> > >> >> > > > > > > > > > > > > > > > > > thread.) >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking forward to >> your >> > >> >> > > feedbacks. 
>> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> >> > > >> > >> >> > >> > >> >> >> > >> >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> >> > > >> > >> >> > >> > >> >> >> > >> >> > >> https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong >> > Song >> > >> < >> > >> >> > > > > > > > > > [hidden email]> >> > >> >> > > > > > > > > > > > > > wrote: >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > Thanks for sharing your opinion Till. >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > I'm also in favor of alternative 2. 
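For readers skimming the thread, the following is a rough sketch of the kind of unified breakdown the FLIP aims at. The field names and the composition are assumptions for illustration only, not the FLIP's final model:

```java
// Minimal sketch of a unified TaskExecutor memory breakdown; names are illustrative.
public class TaskExecutorMemoryBreakdown {
    long jvmHeapMb;          // framework + task heap
    long offHeapManagedMb;   // managed memory kept off-heap (e.g. state backends, batch operators)
    long networkMb;          // network buffers
    long taskOffHeapMb;      // user direct / native memory
    long jvmMetaspaceMb;
    long jvmOverheadMb;

    long totalProcessMb() {
        // Everything adds up to the size requested from the resource manager (container size).
        return jvmHeapMb + offHeapManagedMb + networkMb + taskOffHeapMb
                + jvmMetaspaceMb + jvmOverheadMb;
    }
}
```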
> On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:
>
> Thanks for sharing your opinion Till.
>
> I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.
>
> Hi Yang,
>
> Regarding your concern, I think what proposed in this FLIP it to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by JVM max direct memory. The only parts of memory limited by JVM max direct memory are task off-heap memory and JVM overhead, which are exactly alternative 2 suggests to set the JVM max direct memory to.
>
> Thank you~
>
> Xintong Song
>
> On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:
>
> Thanks for the clarification Xintong. I understand the two alternatives now.
>
> I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: The user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.
>
> Cheers,
> Till
>
> On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:
>
> Let me explain this with a concrete example Till.
>
> Let's say we have the following scenario.
>
> Total Process Memory: 1GB
> JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
> Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB
>
> For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
> For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead do not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.
>
> If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceed 200MB, then
>
> - Alternative 2 suffers from frequent OOM. To avoid that, the only thing user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that user increases JVM Direct Memory to 250MB, this will reduce the total size of other memory pools to 750MB, given the total process memory remains 1GB.
> - For alternative 3, there is no chance of direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, user probably do not need to change the configurations.
>
> Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
>
> Thank you~
>
> Xintong Song
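To make the 1 GB example above easier to follow, here is a minimal sketch of the two settings being compared; the class name and the derivation are illustrative assumptions, not Flink code:

```java
// Mirrors the numbers from the example: 1 GB process memory, 200 MB intended direct budget.
public class MaxDirectMemoryExample {
    public static void main(String[] args) {
        long totalProcessMb = 1024;      // total process memory
        long directBudgetMb = 200;       // task off-heap memory + JVM overhead
        long otherPoolsMb = totalProcessMb - directBudgetMb; // heap, metaspace, managed, network

        // Alternative 2: cap direct memory at exactly the configured budget.
        String alternative2 = "-XX:MaxDirectMemorySize=" + directBudgetMb + "m";

        // Alternative 3: set the cap far above anything reachable; only the container
        // limit (total process memory) guards against overuse.
        String alternative3 = "-XX:MaxDirectMemorySize=1024g";

        System.out.println(otherPoolsMb + " MB remain for the other pools");
        System.out.println("alternative 2: " + alternative2);
        System.out.println("alternative 3: " + alternative3);
    }
}
```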
> On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:
>
> I guess you have to help me understand the difference between alternative 2 and 3 wrt to memory under utilization Xintong.
>
> - Alternative 2: set XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low resulting in a lot of garbage collection and potentially an OOM.
> - Alternative 3: set XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.
>
> How would alternative 2 now result in an under utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under utilization.
>
> Cheers,
> Till
>
> On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:
>
> Hi xintong, till
>
> > Native and Direct Memory
>
> My point is setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then i am in favor of setting direct memory with fixed value.
>
> > Memory Calculation
>
> I agree with xintong. For Yarn and k8s, we need to check the memory configurations in client to avoid submitting successfully and failing in the flink master.
>
> Best,
> Yang
>
> Xintong Song <[hidden email]> 于2019年8月13日周二 22:07写道:
>
> Thanks for replying, Till.
>
> About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement on how memory consumers use it.
>
> About direct memory, I think alternative 3 may not having the same over reservation issue that alternative 2 does, but at the cost of risk of over using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to config. For alternative 2, users might configure them higher than what actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOM, so they may not config the two options aggressively high. But the consequences are risks of overall container memory usage exceeds the budget.
>
> Thank you~
>
> Xintong Song
>
> On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:
>
> Thanks for proposing this FLIP Xintong.
>
> All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
>
> Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation what we have with the cutoff-ratio. So why not setting a sane default value for max direct memory and giving the user an option to increase it if he runs into an OOM.
>
> @Xintong, how would alternative 2 lead to lower memory utilization than alternative 3 where we set the direct memory to a higher value?
>
> Cheers,
> Till
>
> On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:
>
> Thanks for the feedback, Yang.
>
> Regarding your comments:
>
> *Native and Direct Memory*
> I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some down sides of doing this.
>
> - One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for use to know which part of the memory is overused.
> - Another down side is that the JVM never trigger GC due to reaching max direct memory limit, because the limit is too high to be reached. That means we kind of relay on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
>
> Maybe you can share your reasons for preferring setting a very large value, if there are anything else I overlooked.
>
> *Memory Calculation*
> If there is any conflict between multiple configuration that user explicitly specified, I think we should throw an error.
> I think doing checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we can not only rely on the client side checking, because for standalone cluster TaskManagers on different machines may have different configurations and the client does see that. What do you think?
>
> Thank you~
>
> Xintong Song
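The accounting difference running through this exchange can be illustrated with a small standalone sketch, not Flink code: buffers from ByteBuffer.allocateDirect() count against -XX:MaxDirectMemorySize and are only reclaimed once a GC collects their owning objects, while memory obtained from Unsafe.allocateMemory() is plain native memory that neither the limit nor the GC ever sees.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import sun.misc.Unsafe;

public class DirectVsNativeAllocation {

    public static void main(String[] args) throws Exception {
        // Counted against -XX:MaxDirectMemorySize; when the limit is reached the JVM
        // tries a GC to reclaim unreferenced buffers and otherwise throws
        // OutOfMemoryError: Direct buffer memory.
        ByteBuffer directBuffer = ByteBuffer.allocateDirect(64 * 1024 * 1024);

        // Plain native allocation: not counted against -XX:MaxDirectMemorySize and
        // invisible to the GC, so it must be freed explicitly.
        Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
        theUnsafe.setAccessible(true);
        Unsafe unsafe = (Unsafe) theUnsafe.get(null);
        long address = unsafe.allocateMemory(64 * 1024 * 1024);
        try {
            unsafe.setMemory(address, 64 * 1024 * 1024, (byte) 0); // use the segment
        } finally {
            unsafe.freeMemory(address);
        }
        System.out.println("direct buffer capacity: " + directBuffer.capacity());
    }
}
```

Under the FLIP's proposal as described above, off-heap managed memory and network memory would fall into the second category, so only task off-heap memory and JVM overhead remain subject to the direct memory limit.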
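A client-side pre-flight check of the kind Yang and Xintong discuss above could be as simple as the following sketch; the class, method, and parameter names are hypothetical and not an actual Flink API:

```java
// Hypothetical pre-flight validation of fine-grained memory pools against the total budget.
public final class MemoryConfigCheck {

    /** Fails fast if the configured pools cannot fit into the total process memory. */
    static void validate(long totalProcessMb, long networkMb, long managedMb,
                         long taskOffHeapMb, long metaspaceMb, long jvmOverheadMb) {
        long fineGrained = networkMb + managedMb + taskOffHeapMb + metaspaceMb + jvmOverheadMb;
        if (fineGrained > totalProcessMb) {
            throw new IllegalArgumentException(
                    "Sum of configured memory pools (" + fineGrained + " MB) exceeds "
                            + "total process memory (" + totalProcessMb + " MB).");
        }
    }

    public static void main(String[] args) {
        validate(1024, 128, 512, 64, 96, 192); // passes: 992 MB <= 1024 MB
    }
}
```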
Thanks for your comments, Andrey.
- Regarding Task Off-Heap Memory, I think you're right that the user needs to make sure that direct memory and native memory used together by the user code (external libs) do not exceed the configured value. As far as I can think of, there is nothing we can do about it.

I addressed the rest of your comments in the wiki page [1]. Please take a look.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Sep 2, 2019 at 6:13 PM Andrey Zagrebin <[hidden email]> wrote:

EDIT: sorry for the confusion, I meant taskmanager.memory.off-heap instead of setting taskmanager.memory.preallocate.

On Mon, Sep 2, 2019 at 11:29 AM Andrey Zagrebin <[hidden email]> wrote:

Hi All,

@Xintong thanks a lot for driving the discussion.

I also reviewed the FLIP and it looks quite good to me. Here are some comments:

- One thing I wanted to discuss is the backwards-compatibility with the previous user setups. We could list which options we plan to deprecate. From the first glance it looks possible to provide the same/similar behaviour for the setups relying on the deprecated options. E.g. setting taskmanager.memory.preallocate to true could override the new taskmanager.memory.managed.offheap-fraction to 1, etc. At the moment the FLIP just states that in some cases it may require re-configuring the cluster if migrated from prior versions. My suggestion is that we try to keep it backwards-compatible unless there is a good reason, like some major complication for the implementation.

Also a couple of smaller things:

- I suggest we remove TaskExecutorSpecifics from the FLIP and leave some general wording atm, like 'data structure to store' or 'utility classes'. When the classes are implemented, we put in the concrete class names. This way we can avoid confusion and stale documents.

- As I understand, if a user task uses native memory (not direct memory, but e.g. unsafe.allocate or from an external lib), there will be no explicit guard against exceeding 'task off heap memory'. Then the user should still explicitly make sure that her/his direct buffer allocation plus any other memory usages do not exceed the value announced as 'task off heap'. I guess there is not so much that can be done about it except mentioning it in the docs, similar to controlling the heap state backend.

Thanks,
Andrey

On Mon, Sep 2, 2019 at 10:07 AM Yang Wang <[hidden email]> wrote:

I also agree that all the configuration should be calculated outside of the TaskManager. So a full configuration should be generated before the TaskManager is started. Overriding the calculated configurations through -D now seems better.

Best,
Yang

On Mon, Sep 2, 2019 at 11:39 AM, Xintong Song <[hidden email]> wrote:

I just updated the FLIP wiki page [1], with the following changes:

- Network memory uses JVM direct memory, and is accounted for when setting the JVM max direct memory size parameter.
- Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
- Remove 'supporting memory reservation' from the scope of this FLIP.

@till @stephan, please take another look and see if there are any other concerns.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
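Andrey's backwards-compatibility suggestion (in its corrected form, keying off taskmanager.memory.off-heap) could be handled by translating the deprecated key into the proposed one when the configuration is loaded. The snippet below is only a sketch of that idea over a plain map; the method name and its placement are assumptions, not part of the FLIP.

    import java.util.Map;

    // Sketch: map a deprecated memory option onto its proposed replacement at load time.
    final class LegacyMemoryOptions {

        // Mutates conf in place; keys and semantics are illustrative only.
        static void applyLegacyOverrides(Map<String, String> conf) {
            // The old boolean off-heap switch becomes a managed-memory off-heap fraction of 0 or 1.
            String offHeap = conf.remove("taskmanager.memory.off-heap");
            if (offHeap != null && !conf.containsKey("taskmanager.memory.managed.offheap-fraction")) {
                conf.put("taskmanager.memory.managed.offheap-fraction",
                         Boolean.parseBoolean(offHeap) ? "1.0" : "0.0");
            }
        }
    }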
On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <[hidden email]> wrote:

Sorry for the late response.

- Regarding the `TaskExecutorSpecifics` naming, let's discuss the details in the PR.
- Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed to have a general framework for overwriting configurations with ENV variables.
- Regarding memory reservation, I double checked with Yu and he will take care of it.

Thank you~

Xintong Song

On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:

What I forgot to add is that we could tackle specifying the configuration fully in an incremental way and that the full specification should be the desired end state.

On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:

I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step as rather validating existing values and calculating missing ones, these two proposals shouldn't even conflict (given determinism).

Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.

One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of an env variable vs. a dynamic configuration value specified via -D?

Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.

Cheers,
Till

On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:

I see. Under the assumption of strict determinism that should work.

The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.
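Till's `-Dkey=value` suggestion relies on functionality that already exists for Flink processes; the sketch below only illustrates the general shape of it (a hypothetical parser over the process arguments, not Flink's actual one): the startup script appends the pre-computed memory options as dynamic properties, and the process folds them into its configuration before anything else reads it.

    import java.util.HashMap;
    import java.util.Map;

    // Sketch only: folding `-Dkey=value` dynamic properties into a config map.
    final class DynamicProperties {

        static Map<String, String> parse(String[] args) {
            Map<String, String> dynamic = new HashMap<>();
            for (String arg : args) {
                if (arg.startsWith("-D") && arg.contains("=")) {
                    int eq = arg.indexOf('=');
                    dynamic.put(arg.substring(2, eq), arg.substring(eq + 1));
                }
            }
            return dynamic;
        }

        public static void main(String[] args) {
            // e.g. a startup script could launch the TaskExecutor with pre-computed values:
            //   taskmanager.sh ... -Dtaskmanager.memory.managed.size=2g -Dtaskmanager.memory.network.max=1g
            // (the keys above are illustrative, not necessarily the FLIP's final option names)
            Map<String, String> overrides = parse(args);
            overrides.forEach((k, v) -> System.out.println(k + " -> " + v));
        }
    }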
On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:

My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.

On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

When computing the values in the JVM process after it started, how would you deal with values like Max Direct Memory, Metaspace size, native memory reservation (reduce heap size), etc.? All the values that are parameters to the JVM process and that need to be supplied at process startup?
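The utility Till describes would ultimately have to turn the configured budgets into JVM startup flags, because the values Stephan lists (max direct memory, metaspace, heap size) can only be fixed on the command line. A minimal sketch of that rendering step follows, under the assumption that the heap, direct, and metaspace budgets have already been derived; the class name, the exact set of flags, and the example sizes are illustrative, not the FLIP's actual utility.

    // Sketch: render pre-computed memory budgets as JVM flags for the TaskExecutor launch command.
    final class JvmArgsRenderer {

        static String render(long heapBytes, long directBytes, long metaspaceBytes) {
            return String.join(" ",
                "-Xms" + heapBytes,                          // heap is pinned to the computed budget
                "-Xmx" + heapBytes,
                "-XX:MaxDirectMemorySize=" + directBytes,    // caps direct allocations
                "-XX:MaxMetaspaceSize=" + metaspaceBytes);
        }

        public static void main(String[] args) {
            long mb = 1024L * 1024L;
            // e.g. 560 MB heap, 176 MB direct, 96 MB metaspace out of a larger process budget
            System.out.println(render(560 * mb, 176 * mb, 96 * mb));
        }
    }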
On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification. I have some more comments:

- I would actually split the logic to compute the process memory requirements and storing the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.

- Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.

The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter would not be possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.

- Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on, if I'm not mistaken).

Cheers,
Till

On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

Just add my 2 cents.

Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for all taskmanagers. A common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.

Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users do not have to export FLINK_CONF_DIR to update a few config options.

Best,
Yang
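Yang Wang's point, together with the format questions Till raises above (FLINK_KEY_NAME vs. a dot separator), boils down to overlaying environment variables onto the loaded configuration. The sketch below assumes one of the discussed formats (upper-case, FLINK_-prefixed, underscores standing in for dots); that is not a decided convention, and a real implementation would also have to handle keys whose names themselves contain underscores or dashes, which is one reason the thread leaves the format open.

    import java.util.Locale;
    import java.util.Map;

    // Sketch: overlay FLINK_-prefixed environment variables onto loaded configuration keys.
    final class EnvOverrides {

        static void apply(Map<String, String> conf, Map<String, String> env) {
            for (Map.Entry<String, String> e : env.entrySet()) {
                if (!e.getKey().startsWith("FLINK_") || e.getKey().equals("FLINK_CONF_DIR")) {
                    continue; // skip non-config variables
                }
                // e.g. FLINK_TASKMANAGER_MEMORY_MANAGED_SIZE -> taskmanager.memory.managed.size
                String key = e.getKey().substring("FLINK_".length())
                        .toLowerCase(Locale.ROOT)
                        .replace('_', '.');
                conf.put(key, e.getValue());
            }
        }

        public static void main(String[] args) {
            Map<String, String> conf = new java.util.HashMap<>();
            apply(conf, System.getenv());
            conf.forEach((k, v) -> System.out.println(k + ": " + v));
        }
    }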
On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

I just updated the FLIP document on wiki [1], with the following changes.

- Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
- Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

@Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.

@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643

@Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB. That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).
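For reference, the JDK logic Stephan links to behaves roughly as follows when a direct buffer is requested: the bytes are reserved against the -XX:MaxDirectMemorySize budget, and if the budget is exhausted the JDK triggers reference processing and a GC, backs off, and retries before finally giving up with an OutOfMemoryError. The sketch below is a simplified illustration of that retry loop, not the actual Bits.java code.

    import java.util.concurrent.atomic.AtomicLong;

    // Simplified illustration of the reserve-or-GC-and-retry behaviour in java.nio.Bits.
    final class DirectMemoryBudget {

        private final long maxDirectBytes;
        private final AtomicLong reserved = new AtomicLong();

        DirectMemoryBudget(long maxDirectBytes) {
            this.maxDirectBytes = maxDirectBytes;
        }

        void reserve(long bytes) {
            for (int attempt = 0; attempt < 9; attempt++) {
                if (tryReserve(bytes)) {
                    return;
                }
                System.gc();                      // give cleaners of dead direct buffers a chance to run
                try {
                    Thread.sleep(1L << attempt);  // exponential back-off, then retry
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            throw new OutOfMemoryError("Direct buffer memory");
        }

        void release(long bytes) {
            reserved.addAndGet(-bytes);
        }

        private boolean tryReserve(long bytes) {
            while (true) {
                long current = reserved.get();
                if (current + bytes > maxDirectBytes) {
                    return false;                 // budget exhausted, caller must GC and retry
                }
                if (reserved.compareAndSet(current, current + bytes)) {
                    return true;
                }
            }
        }
    }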
On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:

Thanks for the inputs, Jingsong.

Let me try to summarize your points. Please correct me if I'm wrong.

- Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
- The JVM does not wait for GC when allocating a direct memory buffer. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory slower than the direct memory allocation.

Am I understanding this correctly?

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email].invalid> wrote:

Hi Stephan:

About option 2:

If additional threads are not cleanly shut down before we can exit the task: in the current case of memory reuse, the task has freed up the memory it uses. If this memory is used by other tasks while asynchronous threads of the exited task may still be writing, there will be concurrent safety problems, and it can even lead to errors in user computing results. So I think this is a serious and intolerable bug. No matter what the option is, it should be avoided.

About direct memory cleaned by GC: I don't think it is a good idea. I've encountered so many situations where GC was too late and caused a DirectMemory OOM. Releasing and allocating DirectMemory depends on the type of user job, which is often beyond our control.

Best,
Jingsong Lee

------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Send Time: Mon, Aug 19, 2019, 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

My main concern with option 2 (manually release memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.

The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already, that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarized in this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep it segfault safe for the memory manager, as long as we always de-allocate the memory segment when it is released from the memory consumers. Only if the memory consumer continues using the buffer of the memory segment after releasing it could this go wrong, in which case we do want the job to fail so we detect the memory leak early.

For option 1.2, I don't think this is a good idea. Not only because the assumption (regular GC is enough to clean direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If the library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if it didn't touch the JVM default max direct memory limit, we cannot get a direct memory OOM and it will become super hard to understand which part of the configuration needs to be updated.

For option 1.1, it has a similar problem as 1.2, if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
- The "-XX:MaxDirectMemorySize" (option 1.1) is one way to do this.
- Another way could be to have a dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

- Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

- Option 2. It is actually quite simple to implement, we could try how segfault safe we are at the moment.

- Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan
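For option 2, the segfault-safety question is essentially whether native memory can ever be freed while some thread can still reach it. A guarded wrapper along the lines of the sketch below (hypothetical names, using sun.misc.Unsafe directly rather than Flink's MemorySegment) turns a use-after-release in the owning thread into a Java exception instead of a segfault. It deliberately does not protect against the racing helper threads Stephan mentions, which is exactly why those have to be shut down before the cleanup phase releases the segments.

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Sketch: manually allocated segment that fails fast (instead of segfaulting) after release.
    final class GuardedOffHeapSegment {

        private static final Unsafe UNSAFE = unsafe();

        private long address;       // 0 once released
        private final long size;

        GuardedOffHeapSegment(long size) {
            this.size = size;
            this.address = UNSAFE.allocateMemory(size);
        }

        long getLong(long offset) {
            checkAlive(offset, 8);
            return UNSAFE.getLong(address + offset);
        }

        void putLong(long offset, long value) {
            checkAlive(offset, 8);
            UNSAFE.putLong(address + offset, value);
        }

        // Must only be called from the task's cleanup phase, after all helper threads have shut down.
        void release() {
            if (address != 0) {
                UNSAFE.freeMemory(address);
                address = 0;
            }
        }

        private void checkAlive(long offset, int len) {
            if (address == 0 || offset < 0 || offset + len > size) {
                throw new IllegalStateException("segment released or access out of bounds");
            }
        }

        private static Unsafe unsafe() {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                return (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new Error(e);
            }
        }
    }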
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by JVM max direct memory. The only parts of memory limited by JVM max direct memory are task off-heap memory and JVM overhead, which are exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: The user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

- Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
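The numbers in the example reduce to simple subtraction; the sketch below only restates them (treating 1 GB as 1000 MB, as the example does) to make the trade-off between the two alternatives explicit.

    // Restates the 1 GB example from the discussion above; numbers only, no Flink APIs.
    final class DirectMemoryExample {

        public static void main(String[] args) {
            final int totalProcessMb = 1000;   // "1 GB" in the example

            // Alternative 2: cap direct memory exactly at task off-heap + JVM overhead.
            int directCapMb = 200;
            int otherPoolsMb = totalProcessMb - directCapMb;             // 800 MB for heap, managed, network, metaspace
            // If actual direct usage can exceed the cap, the user must raise it to stay OOM-free ...
            int raisedDirectCapMb = 250;
            int shrunkOtherPoolsMb = totalProcessMb - raisedDirectCapMb; // ... shrinking the other pools to 750 MB
            System.out.printf("alt 2: direct=%d MB, other pools=%d -> %d MB%n",
                    raisedDirectCapMb, otherPoolsMb, shrunkOtherPoolsMb);

            // Alternative 3: effectively no direct cap; no direct OOM, but slight overuse can
            // push the process past the 1000 MB container limit instead.
            long effectiveDirectCapMb = 1024L * 1024L;                   // "1 TB" stand-in for "unlimited"
            System.out.printf("alt 3: direct cap=%d MB, other pools stay at %d MB%n",
                    effectiveDirectCapMb, totalProcessMb - directCapMb);
        }
    }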
> >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > Thank you~ > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > Xintong Song > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM > >> Till > >> > >> >> > Rohrmann > >> > >> >> > > < > >> > >> >> > > > > > > > > > > > > [hidden email] > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > wrote: > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > I guess you have to help me > >> > >> understand > >> > >> >> the > >> > >> >> > > > > > difference > >> > >> >> > > > > > > > > > between > >> > >> >> > > > > > > > > > > > > > > > > alternative 2 > >> > >> >> > > > > > > > > > > > > > > > > > and 3 wrt to memory under > >> > utilization > >> > >> >> > > Xintong. > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > - Alternative 2: set > >> > >> >> XX:MaxDirectMemorySize > >> > >> >> > > to > >> > >> >> > > > > Task > >> > >> >> > > > > > > > > > Off-Heap > >> > >> >> > > > > > > > > > > > > Memory > >> > >> >> > > > > > > > > > > > > > > and > >> > >> >> > > > > > > > > > > > > > > > > JVM > >> > >> >> > > > > > > > > > > > > > > > > > Overhead. Then there is the > risk > >> > that > >> > >> >> this > >> > >> >> > > size > >> > >> >> > > > > is > >> > >> >> > > > > > > too > >> > >> >> > > > > > > > > low > >> > >> >> > > > > > > > > > > > > > resulting > >> > >> >> > > > > > > > > > > > > > > > in a > >> > >> >> > > > > > > > > > > > > > > > > > lot of garbage collection and > >> > >> >> potentially > >> > >> >> > an > >> > >> >> > > > OOM. > >> > >> >> > > > > > > > > > > > > > > > > > - Alternative 3: set > >> > >> >> XX:MaxDirectMemorySize > >> > >> >> > > to > >> > >> >> > > > > > > > something > >> > >> >> > > > > > > > > > > larger > >> > >> >> > > > > > > > > > > > > > than > >> > >> >> > > > > > > > > > > > > > > > > > alternative 2. This would of > >> course > >> > >> >> reduce > >> > >> >> > > the > >> > >> >> > > > > > sizes > >> > >> >> > > > > > > of > >> > >> >> > > > > > > > > the > >> > >> >> > > > > > > > > > > > other > >> > >> >> > > > > > > > > > > > > > > > memory > >> > >> >> > > > > > > > > > > > > > > > > > types. > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > How would alternative 2 now > >> result > >> > >> in an > >> > >> >> > > under > >> > >> >> > > > > > > > > utilization > >> > >> >> > > > > > > > > > of > >> > >> >> > > > > > > > > > > > > > memory > >> > >> >> > > > > > > > > > > > > > > > > > compared to alternative 3? If > >> > >> >> alternative 3 > >> > >> >> > > > > > strictly > >> > >> >> > > > > > > > > sets a > >> > >> >> > > > > > > > > > > > > higher > >> > >> >> > > > > > > > > > > > > > > max > >> > >> >> > > > > > > > > > > > > > > > > > direct memory size and we use > >> only > >> > >> >> little, > >> > >> >> > > > then I > >> > >> >> > > > > > > would > >> > >> >> > > > > > > > > > > expect > >> > >> >> > > > > > > > > > > > > that > >> > >> >> > > > > > > > > > > > > > > > > > alternative 3 results in > memory > >> > under > >> > >> >> > > > > utilization. 
On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

Native and Direct Memory

My point is setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory with a fixed value.

Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019 at 22:07:

Thanks for replying, Till.
About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking over-use of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song
On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern would be that we end up in a similar situation as we are in now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could easily lead to a situation similar to the one we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till
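As a rough sketch of the "sane default plus user override" idea (the option key and default value here are invented for illustration, not something the FLIP defines):

    import org.apache.flink.configuration.ConfigOption;
    import org.apache.flink.configuration.ConfigOptions;
    import org.apache.flink.configuration.Configuration;

    public final class DirectMemoryDefaultSketch {

        // Hypothetical option: a bounded default that a user can raise after a direct OOM.
        static final ConfigOption<String> MAX_DIRECT_MEMORY =
                ConfigOptions.key("taskmanager.memory.max-direct.size").defaultValue("256m");

        public static void main(String[] args) {
            Configuration conf = new Configuration();
            conf.setString(MAX_DIRECT_MEMORY, "512m"); // user override after hitting a direct OOM
            System.out.println("-XX:MaxDirectMemorySize=" + conf.getString(MAX_DIRECT_MEMORY));
        }
    }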
On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring a very large value, in case there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error.
I think doing the check on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely on the client-side checking alone, because for standalone clusters the TaskManagers on different machines may have different configurations and the client does not see them. What do you think?

Thank you~

Xintong Song
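A self-contained toy (not Flink code) that shows the GC point above; run it once with a tight limit and once with a very large one:

    import java.nio.ByteBuffer;

    public final class DirectMemoryGcDemo {

        public static void main(String[] args) {
            // Allocates 4 GB worth of 16 MB direct buffers without keeping references.
            //
            // With -XX:MaxDirectMemorySize=64m the JVM triggers a collection before failing a
            // direct allocation, the unreachable buffers are freed, and the loop completes with
            // a small native footprint.
            //
            // With a very large limit (alternative 3) the cap is never hit, and this loop creates
            // almost no heap garbage, so nothing prompts a collection; the native footprint can
            // grow towards 4 GB, and in a container that is when the OS may kill the process.
            for (int i = 0; i < 256; i++) {
                ByteBuffer.allocateDirect(16 * 1024 * 1024);
                System.out.println("allocated buffer " + i);
            }
        }
    }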
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory, right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?
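A minimal sketch of the kind of client-side check being discussed (the configuration keys and the check itself are illustrative, not the FLIP's actual validation logic):

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.IllegalConfigurationException;
    import org.apache.flink.configuration.MemorySize;

    public final class MemoryConfigCheckSketch {

        /** Fails fast on the client if the fine-grained pools cannot fit into the total. */
        static void validate(Configuration conf) {
            long total = MemorySize.parse(conf.getString("taskmanager.memory.process.size", "0")).getBytes();
            long network = MemorySize.parse(conf.getString("taskmanager.memory.network.size", "0")).getBytes();
            long managed = MemorySize.parse(conf.getString("taskmanager.memory.managed.size", "0")).getBytes();

            if (network + managed > total) {
                throw new IllegalConfigurationException(
                        "Configured fine-grained memory pools (" + (network + managed)
                                + " bytes) exceed the total process memory (" + total + " bytes).");
            }
        }
    }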
Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019 at 22:14:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.
- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.
- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned into individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].
(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern: what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests to set the JVM max direct memory to.

Thank you~

Xintong Song
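For illustration of that distinction, a stripped-down toy (not Flink's actual memory utilities; the thread's "Unsafe.allocate()" corresponds to sun.misc.Unsafe#allocateMemory): memory obtained from Unsafe is plain native memory and is not counted against -XX:MaxDirectMemorySize, while ByteBuffer.allocateDirect() is.

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;

    public final class NativeVsDirectSketch {

        public static void main(String[] args) throws Exception {
            // Counted against -XX:MaxDirectMemorySize:
            ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);

            // Not counted against -XX:MaxDirectMemorySize, i.e. practically native memory:
            Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
            long address = unsafe.allocateMemory(64 * 1024 * 1024);

            System.out.println("direct buffer capacity: " + direct.capacity());
            System.out.println("native allocation at:   " + address);

            unsafe.freeMemory(address); // native memory has to be released explicitly
        }
    }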
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till
On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till. Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

- Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
Thank you~

Xintong Song
> >> > Without > >> > >> >> > knowing > >> > >> >> > > > all > >> > >> >> > > > > > > > > details, > >> > >> >> > > > > > > > > > I > >> > >> >> > > > > > > > > > > > > would > >> > >> >> > > > > > > > > > > > > > be > >> > >> >> > > > > > > > > > > > > > > > > > > concerned > >> > >> >> > > > > > > > > > > > > > > > > > > > > that we would widen the > >> scope > >> > >> of > >> > >> >> this > >> > >> >> > > > FLIP > >> > >> >> > > > > > too > >> > >> >> > > > > > > > much > >> > >> >> > > > > > > > > > > > because > >> > >> >> > > > > > > > > > > > > > we > >> > >> >> > > > > > > > > > > > > > > > > would > >> > >> >> > > > > > > > > > > > > > > > > > > have > >> > >> >> > > > > > > > > > > > > > > > > > > > > to touch all the > existing > >> > call > >> > >> >> sites > >> > >> >> > of > >> > >> >> > > > the > >> > >> >> > > > > > > > > > > MemoryManager > >> > >> >> > > > > > > > > > > > > > where > >> > >> >> > > > > > > > > > > > > > > > we > >> > >> >> > > > > > > > > > > > > > > > > > > > allocate > >> > >> >> > > > > > > > > > > > > > > > > > > > > memory segments (this > >> should > >> > >> >> mainly > >> > >> >> > be > >> > >> >> > > > > batch > >> > >> >> > > > > > > > > > > operators). > >> > >> >> > > > > > > > > > > > > The > >> > >> >> > > > > > > > > > > > > > > > > addition > >> > >> >> > > > > > > > > > > > > > > > > > > of > >> > >> >> > > > > > > > > > > > > > > > > > > > > the memory reservation > >> call > >> > to > >> > >> the > >> > >> >> > > > > > > MemoryManager > >> > >> >> > > > > > > > > > should > >> > >> >> > > > > > > > > > > > not > >> > >> >> > > > > > > > > > > > > > be > >> > >> >> > > > > > > > > > > > > > > > > > affected > >> > >> >> > > > > > > > > > > > > > > > > > > > by > >> > >> >> > > > > > > > > > > > > > > > > > > > > this and I would hope > that > >> > >> this is > >> > >> >> > the > >> > >> >> > > > only > >> > >> >> > > > > > > point > >> > >> >> > > > > > > > > of > >> > >> >> > > > > > > > > > > > > > > interaction > >> > >> >> > > > > > > > > > > > > > > > a > >> > >> >> > > > > > > > > > > > > > > > > > > > > streaming job would have > >> with > >> > >> the > >> > >> >> > > > > > > MemoryManager. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Concerning the second > open > >> > >> >> question > >> > >> >> > > about > >> > >> >> > > > > > > setting > >> > >> >> > > > > > > > > or > >> > >> >> > > > > > > > > > > not > >> > >> >> > > > > > > > > > > > > > > setting > >> > >> >> > > > > > > > > > > > > > > > a > >> > >> >> > > > > > > > > > > > > > > > > > max > >> > >> >> > > > > > > > > > > > > > > > > > > > > direct memory limit, I > >> would > >> > >> also > >> > >> >> be > >> > >> >> > > > > > interested > >> > >> >> > > > > > > > why > >> > >> >> > > > > > > > > > > Yang > >> > >> >> > > > > > > > > > > > > Wang > >> > >> >> > > > > > > > > > > > > > > > > thinks > >> > >> >> > > > > > > > > > > > > > > > > > > > > leaving it open would be > >> > best. 
> >> > >> My > >> > >> >> > > concern > >> > >> >> > > > > > about > >> > >> >> > > > > > > > > this > >> > >> >> > > > > > > > > > > > would > >> > >> >> > > > > > > > > > > > > be > >> > >> >> > > > > > > > > > > > > > > > that > >> > >> >> > > > > > > > > > > > > > > > > we > >> > >> >> > > > > > > > > > > > > > > > > > > > would > >> > >> >> > > > > > > > > > > > > > > > > > > > > be in a similar > situation > >> as > >> > we > >> > >> >> are > >> > >> >> > now > >> > >> >> > > > > with > >> > >> >> > > > > > > the > >> > >> >> > > > > > > > > > > > > > > > > RocksDBStateBackend. > >> > >> >> > > > > > > > > > > > > > > > > > > If > >> > >> >> > > > > > > > > > > > > > > > > > > > > the different memory > pools > >> > are > >> > >> not > >> > >> >> > > > clearly > >> > >> >> > > > > > > > > separated > >> > >> >> > > > > > > > > > > and > >> > >> >> > > > > > > > > > > > > can > >> > >> >> > > > > > > > > > > > > > > > spill > >> > >> >> > > > > > > > > > > > > > > > > > over > >> > >> >> > > > > > > > > > > > > > > > > > > > to > >> > >> >> > > > > > > > > > > > > > > > > > > > > a different pool, then > it > >> is > >> > >> quite > >> > >> >> > hard > >> > >> >> > > > to > >> > >> >> > > > > > > > > understand > >> > >> >> > > > > > > > > > > > what > >> > >> >> > > > > > > > > > > > > > > > exactly > >> > >> >> > > > > > > > > > > > > > > > > > > > causes a > >> > >> >> > > > > > > > > > > > > > > > > > > > > process to get killed > for > >> > using > >> > >> >> too > >> > >> >> > > much > >> > >> >> > > > > > > memory. > >> > >> >> > > > > > > > > This > >> > >> >> > > > > > > > > > > > could > >> > >> >> > > > > > > > > > > > > > > then > >> > >> >> > > > > > > > > > > > > > > > > > easily > >> > >> >> > > > > > > > > > > > > > > > > > > > > lead to a similar > >> situation > >> > >> what > >> > >> >> we > >> > >> >> > > have > >> > >> >> > > > > with > >> > >> >> > > > > > > the > >> > >> >> > > > > > > > > > > > > > cutoff-ratio. > >> > >> >> > > > > > > > > > > > > > > > So > >> > >> >> > > > > > > > > > > > > > > > > > why > >> > >> >> > > > > > > > > > > > > > > > > > > > not > >> > >> >> > > > > > > > > > > > > > > > > > > > > setting a sane default > >> value > >> > >> for > >> > >> >> max > >> > >> >> > > > direct > >> > >> >> > > > > > > > memory > >> > >> >> > > > > > > > > > and > >> > >> >> > > > > > > > > > > > > giving > >> > >> >> > > > > > > > > > > > > > > the > >> > >> >> > > > > > > > > > > > > > > > > > user > >> > >> >> > > > > > > > > > > > > > > > > > > an > >> > >> >> > > > > > > > > > > > > > > > > > > > > option to increase it if > >> he > >> > >> runs > >> > >> >> into > >> > >> >> > > an > >> > >> >> > > > > OOM. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > @Xintong, how would > >> > >> alternative 2 > >> > >> >> > lead > >> > >> >> > > to > >> > >> >> > > > > > lower > >> > >> >> > > > > > > > > > memory > >> > >> >> > > > > > > > > > > > > > > > utilization > >> > >> >> > > > > > > > > > > > > > > > > > than > >> > >> >> > > > > > > > > > > > > > > > > > > > > alternative 3 where we > set > >> > the > >> > >> >> direct > >> > >> >> > > > > memory > >> > >> >> > > > > > > to a > >> > >> >> > > > > > > > > > > higher > >> > >> >> > > > > > > > > > > > > > value? 
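[Editorial note: to make the trade-off in the two alternatives above concrete, here is a small, purely illustrative sketch of how -XX:MaxDirectMemorySize could be derived under each of them. The component names and example sizes are assumptions for the sketch, not the FLIP's final formula.]

public final class MaxDirectMemoryAlternatives {

    // Alternative 2: cap direct memory at exactly the budgeted direct allocations.
    static long alternative2(long taskOffHeap, long jvmOverhead) {
        return taskOffHeap + jvmOverhead;
    }

    // Alternative 3: a larger cap that also covers off-heap managed memory,
    // trading a tighter container budget for fewer direct OOMs.
    static long alternative3(long taskOffHeap, long jvmOverhead, long offHeapManaged) {
        return taskOffHeap + jvmOverhead + offHeapManaged;
    }

    public static void main(String[] args) {
        final long mb = 1L << 20;
        long taskOffHeap = 128 * mb, jvmOverhead = 192 * mb, offHeapManaged = 512 * mb;
        System.out.println("-XX:MaxDirectMemorySize=" + alternative2(taskOffHeap, jvmOverhead));
        System.out.println("-XX:MaxDirectMemorySize="
            + alternative3(taskOffHeap, jvmOverhead, offHeapManaged));
    }
}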
> >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Cheers, > >> > >> >> > > > > > > > > > > > > > > > > > > > > Till > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at > >> 9:12 > >> > AM > >> > >> >> > Xintong > >> > >> >> > > > > Song < > >> > >> >> > > > > > > > > > > > > > > > [hidden email] > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Thanks for the > feedback, > >> > >> Yang. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Regarding your > comments: > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > *Native and Direct > >> Memory* > >> > >> >> > > > > > > > > > > > > > > > > > > > > > I think setting a very > >> > large > >> > >> max > >> > >> >> > > direct > >> > >> >> > > > > > > memory > >> > >> >> > > > > > > > > size > >> > >> >> > > > > > > > > > > > > > > definitely > >> > >> >> > > > > > > > > > > > > > > > > has > >> > >> >> > > > > > > > > > > > > > > > > > > some > >> > >> >> > > > > > > > > > > > > > > > > > > > > > good sides. E.g., we > do > >> not > >> > >> >> worry > >> > >> >> > > about > >> > >> >> > > > > > > direct > >> > >> >> > > > > > > > > OOM, > >> > >> >> > > > > > > > > > > and > >> > >> >> > > > > > > > > > > > > we > >> > >> >> > > > > > > > > > > > > > > > don't > >> > >> >> > > > > > > > > > > > > > > > > > even > >> > >> >> > > > > > > > > > > > > > > > > > > > > need > >> > >> >> > > > > > > > > > > > > > > > > > > > > > to allocate managed / > >> > network > >> > >> >> > memory > >> > >> >> > > > with > >> > >> >> > > > > > > > > > > > > > Unsafe.allocate() . > >> > >> >> > > > > > > > > > > > > > > > > > > > > > However, there are > also > >> > some > >> > >> >> down > >> > >> >> > > sides > >> > >> >> > > > > of > >> > >> >> > > > > > > > doing > >> > >> >> > > > > > > > > > > this. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > - One thing I can > >> think > >> > >> of is > >> > >> >> > that > >> > >> >> > > > if > >> > >> >> > > > > a > >> > >> >> > > > > > > task > >> > >> >> > > > > > > > > > > > executor > >> > >> >> > > > > > > > > > > > > > > > > container > >> > >> >> > > > > > > > > > > > > > > > > > is > >> > >> >> > > > > > > > > > > > > > > > > > > > > > killed due to > >> overusing > >> > >> >> memory, > >> > >> >> > it > >> > >> >> > > > > could > >> > >> >> > > > > > > be > >> > >> >> > > > > > > > > hard > >> > >> >> > > > > > > > > > > for > >> > >> >> > > > > > > > > > > > > use > >> > >> >> > > > > > > > > > > > > > > to > >> > >> >> > > > > > > > > > > > > > > > > know > >> > >> >> > > > > > > > > > > > > > > > > > > > which > >> > >> >> > > > > > > > > > > > > > > > > > > > > > part > >> > >> >> > > > > > > > > > > > > > > > > > > > > > of the memory is > >> > overused. 
> >> > >> >> > > > > > > > > > > > > > > > > > > > > > - Another down side > >> is > >> > >> that > >> > >> >> the > >> > >> >> > > JVM > >> > >> >> > > > > > never > >> > >> >> > > > > > > > > > trigger > >> > >> >> > > > > > > > > > > GC > >> > >> >> > > > > > > > > > > > > due > >> > >> >> > > > > > > > > > > > > > > to > >> > >> >> > > > > > > > > > > > > > > > > > > reaching > >> > >> >> > > > > > > > > > > > > > > > > > > > > max > >> > >> >> > > > > > > > > > > > > > > > > > > > > > direct memory > limit, > >> > >> because > >> > >> >> the > >> > >> >> > > > limit > >> > >> >> > > > > > is > >> > >> >> > > > > > > > too > >> > >> >> > > > > > > > > > high > >> > >> >> > > > > > > > > > > > to > >> > >> >> > > > > > > > > > > > > be > >> > >> >> > > > > > > > > > > > > > > > > > reached. > >> > >> >> > > > > > > > > > > > > > > > > > > > That > >> > >> >> > > > > > > > > > > > > > > > > > > > > > means we kind of > >> relay > >> > on > >> > >> >> heap > >> > >> >> > > > memory > >> > >> >> > > > > to > >> > >> >> > > > > > > > > trigger > >> > >> >> > > > > > > > > > > GC > >> > >> >> > > > > > > > > > > > > and > >> > >> >> > > > > > > > > > > > > > > > > release > >> > >> >> > > > > > > > > > > > > > > > > > > > direct > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory. That could > >> be a > >> > >> >> problem > >> > >> >> > in > >> > >> >> > > > > cases > >> > >> >> > > > > > > > where > >> > >> >> > > > > > > > > > we > >> > >> >> > > > > > > > > > > > have > >> > >> >> > > > > > > > > > > > > > > more > >> > >> >> > > > > > > > > > > > > > > > > > direct > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory > >> > >> >> > > > > > > > > > > > > > > > > > > > > > usage but not > enough > >> > heap > >> > >> >> > activity > >> > >> >> > > > to > >> > >> >> > > > > > > > trigger > >> > >> >> > > > > > > > > > the > >> > >> >> > > > > > > > > > > > GC. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Maybe you can share > your > >> > >> reasons > >> > >> >> > for > >> > >> >> > > > > > > preferring > >> > >> >> > > > > > > > > > > > setting a > >> > >> >> > > > > > > > > > > > > > > very > >> > >> >> > > > > > > > > > > > > > > > > > large > >> > >> >> > > > > > > > > > > > > > > > > > > > > value, > >> > >> >> > > > > > > > > > > > > > > > > > > > > > if there are anything > >> else > >> > I > >> > >> >> > > > overlooked. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > >> > >> >> > > > > > > > > > > > > > > > > > > > > > If there is any > conflict > >> > >> between > >> > >> >> > > > multiple > >> > >> >> > > > > > > > > > > configuration > >> > >> >> > > > > > > > > > > > > > that > >> > >> >> > > > > > > > > > > > > > > > user > >> > >> >> > > > > > > > > > > > > > > > > > > > > > explicitly specified, > I > >> > >> think we > >> > >> >> > > should > >> > >> >> > > > > > throw > >> > >> >> > > > > > > > an > >> > >> >> > > > > > > > > > > error. 
> >> > >> >> > > > > > > > > > > > > > > > > > > > > > I think doing checking > >> on > >> > the > >> > >> >> > client > >> > >> >> > > > side > >> > >> >> > > > > > is > >> > >> >> > > > > > > a > >> > >> >> > > > > > > > > good > >> > >> >> > > > > > > > > > > > idea, > >> > >> >> > > > > > > > > > > > > > so > >> > >> >> > > > > > > > > > > > > > > > that > >> > >> >> > > > > > > > > > > > > > > > > > on > >> > >> >> > > > > > > > > > > > > > > > > > > > > Yarn / > >> > >> >> > > > > > > > > > > > > > > > > > > > > > K8s we can discover > the > >> > >> problem > >> > >> >> > > before > >> > >> >> > > > > > > > submitting > >> > >> >> > > > > > > > > > the > >> > >> >> > > > > > > > > > > > > Flink > >> > >> >> > > > > > > > > > > > > > > > > > cluster, > >> > >> >> > > > > > > > > > > > > > > > > > > > > which > >> > >> >> > > > > > > > > > > > > > > > > > > > > > is always a good > thing. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > But we can not only > >> rely on > >> > >> the > >> > >> >> > > client > >> > >> >> > > > > side > >> > >> >> > > > > > > > > > checking, > >> > >> >> > > > > > > > > > > > > > because > >> > >> >> > > > > > > > > > > > > > > > for > >> > >> >> > > > > > > > > > > > > > > > > > > > > > standalone cluster > >> > >> TaskManagers > >> > >> >> on > >> > >> >> > > > > > different > >> > >> >> > > > > > > > > > machines > >> > >> >> > > > > > > > > > > > may > >> > >> >> > > > > > > > > > > > > > > have > >> > >> >> > > > > > > > > > > > > > > > > > > > different > >> > >> >> > > > > > > > > > > > > > > > > > > > > > configurations and the > >> > client > >> > >> >> does > >> > >> >> > > see > >> > >> >> > > > > > that. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > What do you think? > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at > >> 5:09 > >> > >> PM > >> > >> >> Yang > >> > >> >> > > > Wang > >> > >> >> > > > > < > >> > >> >> > > > > > > > > > > > > > > > [hidden email]> > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for your > >> detailed > >> > >> >> > proposal. > >> > >> >> > > > > After > >> > >> >> > > > > > > all > >> > >> >> > > > > > > > > the > >> > >> >> > > > > > > > > > > > memory > >> > >> >> > > > > > > > > > > > > > > > > > > configuration > >> > >> >> > > > > > > > > > > > > > > > > > > > > are > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it will > be > >> > more > >> > >> >> > > powerful > >> > >> >> > > > to > >> > >> >> > > > > > > > control > >> > >> >> > > > > > > > > > the > >> > >> >> > > > > > > > > > > > > flink > >> > >> >> > > > > > > > > > > > > > > > > memory > >> > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > just have few > >> questions > >> > >> about > >> > >> >> it. 
> >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native and > Direct > >> > >> Memory > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not > >> differentiate > >> > >> user > >> > >> >> > direct > >> > >> >> > > > > > memory > >> > >> >> > > > > > > > and > >> > >> >> > > > > > > > > > > native > >> > >> >> > > > > > > > > > > > > > > memory. > >> > >> >> > > > > > > > > > > > > > > > > > They > >> > >> >> > > > > > > > > > > > > > > > > > > > are > >> > >> >> > > > > > > > > > > > > > > > > > > > > > all > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > included in task > >> off-heap > >> > >> >> memory. > >> > >> >> > > > > Right? > >> > >> >> > > > > > > So i > >> > >> >> > > > > > > > > > don’t > >> > >> >> > > > > > > > > > > > > think > >> > >> >> > > > > > > > > > > > > > > we > >> > >> >> > > > > > > > > > > > > > > > > > could > >> > >> >> > > > > > > > > > > > > > > > > > > > not > >> > >> >> > > > > > > > > > > > > > > > > > > > > > set > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > the > >> > -XX:MaxDirectMemorySize > >> > >> >> > > > properly. I > >> > >> >> > > > > > > > prefer > >> > >> >> > > > > > > > > > > > leaving > >> > >> >> > > > > > > > > > > > > > it a > >> > >> >> > > > > > > > > > > > > > > > > very > >> > >> >> > > > > > > > > > > > > > > > > > > > large > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > value. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory > >> Calculation > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum of and > >> > >> fine-grained > >> > >> >> > > > > > > memory(network > >> > >> >> > > > > > > > > > > memory, > >> > >> >> > > > > > > > > > > > > > > managed > >> > >> >> > > > > > > > > > > > > > > > > > > memory, > >> > >> >> > > > > > > > > > > > > > > > > > > > > > etc.) > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > is larger than total > >> > >> process > >> > >> >> > > memory, > >> > >> >> > > > > how > >> > >> >> > > > > > do > >> > >> >> > > > > > > > we > >> > >> >> > > > > > > > > > deal > >> > >> >> > > > > > > > > > > > > with > >> > >> >> > > > > > > > > > > > > > > this > >> > >> >> > > > > > > > > > > > > > > > > > > > > situation? > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Do > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > we need to check the > >> > memory > >> > >> >> > > > > configuration > >> > >> >> > > > > > > in > >> > >> >> > > > > > > > > > > client? 
> >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > >> > >> >> > > [hidden email]> > >> > >> >> > > > > > > > > > 于2019年8月7日周三 > >> > >> >> > > > > > > > > > > > > > > 下午10:14写道: > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would like to > >> start > >> > a > >> > >> >> > > discussion > >> > >> >> > > > > > > thread > >> > >> >> > > > > > > > on > >> > >> >> > > > > > > > > > > > > "FLIP-49: > >> > >> >> > > > > > > > > > > > > > > > > Unified > >> > >> >> > > > > > > > > > > > > > > > > > > > > Memory > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Configuration for > >> > >> >> > > > TaskExecutors"[1], > >> > >> >> > > > > > > where > >> > >> >> > > > > > > > we > >> > >> >> > > > > > > > > > > > > describe > >> > >> >> > > > > > > > > > > > > > > how > >> > >> >> > > > > > > > > > > > > > > > to > >> > >> >> > > > > > > > > > > > > > > > > > > > improve > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > memory > >> > >> >> > > configurations. > >> > >> >> > > > > The > >> > >> >> > > > > > > > FLIP > >> > >> >> > > > > > > > > > > > document > >> > >> >> > > > > > > > > > > > > > is > >> > >> >> > > > > > > > > > > > > > > > > mostly > >> > >> >> > > > > > > > > > > > > > > > > > > > based > >> > >> >> > > > > > > > > > > > > > > > > > > > > > on > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > an > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > early design > "Memory > >> > >> >> Management > >> > >> >> > > and > >> > >> >> > > > > > > > > > Configuration > >> > >> >> > > > > > > > > > > > > > > > > Reloaded"[2] > >> > >> >> > > > > > > > > > > > > > > > > > by > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > with updates from > >> > >> follow-up > >> > >> >> > > > > discussions > >> > >> >> > > > > > > > both > >> > >> >> > > > > > > > > > > online > >> > >> >> > > > > > > > > > > > > and > >> > >> >> > > > > > > > > > > > > > > > > > offline. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP > addresses > >> > >> several > >> > >> >> > > > > > shortcomings > >> > >> >> > > > > > > of > >> > >> >> > > > > > > > > > > current > >> > >> >> > > > > > > > > > > > > > > (Flink > >> > >> >> > > > > > > > > > > > > > > > > 1.9) > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > memory > >> > >> >> > > configuration. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Different > >> > >> configuration > >> > >> >> > for > >> > >> >> > > > > > > Streaming > >> > >> >> > > > > > > > > and > >> > >> >> > > > > > > > > > > > Batch. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complex and > >> > >> difficult > >> > >> >> > > > > > configuration > >> > >> >> > > > > > > of > >> > >> >> > > > > > > > > > > RocksDB > >> > >> >> > > > > > > > > > > > > in > >> > >> >> > > > > > > > > > > > > > > > > > Streaming. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complicated, > >> > >> uncertain > >> > >> >> and > >> > >> >> > > > hard > >> > >> >> > > > > to > >> > >> >> > > > > > > > > > > understand. 
> >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key changes to > solve > >> > the > >> > >> >> > problems > >> > >> >> > > > can > >> > >> >> > > > > > be > >> > >> >> > > > > > > > > > > summarized > >> > >> >> > > > > > > > > > > > > as > >> > >> >> > > > > > > > > > > > > > > > > follows. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Extend memory > >> > >> manager > >> > >> >> to > >> > >> >> > > also > >> > >> >> > > > > > > account > >> > >> >> > > > > > > > > for > >> > >> >> > > > > > > > > > > > memory > >> > >> >> > > > > > > > > > > > > > > usage > >> > >> >> > > > > > > > > > > > > > > > > by > >> > >> >> > > > > > > > > > > > > > > > > > > > state > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > backends. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Modify how > >> > >> TaskExecutor > >> > >> >> > > memory > >> > >> >> > > > > is > >> > >> >> > > > > > > > > > > partitioned > >> > >> >> > > > > > > > > > > > > > > > accounted > >> > >> >> > > > > > > > > > > > > > > > > > > > > individual > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > memory > >> reservations > >> > >> and > >> > >> >> > pools. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Simplify > memory > >> > >> >> > > configuration > >> > >> >> > > > > > > options > >> > >> >> > > > > > > > > and > >> > >> >> > > > > > > > > > > > > > > calculations > >> > >> >> > > > > > > > > > > > > > > > > > > logics. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please find more > >> > details > >> > >> in > >> > >> >> the > >> > >> >> > > > FLIP > >> > >> >> > > > > > wiki > >> > >> >> > > > > > > > > > > document > >> > >> >> > > > > > > > > > > > > [1]. > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please note that > >> the > >> > >> early > >> > >> >> > > design > >> > >> >> > > > > doc > >> > >> >> > > > > > > [2] > >> > >> >> > > > > > > > is > >> > >> >> > > > > > > > > > out > >> > >> >> > > > > > > > > > > > of > >> > >> >> > > > > > > > > > > > > > > sync, > >> > >> >> > > > > > > > > > > > > > > > > and > >> > >> >> > > > > > > > > > > > > > > > > > it > >> > >> >> > > > > > > > > > > > > > > > > > > > is > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > appreciated to > have > >> the > >> > >> >> > > discussion > >> > >> >> > > > in > >> > >> >> > > > > > > this > >> > >> >> > > > > > > > > > > mailing > >> > >> >> > > > > > > > > > > > > list > >> > >> >> > > > > > > > > > > > > > > > > > thread.) > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking forward to > >> your > >> > >> >> > > feedbacks. 
> >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> >> > > >> > >> >> > >> > >> > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> >> > > >> > >> >> > >> > >> > >> > > >> > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > 
Hi All,
While looking more into the implementation details of Step 4, we realised during some offline discussions with @Till that there can be a performance degradation for the batch DataSet API if we simply continue to pull memory from the pool according to the legacy option taskmanager.memory.off-heap. The reason is that if the cluster is newly configured to statically split heap/off-heap memory (instead of the previous either-heap-or-off-heap setup), then batch DataSet API jobs will be able to use only one type of memory, although it does not really matter where the memory segments come from and batch jobs could potentially use both. Also, the DataSet API currently does not produce absolute resource requirements, and its batch jobs will always get a default share of TM resources.

The suggestion is that we let the batch tasks of the DataSet API pull from both pools according to their fair slot share of each memory type. For that we can have a special wrapping view over both pools which pulls segments (possibly at random) within the slot limits. The view can wrap the TM-level memory pools and be given to the Task.

Best,
Andrey
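[Editorial note: for illustration only, here is one way such a slot-scoped wrapping view over the two pools could look. The SegmentPool interface, the class names, and the random choice between pools are assumptions of the sketch, not an actual Flink API; Object stands in for MemorySegment.]

import java.util.concurrent.ThreadLocalRandom;

interface SegmentPool {
    Object requestSegment();       // returns null when the pool is exhausted
    boolean hasAvailableSegments();
}

final class SlotScopedPoolView {
    private final SegmentPool heapPool;
    private final SegmentPool offHeapPool;
    private long remainingHeapShare;     // slot's fair share of on-heap segments
    private long remainingOffHeapShare;  // slot's fair share of off-heap segments

    SlotScopedPoolView(SegmentPool heapPool, long heapShare,
                       SegmentPool offHeapPool, long offHeapShare) {
        this.heapPool = heapPool;
        this.offHeapPool = offHeapPool;
        this.remainingHeapShare = heapShare;
        this.remainingOffHeapShare = offHeapShare;
    }

    // Requests a segment from either pool, respecting the per-slot limits.
    Object requestSegment() {
        boolean tryHeapFirst = ThreadLocalRandom.current().nextBoolean();
        Object segment = tryHeapFirst ? fromHeap() : fromOffHeap();
        if (segment == null) {
            segment = tryHeapFirst ? fromOffHeap() : fromHeap();
        }
        return segment;
    }

    private Object fromHeap() {
        if (remainingHeapShare > 0 && heapPool.hasAvailableSegments()) {
            remainingHeapShare--;
            return heapPool.requestSegment();
        }
        return null;
    }

    private Object fromOffHeap() {
        if (remainingOffHeapShare > 0 && offHeapPool.hasAvailableSegments()) {
            remainingOffHeapShare--;
            return offHeapPool.requestSegment();
        }
        return null;
    }
}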
On Mon, Sep 2, 2019 at 1:35 PM Xintong Song <[hidden email]> wrote:

Thanks for your comments, Andrey.

- Regarding Task Off-Heap Memory, I think you're right that the user needs to make sure that direct memory and native memory together used by the user code (external libs) do not exceed the configured value. As far as I can think of, there is nothing we can do about it.

I addressed the rest of your comments in the wiki page [1]. Please take a look.

Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Sep 2, 2019 at 6:13 PM Andrey Zagrebin <[hidden email]> wrote:

EDIT: sorry for the confusion, I meant
taskmanager.memory.off-heap
instead of
setting taskmanager.memory.preallocate

On Mon, Sep 2, 2019 at 11:29 AM Andrey Zagrebin <[hidden email]> wrote:

Hi All,

@Xintong thanks a lot for driving the discussion.

I also reviewed the FLIP and it looks quite good to me. Here are some comments:

- One thing I wanted to discuss is the backwards compatibility with previous user setups. We could list which options we plan to deprecate. At first glance it looks possible to provide the same or similar behaviour for setups relying on the deprecated options. E.g. setting taskmanager.memory.preallocate to true could override the new taskmanager.memory.managed.offheap-fraction to 1, etc. At the moment the FLIP just states that in some cases it may require re-configuring the cluster when migrating from prior versions. My suggestion is that we try to keep it backwards compatible unless there is a good reason, like some major complication for the implementation.

Also a couple of smaller things:

- I suggest we remove TaskExecutorSpecifics from the FLIP and leave some general wording for now, like 'data structure to store' or 'utility classes'. When the classes are implemented, we put in the concrete class names. This way we can avoid confusion and stale documents.

- As I understand, if a user task uses native memory (not direct memory, but e.g. unsafe.allocate or memory from an external lib), there will be no explicit guard against exceeding 'task off-heap memory'. Then the user should still explicitly make sure that her/his direct buffer allocation plus any other memory usage does not exceed the value announced as 'task off-heap'. I guess there is not much that can be done about it except mentioning it in the docs, similar to controlling the heap state backend.

Thanks,
Andrey

On Mon, Sep 2, 2019 at 10:07 AM Yang Wang <[hidden email]> wrote:

I also agree that all the configuration should be calculated outside of the TaskManager.

So a full configuration should be generated before the TaskManager is started.

Overriding the calculated configurations through -D now seems better.

Best,

Yang

Xintong Song <[hidden email]> wrote on Mon, Sep 2, 2019 at 11:39 AM:

I just updated the FLIP wiki page [1], with the following changes:

- Network memory uses JVM direct memory, and is accounted for when setting the JVM max direct memory size parameter.
- Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
- Remove 'supporting memory reservation' from the scope of this FLIP.

@Till @Stephan, please take another look and see if there are any other concerns.

Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <[hidden email]> wrote:

Sorry for the late response.

- Regarding the `TaskExecutorSpecifics` naming, let's discuss the details in the PR.
- Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed before we have a general framework for overriding configurations with ENV variables.
- Regarding memory reservation, I double checked with Yu and he will take care of it.

Thank you~

Xintong Song
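[Editorial note: a minimal sketch of the `-Dkey=value` mechanism referred to above, not Flink's actual argument parsing. It shows how dynamic properties on the TaskExecutor command line could be folded into configuration overrides; the example keys in the comment are placeholders.]

import java.util.HashMap;
import java.util.Map;

final class DynamicPropertiesSketch {

    // Collects "-Dkey=value" arguments into configuration overrides, so startup
    // scripts can pass calculated memory sizes without rewriting flink-conf.yaml.
    static Map<String, String> parse(String[] args) {
        Map<String, String> overrides = new HashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                overrides.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        // Example invocation with placeholder keys:
        // -Dtaskmanager.memory.task.heap.size=1g -Dtaskmanager.memory.managed.size=2g
        parse(args).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}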
On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:

What I forgot to add is that we could tackle specifying the configuration fully in an incremental way, and that the full specification should be the desired end state.

On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:

I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step to rather validate existing values and calculate missing ones, these two proposals shouldn't even conflict (given determinism).

Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.

One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator, which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of an env variable vs. a dynamic configuration value specified via -D?

Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.

Cheers,
Till

On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:

I see. Under the assumption of strict determinism that should work.

The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.

On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:

My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.

On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:

When computing the values in the JVM process after it has started, how would you deal with values like max direct memory, metaspace size, native memory reservation (reduced heap size), etc.? All the values that are parameters to the JVM process and that need to be supplied at process startup?
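[Editorial note: purely as an illustration of the startup-time calculation discussed here; the names and the formula are assumptions, not the FLIP's final logic. A small utility invoked by the startup scripts could turn already-calculated sizes into the JVM launch parameters that cannot be changed after the process starts.]

final class JvmLaunchArgsSketch {

    // Derives the JVM parameters that must be fixed at process startup from
    // already-calculated memory sizes (all values in bytes).
    static String buildJvmArgs(long heapSize, long maxDirectMemory, long maxMetaspace) {
        return String.format(
            "-Xms%d -Xmx%d -XX:MaxDirectMemorySize=%d -XX:MaxMetaspaceSize=%d",
            heapSize, heapSize, maxDirectMemory, maxMetaspace);
    }

    public static void main(String[] args) {
        final long mb = 1L << 20;
        // Example sizes; a real utility would read them from the configuration.
        System.out.println(buildJvmArgs(1024 * mb, 256 * mb, 96 * mb));
    }
}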
On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification. I have some more comments:

- I would actually split the logic to compute the process memory requirements and the storing of the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.

- Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.

The reasons why I believe it is unnecessary are the following: For Yarn we already create a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we would not have been able to start the process in the first place.

- Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality in a follow-up FLIP (which Yu is working on, if I'm not mistaken).

Cheers,
Till

On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:

Just to add my 2 cents.

Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for all taskmanagers. A common flink-conf.yaml and different environment variables are enough. By reducing the distributed cached files, it could make launching a taskmanager faster.

Stephan gives a good suggestion that we could move the logic into the "GlobalConfiguration.loadConfig()" method. Maybe the client could also benefit from this. Different users would not have to export FLINK_CONF_DIR to update a few config options.

Best,
Yang
On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

I just updated the FLIP document on the wiki [1], with the following changes.

- Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
- Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.
Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

@Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.

@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643

@Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB. That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).
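To make the behavior referenced above concrete, here is a minimal probe (a sketch for experimentation, not Flink code) that allocates direct buffers until the -XX:MaxDirectMemorySize limit is hit. The allocation path goes through java.nio.Bits.reserveMemory(), which tries to reclaim unreferenced direct buffers (and may call System.gc()) before failing:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Probe sketch: run with e.g. "java -XX:MaxDirectMemorySize=64m DirectMemoryProbe".
    public class DirectMemoryProbe {
        public static void main(String[] args) {
            List<ByteBuffer> buffers = new ArrayList<>();
            try {
                while (true) {
                    // Each allocation goes through java.nio.Bits.reserveMemory(),
                    // which tries to free unreferenced direct buffers (and may call
                    // System.gc()) before giving up with an OutOfMemoryError.
                    buffers.add(ByteBuffer.allocateDirect(1024 * 1024)); // 1 MB each
                }
            } catch (OutOfMemoryError e) {
                System.out.println("Direct memory exhausted after roughly "
                        + buffers.size() + " MB: " + e.getMessage());
            }
        }
    }

Running it with different -XX:MaxDirectMemorySize values is a simple way to observe whether regular GC keeps up with direct allocations, which is the open question for options (2) and (1.2).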
On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:

Thanks for the inputs, Jingsong.

Let me try to summarize your points. Please correct me if I'm wrong.

- Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
- The JVM does not wait for GC when allocating direct memory buffers. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory more slowly than the direct memory is allocated.

Am I understanding this correctly?

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email].invalid> wrote:

Hi Stephan:

About option 2:

If additional threads are not cleanly shut down before we exit the task: in the current case of memory reuse, the task has already freed the memory it uses. If this memory is used by other tasks while asynchronous threads of the exited task may still be writing to it, there will be concurrency problems, which can even lead to errors in the user's computed results.

So I think this is a serious and intolerable bug. No matter which option we choose, it should be avoided.

About direct memory cleaned by GC: I don't think it is a good idea. I've encountered many situations where GC kicks in too late, causing DirectMemory OOM. Releasing and allocating DirectMemory depends on the type of user job, which is often beyond our control.

Best,
Jingsong Lee
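The hazard Jingsong describes can be sketched with a toy example (illustrative only, not Flink's MemoryManager API): a task hands its buffer back to a shared pool while one of its background threads is still writing, and the next user of that buffer sees corrupted data.

    import java.util.concurrent.CompletableFuture;

    // Toy illustration of the use-after-release hazard (not Flink code).
    public class UseAfterReleaseHazard {
        // Stands in for a pooled memory segment shared between consumers.
        static final byte[] SEGMENT = new byte[1024];

        public static void main(String[] args) {
            // Task A starts an asynchronous "spill" thread writing into the segment...
            CompletableFuture<Void> asyncSpill = CompletableFuture.runAsync(() -> {
                for (int i = 0; i < SEGMENT.length; i++) {
                    SEGMENT[i] = (byte) 0xA;
                }
            });

            // ...and releases the segment without waiting for that thread.
            // Task B immediately reuses the same memory.
            for (int i = 0; i < SEGMENT.length; i++) {
                SEGMENT[i] = (byte) 0xB;
            }

            asyncSpill.join();
            // Depending on interleaving, task B's data may now be partially
            // overwritten by task A's spill thread -- silently wrong results.
            System.out.println("first byte seen by task B: " + SEGMENT[0]);
        }
    }

With manually allocated and freed native memory (option 2), the same interleaving could dereference already-freed memory and crash the process instead, which is the concern raised in the message quoted below.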
------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Send Time: Monday, Aug 19, 2019, 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

My main concern with option 2 (manually release memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.

The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these cleanly shut down before we can exit the task. I am not sure that we have that guaranteed already; that's why option 1.1 seemed simpler to me.
On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarizing it this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep it segfault safe for the memory manager, as long as we always de-allocate the memory segment when it is released from the memory consumers. Only if the memory consumer continues using the buffer of a memory segment after releasing it would we run into trouble, and in that case we do want the job to fail so we detect the memory leak early.

For option 1.2, I don't think this is a good idea. Not only because the assumption (regular GC is enough to clean direct buffers) may not always be true, but also because it makes it harder to find problems in cases of memory overuse. E.g., the user configures some direct memory for the user libraries. If the libraries actually use more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory.
In that case, if it doesn't reach the JVM default max direct memory limit, we cannot get a direct memory OOM, and it becomes super hard to understand which part of the configuration needs to be updated.

For option 1.1, it has a similar problem to 1.2 if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~

Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
  - The "-XX:MaxDirectMemorySize" (option 1.1) is one way to do this.
  - Another way could be to have dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead.
So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

  - Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

  - Option 2. It is actually quite simple to implement; we could try how segfault safe we are at the moment.
  - Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan
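For readers less familiar with what option 2 means in practice, the following is a minimal sketch (assuming sun.misc.Unsafe is accessible; this is not Flink's MemorySegment code) of manually allocated and released segment memory. Memory obtained this way is native memory: it is not counted against -XX:MaxDirectMemorySize and is never reclaimed by GC, so it must be freed explicitly, and freeing it while another thread still uses the address is exactly what can produce a segfault.

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    // Sketch of option 2: manual allocation/release of segment memory (not Flink code).
    public class ManualSegmentAllocation {
        public static void main(String[] args) throws Exception {
            Unsafe unsafe = getUnsafe();

            long size = 32L * 1024 * 1024;              // a 32 MB "segment"
            long address = unsafe.allocateMemory(size); // native allocation, invisible to GC
            unsafe.setMemory(address, size, (byte) 0);  // zero the segment

            // ... the segment is handed to a consumer and used ...

            // Must be released explicitly in the cleanup phase of the owning task;
            // releasing it while a spill/sort thread still touches the address
            // is what can crash the JVM with a segfault.
            unsafe.freeMemory(address);
        }

        private static Unsafe getUnsafe() throws Exception {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        }
    }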
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.
Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead can potentially exceed 200MB, then:

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead).
Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.

- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
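The arithmetic behind this example can be written down explicitly; the numbers below are the illustrative ones from the scenario above, not defaults of any Flink version.

    // Worked example of the budget split discussed above (illustrative numbers only).
    public class BudgetExample {
        public static void main(String[] args) {
            long totalProcessMb = 1024;               // total process / container memory
            long directBudgetMb = 200;                // task off-heap + JVM overhead
            long otherPoolsMb = totalProcessMb - directBudgetMb; // heap, metaspace, managed, network

            // Alternative 2: cap direct memory exactly at its budget.
            System.out.println("alt 2: -XX:MaxDirectMemorySize=" + directBudgetMb
                    + "m, other pools=" + otherPoolsMb + "m");

            // Alternative 3: effectively uncapped direct memory; the other pools keep
            // their sizes, but nothing stops direct usage from overdrawing the container.
            System.out.println("alt 3: -XX:MaxDirectMemorySize=1t, other pools="
                    + otherPoolsMb + "m");

            // If the actual direct usage turns out to be 250 MB:
            long actualDirectMb = 250;
            System.out.println("alt 2 after re-budgeting: other pools shrink to "
                    + (totalProcessMb - actualDirectMb) + "m");
            System.out.println("alt 3: container budget overdrawn by up to "
                    + (actualDirectMb - directBudgetMb) + "m");
        }
    }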
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 w.r.t. memory underutilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an underutilization of memory compared to alternative 3?
If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory underutilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations on the client side to avoid submitting successfully and then failing in the Flink master.

Best,
Yang
On Tue, Aug 13, 2019 at 22:07, Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking memory overuse at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure.
For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is a risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good.
Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested in why Yang Wang thinks leaving it open would be best. My concern is that we would end up in a similar situation as we are now with the RocksDBStateBackend: if the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio.
So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate().
> > >> > >> >> > > > > > > > > > > > > > > > > > > > > > However, there are > > also > > >> > some > > >> > >> >> down > > >> > >> >> > > sides > > >> > >> >> > > > > of > > >> > >> >> > > > > > > > doing > > >> > >> >> > > > > > > > > > > this. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > - One thing I can > > >> think > > >> > >> of is > > >> > >> >> > that > > >> > >> >> > > > if > > >> > >> >> > > > > a > > >> > >> >> > > > > > > task > > >> > >> >> > > > > > > > > > > > executor > > >> > >> >> > > > > > > > > > > > > > > > > container > > >> > >> >> > > > > > > > > > > > > > > > > > is > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > killed due to > > >> overusing > > >> > >> >> memory, > > >> > >> >> > it > > >> > >> >> > > > > could > > >> > >> >> > > > > > > be > > >> > >> >> > > > > > > > > hard > > >> > >> >> > > > > > > > > > > for > > >> > >> >> > > > > > > > > > > > > use > > >> > >> >> > > > > > > > > > > > > > > to > > >> > >> >> > > > > > > > > > > > > > > > > know > > >> > >> >> > > > > > > > > > > > > > > > > > > > which > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > part > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > of the memory is > > >> > overused. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > - Another down > side > > >> is > > >> > >> that > > >> > >> >> the > > >> > >> >> > > JVM > > >> > >> >> > > > > > never > > >> > >> >> > > > > > > > > > trigger > > >> > >> >> > > > > > > > > > > GC > > >> > >> >> > > > > > > > > > > > > due > > >> > >> >> > > > > > > > > > > > > > > to > > >> > >> >> > > > > > > > > > > > > > > > > > > reaching > > >> > >> >> > > > > > > > > > > > > > > > > > > > > max > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > direct memory > > limit, > > >> > >> because > > >> > >> >> the > > >> > >> >> > > > limit > > >> > >> >> > > > > > is > > >> > >> >> > > > > > > > too > > >> > >> >> > > > > > > > > > high > > >> > >> >> > > > > > > > > > > > to > > >> > >> >> > > > > > > > > > > > > be > > >> > >> >> > > > > > > > > > > > > > > > > > reached. > > >> > >> >> > > > > > > > > > > > > > > > > > > > That > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > means we kind of > > >> relay > > >> > on > > >> > >> >> heap > > >> > >> >> > > > memory > > >> > >> >> > > > > to > > >> > >> >> > > > > > > > > trigger > > >> > >> >> > > > > > > > > > > GC > > >> > >> >> > > > > > > > > > > > > and > > >> > >> >> > > > > > > > > > > > > > > > > release > > >> > >> >> > > > > > > > > > > > > > > > > > > > direct > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory. That > could > > >> be a > > >> > >> >> problem > > >> > >> >> > in > > >> > >> >> > > > > cases > > >> > >> >> > > > > > > > where > > >> > >> >> > > > > > > > > > we > > >> > >> >> > > > > > > > > > > > have > > >> > >> >> > > > > > > > > > > > > > > more > > >> > >> >> > > > > > > > > > > > > > > > > > direct > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > usage but not > > enough > > >> > heap > > >> > >> >> > activity > > >> > >> >> > > > to > > >> > >> >> > > > > > > > trigger > > >> > >> >> > > > > > > > > > the > > >> > >> >> > > > > > > > > > > > GC. 
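To make the GC point above concrete, here is a minimal, self-contained Java sketch (illustrative only, not from the FLIP; the class name and sizes are made up) of how direct buffers are counted against -XX:MaxDirectMemorySize, and why a very high limit also means this allocation-time GC trigger effectively never fires:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative demo. Run with a small limit, e.g.: java -XX:MaxDirectMemorySize=64m DirectLimitDemo
public class DirectLimitDemo {

    public static void main(String[] args) {
        List<ByteBuffer> buffers = new ArrayList<>();
        try {
            while (true) {
                // Each allocation is counted against -XX:MaxDirectMemorySize. When the
                // limit is reached, the JVM first tries to reclaim unreferenced buffers
                // (which may involve triggering a GC) and only then fails.
                buffers.add(ByteBuffer.allocateDirect(8 * 1024 * 1024));
            }
        } catch (OutOfMemoryError e) {
            // This is the failure mode a too-small limit (alternative 2) risks. With a
            // huge limit (alternative 3) this point is never reached, so the GC trigger
            // tied to the limit never fires either.
            System.out.println("Hit the direct memory limit: " + e.getMessage());
        }
        // Memory allocated via sun.misc.Unsafe.allocateMemory(), as proposed for managed
        // and network memory, is not counted against this limit at all; only the
        // container / process-level limit bounds it.
    }
}
```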
Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error. I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client-side checking, because for standalone clusters the TaskManagers on different machines may have different configurations, and the client does not see that. What do you think?

Thank you~

Xintong Song
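As a sketch of the kind of client-side check discussed here (the class and method names are hypothetical, not part of the FLIP or Flink's API), the validation could be as simple as comparing the sum of the explicitly specified pools against the total process memory:

```java
// Hypothetical sketch of a memory configuration sanity check.
public final class MemoryConfigValidator {

    /**
     * Throws if the explicitly configured fine-grained pools (task heap / off-heap,
     * managed memory, network memory, metaspace, JVM overhead, ...) cannot fit into
     * the configured total process memory.
     */
    public static void checkFitsIntoTotalProcessMemory(long totalProcessBytes, long... explicitPoolBytes) {
        long sum = 0L;
        for (long poolBytes : explicitPoolBytes) {
            sum += poolBytes;
        }
        if (sum > totalProcessBytes) {
            throw new IllegalArgumentException(
                    "Explicitly configured memory pools (" + sum + " bytes) exceed the total process memory ("
                            + totalProcessBytes + " bytes). Please adjust the configuration.");
        }
    }

    private MemoryConfigValidator() {}
}
```

A check like this can run in the client for Yarn / K8s deployments, while standalone TaskManagers would additionally have to repeat it locally on startup, since each machine may load a different configuration file.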
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory
We do not differentiate user direct memory and native memory. They are all included in task off-heap memory, right? So I don't think we could set -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation
If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

On Wed, Aug 7, 2019 at 10:14 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve the TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration:

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve these problems can be summarized as follows:

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned, accounting for individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1]. (Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song
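To illustrate the distinction drawn here, the following hedged sketch uses the commonly seen reflection access to sun.misc.Unsafe (Flink's actual allocation utilities are not shown); memory allocated this way is plain native memory and is invisible to the -XX:MaxDirectMemorySize accounting:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch: native allocation outside the JVM's direct memory accounting.
public final class UnsafeNativeAllocation {

    public static void main(String[] args) throws Exception {
        // Obtain the Unsafe instance via reflection (not a supported public API).
        Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
        theUnsafe.setAccessible(true);
        Unsafe unsafe = (Unsafe) theUnsafe.get(null);

        // Not counted against -XX:MaxDirectMemorySize, so managed / network memory
        // allocated this way only needs to fit into the total process (container) budget.
        long address = unsafe.allocateMemory(64L * 1024 * 1024);
        try {
            unsafe.putLong(address, 42L);
            System.out.println("wrote " + unsafe.getLong(address) + " to native memory");
        } finally {
            // Native memory is not garbage collected; it must be freed explicitly.
            unsafe.freeMemory(address);
        }
    }
}
```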
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and which hence won't change the overall memory consumption.
Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till. Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead can potentially exceed 200MB, then:

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
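The arithmetic of the example can be restated in a few lines of illustrative Java (the values are hardcoded from the email above; this is not the FLIP's actual calculation code):

```java
// Illustrative restatement of the 1GB example under alternatives 2 and 3.
public final class DirectLimitArithmetic {

    public static void main(String[] args) {
        long totalProcessMb = 1024;                   // Total Process Memory: 1GB
        long jvmDirectMb = 200;                       // Task Off-Heap Memory + JVM Overhead
        long otherMb = totalProcessMb - jvmDirectMb;  // heap, metaspace, managed, network = 800MB

        // Alternative 2: the limit equals the direct budget.
        System.out.printf("alt 2: -XX:MaxDirectMemorySize=%dm, other pools=%dm%n", jvmDirectMb, otherMb);

        // If real direct usage exceeds 200MB, the user must raise the budget, which
        // shrinks everything else because the total process memory stays fixed.
        long raisedDirectMb = 250;
        System.out.printf("alt 2 after raising: direct=%dm, other pools=%dm%n",
                raisedDirectMb, totalProcessMb - raisedDirectMb);

        // Alternative 3: the limit (e.g. 1TB) is far above any realistic usage, so a small
        // overshoot of the 200MB budget needs no configuration change, but it is only
        // caught by the container's process-level limit.
        System.out.printf("alt 3: -XX:MaxDirectMemorySize=1t, other pools stay at %dm%n", otherMb);
    }
}
```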
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

Native and Direct Memory
My point is setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

Memory Calculation
I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 10:07 PM Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.
About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking overuse of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
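For context, here is a minimal sketch of what such a reservation-style interaction could look like; the interface and method names are illustrative assumptions, not Flink's actual MemoryManager API:

```java
/**
 * Illustrative sketch of a reservation-style managed memory interaction:
 * streaming consumers such as a state backend reserve a budget instead of
 * being handed pre-allocated MemorySegments.
 */
public interface ManagedMemoryReservations {

    /**
     * Reserves {@code sizeBytes} of managed memory for {@code owner}.
     * Expected to fail if the remaining budget of the slot / task executor is too small.
     */
    void reserveMemory(Object owner, long sizeBytes);

    /** Releases a previously made reservation so the budget becomes available again. */
    void releaseMemory(Object owner, long sizeBytes);
}
```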
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would end up in a similar situation as we are in now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some down sides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
> > >> > >> >> > > > > > > > > > > > > > > > > > > > > > - Another down > side > > >> is > > >> > >> that > > >> > >> >> the > > >> > >> >> > > JVM > > >> > >> >> > > > > > never > > >> > >> >> > > > > > > > > > trigger > > >> > >> >> > > > > > > > > > > GC > > >> > >> >> > > > > > > > > > > > > due > > >> > >> >> > > > > > > > > > > > > > > to > > >> > >> >> > > > > > > > > > > > > > > > > > > reaching > > >> > >> >> > > > > > > > > > > > > > > > > > > > > max > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > direct memory > > limit, > > >> > >> because > > >> > >> >> the > > >> > >> >> > > > limit > > >> > >> >> > > > > > is > > >> > >> >> > > > > > > > too > > >> > >> >> > > > > > > > > > high > > >> > >> >> > > > > > > > > > > > to > > >> > >> >> > > > > > > > > > > > > be > > >> > >> >> > > > > > > > > > > > > > > > > > reached. > > >> > >> >> > > > > > > > > > > > > > > > > > > > That > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > means we kind of > > >> relay > > >> > on > > >> > >> >> heap > > >> > >> >> > > > memory > > >> > >> >> > > > > to > > >> > >> >> > > > > > > > > trigger > > >> > >> >> > > > > > > > > > > GC > > >> > >> >> > > > > > > > > > > > > and > > >> > >> >> > > > > > > > > > > > > > > > > release > > >> > >> >> > > > > > > > > > > > > > > > > > > > direct > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory. That > could > > >> be a > > >> > >> >> problem > > >> > >> >> > in > > >> > >> >> > > > > cases > > >> > >> >> > > > > > > > where > > >> > >> >> > > > > > > > > > we > > >> > >> >> > > > > > > > > > > > have > > >> > >> >> > > > > > > > > > > > > > > more > > >> > >> >> > > > > > > > > > > > > > > > > > direct > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > usage but not > > enough > > >> > heap > > >> > >> >> > activity > > >> > >> >> > > > to > > >> > >> >> > > > > > > > trigger > > >> > >> >> > > > > > > > > > the > > >> > >> >> > > > > > > > > > > > GC. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Maybe you can share > > your > > >> > >> reasons > > >> > >> >> > for > > >> > >> >> > > > > > > preferring > > >> > >> >> > > > > > > > > > > > setting a > > >> > >> >> > > > > > > > > > > > > > > very > > >> > >> >> > > > > > > > > > > > > > > > > > large > > >> > >> >> > > > > > > > > > > > > > > > > > > > > value, > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > if there are > anything > > >> else > > >> > I > > >> > >> >> > > > overlooked. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > If there is any > > conflict > > >> > >> between > > >> > >> >> > > > multiple > > >> > >> >> > > > > > > > > > > configuration > > >> > >> >> > > > > > > > > > > > > > that > > >> > >> >> > > > > > > > > > > > > > > > user > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > explicitly > specified, > > I > > >> > >> think we > > >> > >> >> > > should > > >> > >> >> > > > > > throw > > >> > >> >> > > > > > > > an > > >> > >> >> > > > > > > > > > > error. 
> > >> > >> >> > > > > > > > > > > > > > > > > > > > > > I think doing > checking > > >> on > > >> > the > > >> > >> >> > client > > >> > >> >> > > > side > > >> > >> >> > > > > > is > > >> > >> >> > > > > > > a > > >> > >> >> > > > > > > > > good > > >> > >> >> > > > > > > > > > > > idea, > > >> > >> >> > > > > > > > > > > > > > so > > >> > >> >> > > > > > > > > > > > > > > > that > > >> > >> >> > > > > > > > > > > > > > > > > > on > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Yarn / > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > K8s we can discover > > the > > >> > >> problem > > >> > >> >> > > before > > >> > >> >> > > > > > > > submitting > > >> > >> >> > > > > > > > > > the > > >> > >> >> > > > > > > > > > > > > Flink > > >> > >> >> > > > > > > > > > > > > > > > > > cluster, > > >> > >> >> > > > > > > > > > > > > > > > > > > > > which > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > is always a good > > thing. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > But we can not only > > >> rely on > > >> > >> the > > >> > >> >> > > client > > >> > >> >> > > > > side > > >> > >> >> > > > > > > > > > checking, > > >> > >> >> > > > > > > > > > > > > > because > > >> > >> >> > > > > > > > > > > > > > > > for > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > standalone cluster > > >> > >> TaskManagers > > >> > >> >> on > > >> > >> >> > > > > > different > > >> > >> >> > > > > > > > > > machines > > >> > >> >> > > > > > > > > > > > may > > >> > >> >> > > > > > > > > > > > > > > have > > >> > >> >> > > > > > > > > > > > > > > > > > > > different > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > configurations and > the > > >> > client > > >> > >> >> does > > >> > >> >> > > see > > >> > >> >> > > > > > that. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > What do you think? > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 > at > > >> 5:09 > > >> > >> PM > > >> > >> >> Yang > > >> > >> >> > > > Wang > > >> > >> >> > > > > < > > >> > >> >> > > > > > > > > > > > > > > > [hidden email]> > > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for your > > >> detailed > > >> > >> >> > proposal. 
> > >> > >> >> > > > > After > > >> > >> >> > > > > > > all > > >> > >> >> > > > > > > > > the > > >> > >> >> > > > > > > > > > > > memory > > >> > >> >> > > > > > > > > > > > > > > > > > > configuration > > >> > >> >> > > > > > > > > > > > > > > > > > > > > are > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it > will > > be > > >> > more > > >> > >> >> > > powerful > > >> > >> >> > > > to > > >> > >> >> > > > > > > > control > > >> > >> >> > > > > > > > > > the > > >> > >> >> > > > > > > > > > > > > flink > > >> > >> >> > > > > > > > > > > > > > > > > memory > > >> > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > just have few > > >> questions > > >> > >> about > > >> > >> >> it. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native and > > Direct > > >> > >> Memory > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not > > >> differentiate > > >> > >> user > > >> > >> >> > direct > > >> > >> >> > > > > > memory > > >> > >> >> > > > > > > > and > > >> > >> >> > > > > > > > > > > native > > >> > >> >> > > > > > > > > > > > > > > memory. > > >> > >> >> > > > > > > > > > > > > > > > > > They > > >> > >> >> > > > > > > > > > > > > > > > > > > > are > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > all > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > included in task > > >> off-heap > > >> > >> >> memory. > > >> > >> >> > > > > Right? > > >> > >> >> > > > > > > So i > > >> > >> >> > > > > > > > > > don’t > > >> > >> >> > > > > > > > > > > > > think > > >> > >> >> > > > > > > > > > > > > > > we > > >> > >> >> > > > > > > > > > > > > > > > > > could > > >> > >> >> > > > > > > > > > > > > > > > > > > > not > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > set > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > the > > >> > -XX:MaxDirectMemorySize > > >> > >> >> > > > properly. I > > >> > >> >> > > > > > > > prefer > > >> > >> >> > > > > > > > > > > > leaving > > >> > >> >> > > > > > > > > > > > > > it a > > >> > >> >> > > > > > > > > > > > > > > > > very > > >> > >> >> > > > > > > > > > > > > > > > > > > > large > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > value. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory > > >> Calculation > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum of and > > >> > >> fine-grained > > >> > >> >> > > > > > > memory(network > > >> > >> >> > > > > > > > > > > memory, > > >> > >> >> > > > > > > > > > > > > > > managed > > >> > >> >> > > > > > > > > > > > > > > > > > > memory, > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > etc.) 
> > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > is larger than > total > > >> > >> process > > >> > >> >> > > memory, > > >> > >> >> > > > > how > > >> > >> >> > > > > > do > > >> > >> >> > > > > > > > we > > >> > >> >> > > > > > > > > > deal > > >> > >> >> > > > > > > > > > > > > with > > >> > >> >> > > > > > > > > > > > > > > this > > >> > >> >> > > > > > > > > > > > > > > > > > > > > situation? > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Do > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > we need to check > the > > >> > memory > > >> > >> >> > > > > configuration > > >> > >> >> > > > > > > in > > >> > >> >> > > > > > > > > > > client? > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > > >> > >> >> > > [hidden email]> > > >> > >> >> > > > > > > > > > 于2019年8月7日周三 > > >> > >> >> > > > > > > > > > > > > > > 下午10:14写道: > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would like to > > >> start > > >> > a > > >> > >> >> > > discussion > > >> > >> >> > > > > > > thread > > >> > >> >> > > > > > > > on > > >> > >> >> > > > > > > > > > > > > "FLIP-49: > > >> > >> >> > > > > > > > > > > > > > > > > Unified > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Memory > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Configuration > for > > >> > >> >> > > > TaskExecutors"[1], > > >> > >> >> > > > > > > where > > >> > >> >> > > > > > > > we > > >> > >> >> > > > > > > > > > > > > describe > > >> > >> >> > > > > > > > > > > > > > > how > > >> > >> >> > > > > > > > > > > > > > > > to > > >> > >> >> > > > > > > > > > > > > > > > > > > > improve > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > memory > > >> > >> >> > > configurations. > > >> > >> >> > > > > The > > >> > >> >> > > > > > > > FLIP > > >> > >> >> > > > > > > > > > > > document > > >> > >> >> > > > > > > > > > > > > > is > > >> > >> >> > > > > > > > > > > > > > > > > mostly > > >> > >> >> > > > > > > > > > > > > > > > > > > > based > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > on > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > an > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > early design > > "Memory > > >> > >> >> Management > > >> > >> >> > > and > > >> > >> >> > > > > > > > > > Configuration > > >> > >> >> > > > > > > > > > > > > > > > > Reloaded"[2] > > >> > >> >> > > > > > > > > > > > > > > > > > by > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > with updates > from > > >> > >> follow-up > > >> > >> >> > > > > discussions > > >> > >> >> > > > > > > > both > > >> > >> >> > > > > > > > > > > online > > >> > >> >> > > > > > > > > > > > > and > > >> > >> >> > > > > > > > > > > > > > > > > > offline. 
> > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP > > addresses > > >> > >> several > > >> > >> >> > > > > > shortcomings > > >> > >> >> > > > > > > of > > >> > >> >> > > > > > > > > > > current > > >> > >> >> > > > > > > > > > > > > > > (Flink > > >> > >> >> > > > > > > > > > > > > > > > > 1.9) > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > memory > > >> > >> >> > > configuration. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Different > > >> > >> configuration > > >> > >> >> > for > > >> > >> >> > > > > > > Streaming > > >> > >> >> > > > > > > > > and > > >> > >> >> > > > > > > > > > > > Batch. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complex and > > >> > >> difficult > > >> > >> >> > > > > > configuration > > >> > >> >> > > > > > > of > > >> > >> >> > > > > > > > > > > RocksDB > > >> > >> >> > > > > > > > > > > > > in > > >> > >> >> > > > > > > > > > > > > > > > > > Streaming. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > Complicated, > > >> > >> uncertain > > >> > >> >> and > > >> > >> >> > > > hard > > >> > >> >> > > > > to > > >> > >> >> > > > > > > > > > > understand. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key changes to > > solve > > >> > the > > >> > >> >> > problems > > >> > >> >> > > > can > > >> > >> >> > > > > > be > > >> > >> >> > > > > > > > > > > summarized > > >> > >> >> > > > > > > > > > > > > as > > >> > >> >> > > > > > > > > > > > > > > > > follows. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Extend > memory > > >> > >> manager > > >> > >> >> to > > >> > >> >> > > also > > >> > >> >> > > > > > > account > > >> > >> >> > > > > > > > > for > > >> > >> >> > > > > > > > > > > > memory > > >> > >> >> > > > > > > > > > > > > > > usage > > >> > >> >> > > > > > > > > > > > > > > > > by > > >> > >> >> > > > > > > > > > > > > > > > > > > > state > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > backends. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Modify how > > >> > >> TaskExecutor > > >> > >> >> > > memory > > >> > >> >> > > > > is > > >> > >> >> > > > > > > > > > > partitioned > > >> > >> >> > > > > > > > > > > > > > > > accounted > > >> > >> >> > > > > > > > > > > > > > > > > > > > > individual > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > memory > > >> reservations > > >> > >> and > > >> > >> >> > pools. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Simplify > > memory > > >> > >> >> > > configuration > > >> > >> >> > > > > > > options > > >> > >> >> > > > > > > > > and > > >> > >> >> > > > > > > > > > > > > > > calculations > > >> > >> >> > > > > > > > > > > > > > > > > > > logics. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please find more > > >> > details > > >> > >> in > > >> > >> >> the > > >> > >> >> > > > FLIP > > >> > >> >> > > > > > wiki > > >> > >> >> > > > > > > > > > > document > > >> > >> >> > > > > > > > > > > > > [1]. 
> > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please note > that > > >> the > > >> > >> early > > >> > >> >> > > design > > >> > >> >> > > > > doc > > >> > >> >> > > > > > > [2] > > >> > >> >> > > > > > > > is > > >> > >> >> > > > > > > > > > out > > >> > >> >> > > > > > > > > > > > of > > >> > >> >> > > > > > > > > > > > > > > sync, > > >> > >> >> > > > > > > > > > > > > > > > > and > > >> > >> >> > > > > > > > > > > > > > > > > > it > > >> > >> >> > > > > > > > > > > > > > > > > > > > is > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > appreciated to > > have > > >> the > > >> > >> >> > > discussion > > >> > >> >> > > > in > > >> > >> >> > > > > > > this > > >> > >> >> > > > > > > > > > > mailing > > >> > >> >> > > > > > > > > > > > > list > > >> > >> >> > > > > > > > > > > > > > > > > > thread.) > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking forward > to > > >> your > > >> > >> >> > > feedbacks. > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> >> > > >> > >> > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > 
> > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> >> > > >> > >> > > >> > > > >> > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> >> > > >> > >> > > > >> > >> > > >> > > > > >> > > > >> > > > > > > |
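To make the calculation discussed above concrete, here is a minimal sketch of how a startup utility could derive the JVM flags from the fine-grained sizes and reject a configuration whose parts exceed the total process memory. The class name, parameter names and the exact formula are illustrative assumptions, not the FLIP's actual utility or option names.

    import java.util.Arrays;
    import java.util.List;

    // Illustrative only: sizes in bytes; names do not match real Flink options.
    public class MemoryFlagsSketch {

        public static List<String> jvmFlags(long heap, long networkMem, long managedOffHeap,
                                            long taskOffHeap, long metaspace, long totalProcess) {
            long sum = heap + networkMem + managedOffHeap + taskOffHeap + metaspace;
            if (sum > totalProcess) {
                // Fail fast instead of letting the container be killed later.
                throw new IllegalArgumentException("Configured components (" + sum
                        + " bytes) exceed total process memory (" + totalProcess + " bytes)");
            }
            // One possible choice: count network + task off-heap (+ managed, if it stays on direct
            // memory) against -XX:MaxDirectMemorySize instead of leaving it unbounded.
            long maxDirect = networkMem + taskOffHeap + managedOffHeap;
            return Arrays.asList(
                "-Xmx" + heap,
                "-Xms" + heap,
                "-XX:MaxMetaspaceSize=" + metaspace,
                "-XX:MaxDirectMemorySize=" + maxDirect);
        }

        public static void main(String[] args) {
            long mb = 1024L * 1024L;
            System.out.println(jvmFlags(1024 * mb, 128 * mb, 512 * mb, 64 * mb, 96 * mb, 2048 * mb));
        }
    }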
Hi Andrey,
Thanks for bringing this up. If I understand correctly, this issue only occurs when the cluster is configured with both on-heap and off-heap memory. There should be no regression for clusters configured in the old way (either all on-heap or all off-heap). I also agree that it would be good if DataSet API jobs could use both memory types.

The only open question I can see is from which pool (heap / off-heap) we should allocate memory for DataSet API operators. Do we always prioritize one pool over the other? Or do we always prioritize the pool with more available memory left?

Thank you~
Xintong Song

On Tue, Sep 10, 2019 at 8:15 PM Andrey Zagrebin <[hidden email]> wrote:

> Hi All,
>
> While looking more into the implementation details of Step 4, we realized during some offline discussions with @Till that there can be a performance degradation for the batch DataSet API if we simply continue to pull memory from the pool according to the legacy option taskmanager.memory.off-heap.
>
> The reason is that if the cluster is newly configured to statically split heap/off-heap (not, as previously, either all heap or all off-heap), then batch DataSet API jobs will be able to use only one type of memory, although it does not really matter where the memory segments come from and potentially batch jobs can use both. Also, the DataSet API currently does not produce absolute resource requirements, and its batch jobs will always get a default share of TM resources.
>
> The suggestion is that we let the batch tasks of the DataSet API pull from both pools according to their fair slot share of each memory type. For that we can have a special wrapping view of both pools which pulls segments (possibly randomly) according to the slot limits. The view can wrap the TM-level memory pools and be given to the Task.
>
> Best,
> Andrey
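A minimal sketch of the wrapping view Andrey describes, using hypothetical pool interfaces rather than Flink's actual MemoryManager API. It also illustrates one possible answer to the prioritization question above: prefer whichever pool has more of the slot's share left, and fall back to the other.

    import java.nio.ByteBuffer;
    import java.util.Optional;

    // Hypothetical sketch of the "wrapping view" idea; not Flink's MemoryManager API.
    interface Pool {
        Optional<ByteBuffer> request(); // one fixed-size segment, empty if exhausted
    }

    final class DualPoolView implements Pool {
        private final Pool heapPool;
        private final Pool offHeapPool;
        private int heapLeft;     // this slot's fair share of heap segments
        private int offHeapLeft;  // this slot's fair share of off-heap segments

        DualPoolView(Pool heapPool, Pool offHeapPool, int heapQuota, int offHeapQuota) {
            this.heapPool = heapPool;
            this.offHeapPool = offHeapPool;
            this.heapLeft = heapQuota;
            this.offHeapLeft = offHeapQuota;
        }

        @Override
        public Optional<ByteBuffer> request() {
            // Prefer whichever pool has more of this slot's share left; fall back to the other.
            if (heapLeft >= offHeapLeft && heapLeft > 0) {
                Optional<ByteBuffer> seg = heapPool.request();
                if (seg.isPresent()) { heapLeft--; return seg; }
            }
            if (offHeapLeft > 0) {
                Optional<ByteBuffer> seg = offHeapPool.request();
                if (seg.isPresent()) { offHeapLeft--; return seg; }
            }
            return Optional.empty();
        }
    }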
> On Mon, Sep 2, 2019 at 1:35 PM Xintong Song <[hidden email]> wrote:
>
> > Thanks for your comments, Andrey.
> >
> > - Regarding Task Off-Heap Memory, I think you're right that the user needs to make sure that the direct memory and native memory together used by the user code (external libs) do not exceed the configured value. As far as I can think of, there is nothing we can do about it.
> >
> > I addressed the rest of your comments in the wiki page [1]. Please take a look.
> >
> > Thank you~
> > Xintong Song
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> >
> > On Mon, Sep 2, 2019 at 6:13 PM Andrey Zagrebin <[hidden email]> wrote:
> >
> > EDIT: sorry for the confusion, I meant taskmanager.memory.off-heap instead of taskmanager.memory.preallocate.
> >
> > On Mon, Sep 2, 2019 at 11:29 AM Andrey Zagrebin <[hidden email]> wrote:
> >
> > Hi All,
> >
> > @Xintong thanks a lot for driving the discussion.
> >
> > I also reviewed the FLIP and it looks quite good to me. Here are some comments:
> >
> > - One thing I wanted to discuss is the backwards compatibility with previous user setups. We could list which options we plan to deprecate. At first glance it looks possible to provide the same or similar behaviour for setups relying on the deprecated options, e.g. setting taskmanager.memory.preallocate to true could override the new taskmanager.memory.managed.offheap-fraction to 1, etc. At the moment the FLIP just states that in some cases it may require re-configuring the cluster when migrating from prior versions. My suggestion is that we try to keep it backwards compatible unless there is a good reason, like some major complication for the implementation.
> >
> > Also a couple of smaller things:
> >
> > - I suggest we remove TaskExecutorSpecifics from the FLIP and leave some general wording for now, like 'data structure to store' or 'utility classes'. When the classes are implemented, we put in the concrete class names. This way we can avoid confusion and stale documents.
> >
> > - As I understand, if a user task uses native memory (not direct memory, but e.g. unsafe.allocate or an external lib), there will be no explicit guard against exceeding 'task off-heap memory'. The user should then still explicitly make sure that her/his direct buffer allocation plus any other memory usage does not exceed the value announced as 'task off-heap'. I guess there is not much that can be done about it except mentioning it in the docs, similar to controlling the heap state backend.
> >
> > Thanks,
> > Andrey
> >
> > On Mon, Sep 2, 2019 at 10:07 AM Yang Wang <[hidden email]> wrote:
> >
> > I also agree that all the configuration should be calculated outside of the TaskManager, so a full configuration should be generated before the TaskManager is started. Overriding the calculated configurations through -D now seems better.
> >
> > Best,
> > Yang
> >
> > Xintong Song <[hidden email]> wrote on Mon, Sep 2, 2019 at 11:39 AM:
> >
> > I just updated the FLIP wiki page [1], with the following changes:
> >
> > - Network memory uses JVM direct memory, and is accounted for when setting the JVM max direct memory size parameter.
> > - Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
> > - Remove 'supporting memory reservation' from the scope of this FLIP.
> >
> > @Till @Stephan, please take another look and see if there are any other concerns.
> >
> > Thank you~
> > Xintong Song
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> >
> > On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <[hidden email]> wrote:
> >
> > Sorry for the late response.
> >
> > - Regarding the `TaskExecutorSpecifics` naming, let's discuss the details in the PR.
> > - Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed before we have a general framework for overriding configurations with ENV variables.
> > - Regarding memory reservation, I double-checked with Yu and he will take care of it.
> >
> > Thank you~
> > Xintong Song
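For illustration, a rough sketch of the '-Dkey=value' approach the thread converges on: a deploy-time script computes the memory settings and appends them as dynamic properties to the launch command, so the shared flink-conf.yaml never has to be rewritten per TaskManager. The option keys, the entry-point class and the flags shown here are only indicative, not the final FLIP names.

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Illustrative sketch of "compute outside, pass -Dkey=value in at launch time".
    public class LaunchCommandSketch {

        public static List<String> buildCommand(Map<String, String> calculatedOptions) {
            List<String> cmd = new ArrayList<>();
            cmd.add("java");
            cmd.add("-Xmx1024m");                      // derived the same way as the options below
            cmd.add("org.apache.flink.runtime.taskexecutor.TaskManagerRunner"); // indicative entry point
            cmd.add("--configDir");
            cmd.add("/opt/flink/conf");                // the shared, unmodified flink-conf.yaml
            for (Map.Entry<String, String> e : calculatedOptions.entrySet()) {
                cmd.add("-D" + e.getKey() + "=" + e.getValue()); // per-process overrides, no file rewrite
            }
            return cmd;
        }

        public static void main(String[] args) {
            Map<String, String> opts = new LinkedHashMap<>();
            opts.put("taskmanager.memory.framework.heap.size", "128m"); // placeholder keys
            opts.put("taskmanager.memory.network.max", "256m");
            System.out.println(String.join(" ", buildCommand(opts)));
        }
    }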
> > On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:
> >
> > What I forgot to add is that we could tackle specifying the configuration fully in an incremental way and that the full specification should be the desired end state.
> >
> > On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:
> >
> > I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step to rather validate existing values and calculate missing ones, these two proposals shouldn't even conflict (given determinism).
> >
> > Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.
> >
> > One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator, which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if two ENV variables are defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of an ENV variable vs. a dynamic configuration value specified via -D?
> >
> > Another approach could be to pass the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.
> >
> > Cheers,
> > Till
> >
> > On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:
> >
> > I see. Under the assumption of strict determinism that should work.
> >
> > The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.
> >
> > On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:
> >
> > My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory, etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.
> >
> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:
> >
> > When computing the values in the JVM process after it has started, how would you deal with values like max direct memory, metaspace size, native memory reservation (reduced heap size), etc.? All the values that are parameters to the JVM process and need to be supplied at process startup?
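Till's "validate existing values, calculate missing ones" step could look roughly like the sketch below. Because the derivation is deterministic, running it in the startup script and again inside the started JVM yields the same numbers. The class name and the simplified three-value model are illustrative only.

    import java.util.OptionalLong;

    // Minimal sketch of "validate what is set, derive what is missing".
    // Not the FLIP's actual classes or options.
    public class MemorySpecSketch {

        public static long[] resolve(OptionalLong total, OptionalLong heap, OptionalLong offHeap) {
            if (total.isPresent() && heap.isPresent() && offHeap.isPresent()) {
                if (total.getAsLong() != heap.getAsLong() + offHeap.getAsLong()) {
                    throw new IllegalArgumentException("total != heap + off-heap: conflicting explicit values");
                }
                return new long[] {total.getAsLong(), heap.getAsLong(), offHeap.getAsLong()};
            }
            if (heap.isPresent() && offHeap.isPresent()) {
                return new long[] {heap.getAsLong() + offHeap.getAsLong(), heap.getAsLong(), offHeap.getAsLong()};
            }
            if (total.isPresent() && heap.isPresent()) {
                return new long[] {total.getAsLong(), heap.getAsLong(), total.getAsLong() - heap.getAsLong()};
            }
            throw new IllegalArgumentException("not enough values configured to derive the rest");
        }

        public static void main(String[] args) {
            long[] r = resolve(OptionalLong.of(2048), OptionalLong.of(1536), OptionalLong.empty());
            System.out.println(r[0] + " = " + r[1] + " + " + r[2]);
        }
    }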
> > > >> > >> >> > > > > > > >> > >> >> > > > - Concerning the memory reservation: I agree with you > > that > > > >> we > > > >> > >> need > > > >> > >> >> the > > > >> > >> >> > > > memory reservation functionality to make streaming > jobs > > > work > > > >> > with > > > >> > >> >> > > "managed" > > > >> > >> >> > > > memory. However, w/o this functionality the whole Flip > > > would > > > >> > >> already > > > >> > >> >> > > bring > > > >> > >> >> > > > a good amount of improvements to our users when > running > > > >> batch > > > >> > >> jobs. > > > >> > >> >> > > > Moreover, by keeping the scope smaller we can complete > > the > > > >> FLIP > > > >> > >> >> faster. > > > >> > >> >> > > > Hence, I would propose to address the memory > reservation > > > >> > >> >> functionality > > > >> > >> >> > > as a > > > >> > >> >> > > > follow up FLIP (which Yu is working on if I'm not > > > mistaken). > > > >> > >> >> > > > > > > >> > >> >> > > > Cheers, > > > >> > >> >> > > > Till > > > >> > >> >> > > > > > > >> > >> >> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang < > > > >> > >> [hidden email]> > > > >> > >> >> > > wrote: > > > >> > >> >> > > > > > > >> > >> >> > > > > Just add my 2 cents. > > > >> > >> >> > > > > > > > >> > >> >> > > > > Using environment variables to override the > > > configuration > > > >> for > > > >> > >> >> > different > > > >> > >> >> > > > > taskmanagers is better. > > > >> > >> >> > > > > We do not need to generate dedicated flink-conf.yaml > > for > > > >> all > > > >> > >> >> > > > taskmanagers. > > > >> > >> >> > > > > A common flink-conf.yam and different environment > > > >> variables > > > >> > are > > > >> > >> >> > enough. > > > >> > >> >> > > > > By reducing the distributed cached files, it could > > make > > > >> > >> launching > > > >> > >> >> a > > > >> > >> >> > > > > taskmanager faster. > > > >> > >> >> > > > > > > > >> > >> >> > > > > Stephan gives a good suggestion that we could move > the > > > >> logic > > > >> > >> into > > > >> > >> >> > > > > "GlobalConfiguration.loadConfig()" method. > > > >> > >> >> > > > > Maybe the client could also benefit from this. > > Different > > > >> > users > > > >> > >> do > > > >> > >> >> not > > > >> > >> >> > > > have > > > >> > >> >> > > > > to export FLINK_CONF_DIR to update few config > options. > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > Best, > > > >> > >> >> > > > > Yang > > > >> > >> >> > > > > > > > >> > >> >> > > > > Stephan Ewen <[hidden email]> 于2019年8月28日周三 > > 上午1:21写道: > > > >> > >> >> > > > > > > > >> > >> >> > > > > > One note on the Environment Variables and > > > Configuration > > > >> > >> >> discussion. > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > My understanding is that passed ENV variables are > > > added > > > >> to > > > >> > >> the > > > >> > >> >> > > > > > configuration in the > > > "GlobalConfiguration.loadConfig()" > > > >> > >> method > > > >> > >> >> (or > > > >> > >> >> > > > > > similar). > > > >> > >> >> > > > > > For all the code inside Flink, it looks like the > > data > > > >> was > > > >> > in > > > >> > >> the > > > >> > >> >> > > config > > > >> > >> >> > > > > to > > > >> > >> >> > > > > > start with, just that the scripts that compute the > > > >> > variables > > > >> > >> can > > > >> > >> >> > pass > > > >> > >> >> > > > the > > > >> > >> >> > > > > > values to the process without actually needing to > > > write > > > >> a > > > >> > >> file. 
> > > >> > >> >> > > > > > > > > >> > >> >> > > > > > For example the "GlobalConfiguration.loadConfig()" > > > >> method > > > >> > >> would > > > >> > >> >> > take > > > >> > >> >> > > > any > > > >> > >> >> > > > > > ENV variable prefixed with "flink" and add it as a > > > >> config > > > >> > >> key. > > > >> > >> >> > > > > > "flink_taskmanager_memory_size=2g" would become > > > >> > >> >> > > > "taskmanager.memory.size: > > > >> > >> >> > > > > > 2g". > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song < > > > >> > >> >> > [hidden email]> > > > >> > >> >> > > > > > wrote: > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > Thanks for the comments, Till. > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > I've also seen your comments on the wiki page, > but > > > >> let's > > > >> > >> keep > > > >> > >> >> the > > > >> > >> >> > > > > > > discussion here. > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > - Regarding 'TaskExecutorSpecifics', how do you > > > think > > > >> > about > > > >> > >> >> > naming > > > >> > >> >> > > it > > > >> > >> >> > > > > > > 'TaskExecutorResourceSpecifics'. > > > >> > >> >> > > > > > > - Regarding passing memory configurations into > > task > > > >> > >> executors, > > > >> > >> >> > I'm > > > >> > >> >> > > in > > > >> > >> >> > > > > > favor > > > >> > >> >> > > > > > > of do it via environment variables rather than > > > >> > >> configurations, > > > >> > >> >> > with > > > >> > >> >> > > > the > > > >> > >> >> > > > > > > following two reasons. > > > >> > >> >> > > > > > > - It is easier to keep the memory options once > > > >> > calculate > > > >> > >> >> not to > > > >> > >> >> > > be > > > >> > >> >> > > > > > > changed with environment variables rather than > > > >> > >> configurations. > > > >> > >> >> > > > > > > - I'm not sure whether we should write the > > > >> > configuration > > > >> > >> in > > > >> > >> >> > > startup > > > >> > >> >> > > > > > > scripts. Writing changes into the configuration > > > files > > > >> > when > > > >> > >> >> > running > > > >> > >> >> > > > the > > > >> > >> >> > > > > > > startup scripts does not sounds right to me. Or > we > > > >> could > > > >> > >> make > > > >> > >> >> a > > > >> > >> >> > > copy > > > >> > >> >> > > > of > > > >> > >> >> > > > > > > configuration files per flink cluster, and make > > the > > > >> task > > > >> > >> >> executor > > > >> > >> >> > > to > > > >> > >> >> > > > > load > > > >> > >> >> > > > > > > from the copy, and clean up the copy after the > > > >> cluster is > > > >> > >> >> > shutdown, > > > >> > >> >> > > > > which > > > >> > >> >> > > > > > > is complicated. (I think this is also what > Stephan > > > >> means > > > >> > in > > > >> > >> >> his > > > >> > >> >> > > > comment > > > >> > >> >> > > > > > on > > > >> > >> >> > > > > > > the wiki page?) > > > >> > >> >> > > > > > > - Regarding reserving memory, I think this > change > > > >> should > > > >> > be > > > >> > >> >> > > included > > > >> > >> >> > > > in > > > >> > >> >> > > > > > > this FLIP. I think a big part of motivations of > > this > > > >> FLIP > > > >> > >> is > > > >> > >> >> to > > > >> > >> >> > > unify > > > >> > >> >> > > > > > > memory configuration for streaming / batch and > > make > > > it > > > >> > easy > > > >> > >> >> for > > > >> > >> >> > > > > > configuring > > > >> > >> >> > > > > > > rocksdb memory. 
If we don't support memory > > > >> reservation, > > > >> > >> then > > > >> > >> >> > > > streaming > > > >> > >> >> > > > > > jobs > > > >> > >> >> > > > > > > cannot use managed memory (neither on-heap or > > > >> off-heap), > > > >> > >> which > > > >> > >> >> > > makes > > > >> > >> >> > > > > this > > > >> > >> >> > > > > > > FLIP incomplete. > > > >> > >> >> > > > > > > - Regarding network memory, I think you are > > right. I > > > >> > think > > > >> > >> we > > > >> > >> >> > > > probably > > > >> > >> >> > > > > > > don't need to change network stack from using > > direct > > > >> > >> memory to > > > >> > >> >> > > using > > > >> > >> >> > > > > > unsafe > > > >> > >> >> > > > > > > native memory. Network memory size is > > deterministic, > > > >> > >> cannot be > > > >> > >> >> > > > reserved > > > >> > >> >> > > > > > as > > > >> > >> >> > > > > > > managed memory does, and cannot be overused. I > > think > > > >> it > > > >> > >> also > > > >> > >> >> > works > > > >> > >> >> > > if > > > >> > >> >> > > > > we > > > >> > >> >> > > > > > > simply keep using direct memory for network and > > > >> include > > > >> > it > > > >> > >> in > > > >> > >> >> jvm > > > >> > >> >> > > max > > > >> > >> >> > > > > > > direct memory size. > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > Thank you~ > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > Xintong Song > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann < > > > >> > >> >> > > [hidden email]> > > > >> > >> >> > > > > > > wrote: > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > Hi Xintong, > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > thanks for addressing the comments and adding > a > > > more > > > >> > >> >> detailed > > > >> > >> >> > > > > > > > implementation plan. I have a couple of > comments > > > >> > >> concerning > > > >> > >> >> the > > > >> > >> >> > > > > > > > implementation plan: > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > - The name `TaskExecutorSpecifics` is not > really > > > >> > >> >> descriptive. > > > >> > >> >> > > > > Choosing > > > >> > >> >> > > > > > a > > > >> > >> >> > > > > > > > different name could help here. > > > >> > >> >> > > > > > > > - I'm not sure whether I would pass the memory > > > >> > >> >> configuration to > > > >> > >> >> > > the > > > >> > >> >> > > > > > > > TaskExecutor via environment variables. I > think > > it > > > >> > would > > > >> > >> be > > > >> > >> >> > > better > > > >> > >> >> > > > to > > > >> > >> >> > > > > > > write > > > >> > >> >> > > > > > > > it into the configuration one uses to start > the > > TM > > > >> > >> process. > > > >> > >> >> > > > > > > > - If possible, I would exclude the memory > > > >> reservation > > > >> > >> from > > > >> > >> >> this > > > >> > >> >> > > > FLIP > > > >> > >> >> > > > > > and > > > >> > >> >> > > > > > > > add this as part of a dedicated FLIP. > > > >> > >> >> > > > > > > > - If possible, then I would exclude changes to > > the > > > >> > >> network > > > >> > >> >> > stack > > > >> > >> >> > > > from > > > >> > >> >> > > > > > > this > > > >> > >> >> > > > > > > > FLIP. Maybe we can simply say that the direct > > > memory > > > >> > >> needed > > > >> > >> >> by > > > >> > >> >> > > the > > > >> > >> >> > > > > > > network > > > >> > >> >> > > > > > > > stack is the framework direct memory > > requirement. 
> > > >> > >> Changing > > > >> > >> >> how > > > >> > >> >> > > the > > > >> > >> >> > > > > > memory > > > >> > >> >> > > > > > > > is allocated can happen in a second step. This > > > would > > > >> > keep > > > >> > >> >> the > > > >> > >> >> > > scope > > > >> > >> >> > > > > of > > > >> > >> >> > > > > > > this > > > >> > >> >> > > > > > > > FLIP smaller. > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > Cheers, > > > >> > >> >> > > > > > > > Till > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < > > > >> > >> >> > > > [hidden email]> > > > >> > >> >> > > > > > > > wrote: > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > Hi everyone, > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > I just updated the FLIP document on wiki > [1], > > > with > > > >> > the > > > >> > >> >> > > following > > > >> > >> >> > > > > > > changes. > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > - Removed open question regarding > > > MemorySegment > > > >> > >> >> > allocation. > > > >> > >> >> > > As > > > >> > >> >> > > > > > > > > discussed, we exclude this topic from the > > > >> scope of > > > >> > >> this > > > >> > >> >> > > FLIP. > > > >> > >> >> > > > > > > > > - Updated content about JVM direct memory > > > >> > parameter > > > >> > >> >> > > according > > > >> > >> >> > > > to > > > >> > >> >> > > > > > > > recent > > > >> > >> >> > > > > > > > > discussions, and moved the other options > to > > > >> > >> "Rejected > > > >> > >> >> > > > > > Alternatives" > > > >> > >> >> > > > > > > > for > > > >> > >> >> > > > > > > > > the > > > >> > >> >> > > > > > > > > moment. > > > >> > >> >> > > > > > > > > - Added implementation steps. > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > Thank you~ > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > Xintong Song > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > [1] > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan > Ewen < > > > >> > >> >> > [hidden email] > > > >> > >> >> > > > > > > >> > >> >> > > > > > wrote: > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > @Xintong: Concerning "wait for memory > users > > > >> before > > > >> > >> task > > > >> > >> >> > > dispose > > > >> > >> >> > > > > and > > > >> > >> >> > > > > > > > > memory > > > >> > >> >> > > > > > > > > > release": I agree, that's how it should > be. > > > >> Let's > > > >> > >> try it > > > >> > >> >> > out. 
> > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > @Xintong @Jingsong: Concerning " JVM does > > not > > > >> wait > > > >> > >> for > > > >> > >> >> GC > > > >> > >> >> > > when > > > >> > >> >> > > > > > > > allocating > > > >> > >> >> > > > > > > > > > direct memory buffer": There seems to be > > > pretty > > > >> > >> >> elaborate > > > >> > >> >> > > logic > > > >> > >> >> > > > > to > > > >> > >> >> > > > > > > free > > > >> > >> >> > > > > > > > > > buffers when allocating new ones. See > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> > > > >> > > > > >> > > > > > > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > @Till: Maybe. If we assume that the JVM > > > default > > > >> > works > > > >> > >> >> (like > > > >> > >> >> > > > going > > > >> > >> >> > > > > > > with > > > >> > >> >> > > > > > > > > > option 2 and not setting > > > >> "-XX:MaxDirectMemorySize" > > > >> > at > > > >> > >> >> all), > > > >> > >> >> > > > then > > > >> > >> >> > > > > I > > > >> > >> >> > > > > > > > think > > > >> > >> >> > > > > > > > > it > > > >> > >> >> > > > > > > > > > should be okay to set > > > "-XX:MaxDirectMemorySize" > > > >> to > > > >> > >> >> > > > > > > > > > "off_heap_managed_memory + direct_memory" > > even > > > >> if > > > >> > we > > > >> > >> use > > > >> > >> >> > > > RocksDB. > > > >> > >> >> > > > > > > That > > > >> > >> >> > > > > > > > > is a > > > >> > >> >> > > > > > > > > > big if, though, I honestly have no idea :D > > > >> Would be > > > >> > >> >> good to > > > >> > >> >> > > > > > > understand > > > >> > >> >> > > > > > > > > > this, though, because this would affect > > option > > > >> (2) > > > >> > >> and > > > >> > >> >> > option > > > >> > >> >> > > > > > (1.2). > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong > > Song < > > > >> > >> >> > > > > > [hidden email]> > > > >> > >> >> > > > > > > > > > wrote: > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Thanks for the inputs, Jingsong. > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Let me try to summarize your points. > > Please > > > >> > correct > > > >> > >> >> me if > > > >> > >> >> > > I'm > > > >> > >> >> > > > > > > wrong. > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > - Memory consumers should always > avoid > > > >> > returning > > > >> > >> >> > memory > > > >> > >> >> > > > > > segments > > > >> > >> >> > > > > > > > to > > > >> > >> >> > > > > > > > > > > memory manager while there are still > > > >> > un-cleaned > > > >> > >> >> > > > structures / > > > >> > >> >> > > > > > > > threads > > > >> > >> >> > > > > > > > > > > that > > > >> > >> >> > > > > > > > > > > may use the memory. Otherwise, it > would > > > >> cause > > > >> > >> >> serious > > > >> > >> >> > > > > problems > > > >> > >> >> > > > > > > by > > > >> > >> >> > > > > > > > > > having > > > >> > >> >> > > > > > > > > > > multiple consumers trying to use the > > same > > > >> > memory > > > >> > >> >> > > segment. 
> > > >> > >> >> > > > > > > > > > > - JVM does not wait for GC when > > > allocating > > > >> > >> direct > > > >> > >> >> > memory > > > >> > >> >> > > > > > buffer. > > > >> > >> >> > > > > > > > > > > Therefore even we set proper max > direct > > > >> memory > > > >> > >> size > > > >> > >> >> > > limit, > > > >> > >> >> > > > > we > > > >> > >> >> > > > > > > may > > > >> > >> >> > > > > > > > > > still > > > >> > >> >> > > > > > > > > > > encounter direct memory oom if the GC > > > >> cleaning > > > >> > >> >> memory > > > >> > >> >> > > > slower > > > >> > >> >> > > > > > > than > > > >> > >> >> > > > > > > > > the > > > >> > >> >> > > > > > > > > > > direct memory allocation. > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Am I understanding this correctly? > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Thank you~ > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Xintong Song > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM > > JingsongLee > > > < > > > >> > >> >> > > > > > > [hidden email] > > > >> > >> >> > > > > > > > > > > .invalid> > > > >> > >> >> > > > > > > > > > > wrote: > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > Hi stephan: > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > About option 2: > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > if additional threads not cleanly shut > > > down > > > >> > >> before > > > >> > >> >> we > > > >> > >> >> > can > > > >> > >> >> > > > > exit > > > >> > >> >> > > > > > > the > > > >> > >> >> > > > > > > > > > task: > > > >> > >> >> > > > > > > > > > > > In the current case of memory reuse, > it > > > has > > > >> > >> freed up > > > >> > >> >> > the > > > >> > >> >> > > > > memory > > > >> > >> >> > > > > > > it > > > >> > >> >> > > > > > > > > > > > uses. If this memory is used by other > > > tasks > > > >> > and > > > >> > >> >> > > > asynchronous > > > >> > >> >> > > > > > > > threads > > > >> > >> >> > > > > > > > > > > > of exited task may still be writing, > > > there > > > >> > will > > > >> > >> be > > > >> > >> >> > > > > concurrent > > > >> > >> >> > > > > > > > > security > > > >> > >> >> > > > > > > > > > > > problems, and even lead to errors in > > user > > > >> > >> computing > > > >> > >> >> > > > results. > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > So I think this is a serious and > > > intolerable > > > >> > >> bug, No > > > >> > >> >> > > matter > > > >> > >> >> > > > > > what > > > >> > >> >> > > > > > > > the > > > >> > >> >> > > > > > > > > > > > option is, it should be avoided. > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > About direct memory cleaned by GC: > > > >> > >> >> > > > > > > > > > > > I don't think it is a good idea, I've > > > >> > >> encountered so > > > >> > >> >> > many > > > >> > >> >> > > > > > > > situations > > > >> > >> >> > > > > > > > > > > > that it's too late for GC to cause > > > >> > DirectMemory > > > >> > >> >> OOM. > > > >> > >> >> > > > Release > > > >> > >> >> > > > > > and > > > >> > >> >> > > > > > > > > > > > allocate DirectMemory depend on the > > type > > > of > > > >> > user > > > >> > >> >> job, > > > >> > >> >> > > > which > > > >> > >> >> > > > > is > > > >> > >> >> > > > > > > > > > > > often beyond our control. 
> > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > Best, > > > >> > >> >> > > > > > > > > > > > Jingsong Lee > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > >> ------------------------------------------------------------------ > > > >> > >> >> > > > > > > > > > > > From:Stephan Ewen <[hidden email]> > > > >> > >> >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 > > > >> > >> >> > > > > > > > > > > > To:dev <[hidden email]> > > > >> > >> >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified > > > >> Memory > > > >> > >> >> > > Configuration > > > >> > >> >> > > > > for > > > >> > >> >> > > > > > > > > > > > TaskExecutors > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > My main concern with option 2 > (manually > > > >> release > > > >> > >> >> memory) > > > >> > >> >> > > is > > > >> > >> >> > > > > that > > > >> > >> >> > > > > > > > > > segfaults > > > >> > >> >> > > > > > > > > > > > in the JVM send off all sorts of > alarms > > on > > > >> user > > > >> > >> >> ends. > > > >> > >> >> > So > > > >> > >> >> > > we > > > >> > >> >> > > > > > need > > > >> > >> >> > > > > > > to > > > >> > >> >> > > > > > > > > > > > guarantee that this never happens. > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > The trickyness is in tasks that uses > > data > > > >> > >> >> structures / > > > >> > >> >> > > > > > algorithms > > > >> > >> >> > > > > > > > > with > > > >> > >> >> > > > > > > > > > > > additional threads, like hash table > > > >> spill/read > > > >> > >> and > > > >> > >> >> > > sorting > > > >> > >> >> > > > > > > threads. > > > >> > >> >> > > > > > > > > We > > > >> > >> >> > > > > > > > > > > need > > > >> > >> >> > > > > > > > > > > > to ensure that these cleanly shut down > > > >> before > > > >> > we > > > >> > >> can > > > >> > >> >> > exit > > > >> > >> >> > > > the > > > >> > >> >> > > > > > > task. > > > >> > >> >> > > > > > > > > > > > I am not sure that we have that > > guaranteed > > > >> > >> already, > > > >> > >> >> > > that's > > > >> > >> >> > > > > why > > > >> > >> >> > > > > > > > option > > > >> > >> >> > > > > > > > > > 1.1 > > > >> > >> >> > > > > > > > > > > > seemed simpler to me. > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM > Xintong > > > >> Song < > > > >> > >> >> > > > > > > > [hidden email]> > > > >> > >> >> > > > > > > > > > > > wrote: > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > Thanks for the comments, Stephan. > > > >> Summarized > > > >> > in > > > >> > >> >> this > > > >> > >> >> > > way > > > >> > >> >> > > > > > really > > > >> > >> >> > > > > > > > > makes > > > >> > >> >> > > > > > > > > > > > > things easier to understand. > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > I'm in favor of option 2, at least > for > > > the > > > >> > >> >> moment. 
I think it is not that difficult to keep the memory manager segfault safe, as long as we always de-allocate the memory segment when it is released from the memory consumers. Only if a memory consumer continues using the buffer of a memory segment after releasing it could we run into trouble, and in that case we do want the job to fail so that we detect the memory leak early.

For option 1.2, I don't think this is a good idea. Not only may the assumption (that regular GC is enough to clean direct buffers) not always hold, it also makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If a library actually uses more direct memory than configured, and that memory cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory.
In that case, if the usage does not hit the JVM's default max direct memory limit, we cannot get a direct memory OOM, and it becomes very hard to understand which part of the configuration needs to be updated.

Option 1.1 has a similar problem to 1.2 if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune that parameter.

Thank you~
Xintong Song

On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
  - The "-XX:MaxDirectMemorySize" limit (option 1.1) is one way to do this.
  - Another way could be to have dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead.
So we either need a way to compensate for that (again some safety margin cutoff value) or we will exceed the container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

  - Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

  - Option 2. It is actually quite simple to implement; we could try how segfault safe we are at the moment.
  - Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.
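To make that concrete, here is a minimal sketch (not the actual Flink MemorySegment implementation; the class and method names are made up for illustration) of how natively allocated memory behaves differently from direct ByteBuffers: it is not counted against -XX:MaxDirectMemorySize, and a use-after-release can be turned into a fast failure instead of a potential segfault.

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    /** Illustrative sketch only; not the actual Flink MemorySegment implementation. */
    final class NativeSegmentSketch {
        private static final Unsafe UNSAFE = getUnsafe();

        private long address;            // 0 means "already freed"
        private final long sizeBytes;

        NativeSegmentSketch(long sizeBytes) {
            // Native memory from Unsafe is NOT counted against -XX:MaxDirectMemorySize.
            this.address = UNSAFE.allocateMemory(sizeBytes);
            this.sizeBytes = sizeBytes;
        }

        void putLong(long offset, long value) {
            checkNotFreed();
            UNSAFE.putLong(address + offset, value);
        }

        long getLong(long offset) {
            checkNotFreed();
            return UNSAFE.getLong(address + offset);
        }

        /** Manual release (option 2): must only happen after all consumer threads are done. */
        void free() {
            if (address != 0) {
                UNSAFE.freeMemory(address);
                address = 0;             // later accesses fail fast instead of touching freed memory
            }
        }

        private void checkNotFreed() {
            if (address == 0) {
                throw new IllegalStateException("Segment of " + sizeBytes + " bytes already freed");
            }
        }

        private static Unsafe getUnsafe() {
            try {
                Field field = Unsafe.class.getDeclaredField("theUnsafe");
                field.setAccessible(true);
                return (Unsafe) field.get(null);
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException("Cannot access sun.misc.Unsafe", e);
            }
        }
    }

Note that the check only protects accesses that go through the segment's own methods; a raw address that escaped before free() can still crash the JVM, which is exactly why option 2 requires all consumer threads of a task to be shut down before the memory is released.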
Hi Yang,

Regarding your concern: I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~
Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit.
If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.
Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then:

  - Alternative 2 suffers from frequent OOMs.
    To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this reduces the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.

  - For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.
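To spell out the arithmetic of this scenario, here is a tiny, purely illustrative snippet (not Flink code; the numbers are the ones from the scenario above):

    /** Purely illustrative arithmetic for the scenario above; not Flink code. */
    public class DirectMemoryScenario {
        public static void main(String[] args) {
            long totalProcessMb = 1024;                      // Total Process Memory: 1GB
            long directBudgetMb = 200;                       // Task Off-Heap Memory + JVM Overhead
            long otherPoolsMb = totalProcessMb - directBudgetMb;

            System.out.println("other pools (heap, metaspace, managed, network): " + otherPoolsMb + "MB");

            // Alternative 2: cap direct memory exactly at the budget -> direct OOM once usage > 200MB.
            System.out.println("alternative 2: -XX:MaxDirectMemorySize=" + directBudgetMb + "m");

            // Alternative 3: effectively unlimited cap (e.g. 1TB) -> no direct OOM,
            // but overuse can only surface as the container exceeding its 1GB limit.
            System.out.println("alternative 3: -XX:MaxDirectMemorySize=1024g");

            // If the user reacts to alternative 2 OOMs by raising the budget to 250MB,
            // the other pools shrink to 750MB while the process size stays at 1GB.
            long bumpedBudgetMb = 250;
            System.out.println("after bump: other pools = " + (totalProcessMb - bumpedBudgetMb) + "MB");
        }
    }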
Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~
Xintong Song

On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

  - Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
  - Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.
How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is to set a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.

> Memory Calculation

I agree with Xintong.
For Yarn and K8s, we need to check the memory configurations on the client side to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

Xintong Song <[hidden email]> wrote on Tuesday, Aug 13, 2019, 22:07:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use it.
About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of the risk of over-using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.
Thank you~
Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP, Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow-up?
Without knowing all the details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best.
My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?
@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.
  - One thing I can think of is that if a task executor container is killed due to over-using memory, it could be hard for us to know which part of the memory was overused.
  - Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
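The second point can be seen with a small stand-alone experiment (illustration only, not Flink code), assuming a HotSpot JVM where explicit GC is not disabled:

    import java.nio.ByteBuffer;

    /** Illustration only, not Flink code: churns through direct buffers without keeping references. */
    public class DirectBufferChurn {
        public static void main(String[] args) {
            for (long allocatedMb = 1; allocatedMb <= 16_384; allocatedMb++) {
                ByteBuffer.allocateDirect(1 << 20);          // 1MB buffer, immediately unreachable
                if (allocatedMb % 1024 == 0) {
                    System.out.println("allocated " + allocatedMb + "MB of direct buffers so far");
                }
            }
        }
    }

Run with -XX:MaxDirectMemorySize=64m, the loop keeps going: whenever the limit would be exceeded, the JVM first processes the pending cleaners of unreachable buffers (falling back to System.gc()) before giving up with a direct OOM. Run with -XX:MaxDirectMemorySize=1024g, the limit is never reached, so the native memory behind the discarded buffers is only reclaimed whenever a heap GC happens to run.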
Maybe you can share your reasons for preferring setting a very large value, in case there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error.

I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client-side checking, because for a standalone cluster, TaskManagers on different machines may have different configurations and the client does not see them.

What do you think?
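As a rough sketch of what such a check could look like (the class, method, and pool names are hypothetical, not part of this FLIP):

    /** Hypothetical client-side sanity check; names are illustrative, not the FLIP's final API. */
    final class MemoryConfigurationCheck {
        static void validate(long totalProcessMb, long... fineGrainedPoolsMb) {
            long sum = 0;
            for (long poolMb : fineGrainedPoolsMb) {
                sum += poolMb;
            }
            if (sum > totalProcessMb) {
                // Fail on the client, before any cluster resources are requested.
                throw new IllegalArgumentException(
                        "Sum of fine-grained memory pools (" + sum + "MB) exceeds the "
                                + "configured total process memory (" + totalProcessMb + "MB).");
            }
        }
    }

For example, validate(1024, 200, 300, 128, 256, 96, 64) would fail because the pools add up to more than 1024MB. For standalone clusters the same check would have to be repeated in each TaskManager on startup, since the client cannot see the per-machine configuration.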
> > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, > 2019 > > at > > > >> 5:09 > > > >> > >> PM > > > >> > >> >> Yang > > > >> > >> >> > > > Wang > > > >> > >> >> > > > > < > > > >> > >> >> > > > > > > > > > > > > > > > [hidden email]> > > > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for your > > > >> detailed > > > >> > >> >> > proposal. > > > >> > >> >> > > > > After > > > >> > >> >> > > > > > > all > > > >> > >> >> > > > > > > > > the > > > >> > >> >> > > > > > > > > > > > memory > > > >> > >> >> > > > > > > > > > > > > > > > > > > configuration > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > are > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it > > will > > > be > > > >> > more > > > >> > >> >> > > powerful > > > >> > >> >> > > > to > > > >> > >> >> > > > > > > > control > > > >> > >> >> > > > > > > > > > the > > > >> > >> >> > > > > > > > > > > > > flink > > > >> > >> >> > > > > > > > > > > > > > > > > memory > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > just have few > > > >> questions > > > >> > >> about > > > >> > >> >> it. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native and > > > Direct > > > >> > >> Memory > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not > > > >> differentiate > > > >> > >> user > > > >> > >> >> > direct > > > >> > >> >> > > > > > memory > > > >> > >> >> > > > > > > > and > > > >> > >> >> > > > > > > > > > > native > > > >> > >> >> > > > > > > > > > > > > > > memory. > > > >> > >> >> > > > > > > > > > > > > > > > > > They > > > >> > >> >> > > > > > > > > > > > > > > > > > > > are > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > all > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > included in task > > > >> off-heap > > > >> > >> >> memory. > > > >> > >> >> > > > > Right? > > > >> > >> >> > > > > > > So i > > > >> > >> >> > > > > > > > > > don’t > > > >> > >> >> > > > > > > > > > > > > think > > > >> > >> >> > > > > > > > > > > > > > > we > > > >> > >> >> > > > > > > > > > > > > > > > > > could > > > >> > >> >> > > > > > > > > > > > > > > > > > > > not > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > set > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > the > > > >> > -XX:MaxDirectMemorySize > > > >> > >> >> > > > properly. 
I prefer leaving it as a very large value.

- Memory Calculation

If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration on the client?

Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019 at 10:14 PM:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted, with individual memory reservations and pools.
- Simplify memory configuration options and calculation logics.

Please find more details in the FLIP wiki document [1]. (Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2]
https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song
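As a side note on the distinction drawn in the message above, here is a small, self-contained Java sketch (illustrative only, not Flink code; the Unsafe.allocate() in the emails presumably refers to Unsafe.allocateMemory()) contrasting the two allocation paths: a direct ByteBuffer counts against -XX:MaxDirectMemorySize, while memory obtained via Unsafe is raw native memory the JVM does not count against that limit.

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;

    public class AllocationPathsDemo {
        public static void main(String[] args) throws Exception {
            // Path 1: direct ByteBuffer. Counted against -XX:MaxDirectMemorySize;
            // allocation fails with an OutOfMemoryError once the limit is exhausted.
            ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);

            // Path 2: Unsafe.allocateMemory. Raw native memory; not counted against
            // the direct memory limit, only the container / OS limit applies.
            Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
            long address = unsafe.allocateMemory(64 * 1024 * 1024);

            // Native memory must be freed explicitly; the GC will not reclaim it.
            unsafe.freeMemory(address);
            System.out.println("direct buffer capacity: " + direct.capacity());
        }
    }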
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till
On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then:

- Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
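For readers following along, here is a small Java sketch of the arithmetic in the example above; the numbers are the assumed values from the scenario in the email, not defaults of any kind.

    public class DirectMemoryBudgetExample {
        static final long MB = 1024 * 1024;

        public static void main(String[] args) {
            long totalProcess = 1024 * MB;

            // Alternative 2: -XX:MaxDirectMemorySize = task off-heap + JVM overhead.
            long directAlt2 = 200 * MB;
            long otherPoolsAlt2 = totalProcess - directAlt2;          // 800 MB

            // If the user bumps the direct budget to 250 MB to avoid OOMs,
            // the other pools shrink accordingly.
            long directBumped = 250 * MB;
            long otherPoolsBumped = totalProcess - directBumped;      // 750 MB

            // Alternative 3: the limit is set far above what can ever be used,
            // so the other pools keep their 800 MB and the risk moves from a
            // direct OOM to exceeding the container limit.
            long directAlt3 = 1024L * 1024 * MB;                      // 1 TB

            System.out.printf("alt 2:          direct=%d MB, other pools=%d MB%n",
                    directAlt2 / MB, otherPoolsAlt2 / MB);
            System.out.printf("alt 2 (bumped): direct=%d MB, other pools=%d MB%n",
                    directBumped / MB, otherPoolsBumped / MB);
            System.out.printf("alt 3:          -XX:MaxDirectMemorySize=%d MB, other pools=%d MB%n",
                    directAlt3 / MB, otherPoolsAlt2 / MB);
        }
    }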
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

- Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

- Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations on the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang
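To illustrate the kind of client-side check being discussed, here is a hypothetical sketch; the pool names, values, and class below are invented for illustration and are not Flink configuration keys or APIs. The idea is simply to fail fast, before anything is submitted, if the explicitly configured pools cannot fit into the total process memory.

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class MemoryConfigCheck {
        public static void main(String[] args) {
            long totalProcessMemoryMb = 1024;

            // Explicitly configured fine-grained pools (illustrative values).
            Map<String, Long> poolsMb = new LinkedHashMap<>();
            poolsMb.put("task-heap", 384L);
            poolsMb.put("task-off-heap", 64L);
            poolsMb.put("network", 128L);
            poolsMb.put("managed", 256L);
            poolsMb.put("jvm-metaspace", 96L);
            poolsMb.put("jvm-overhead", 96L);

            long sum = poolsMb.values().stream().mapToLong(Long::longValue).sum();
            if (sum > totalProcessMemoryMb) {
                // On Yarn / K8s this would surface on the client, before any
                // containers are requested; standalone TMs would re-check locally.
                throw new IllegalArgumentException("Configured pools (" + sum
                        + " MB) exceed total process memory (" + totalProcessMemoryMb + " MB)");
            }
            System.out.println("Memory configuration is consistent: " + sum
                    + " MB of " + totalProcessMemoryMb + " MB assigned.");
        }
    }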
Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019 at 10:07 PM:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of a risk of overusing memory at the container level, which is not good.
My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is a risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good.
Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow-up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators).
The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM.

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till
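As a purely hypothetical illustration of the reservation-style interaction mentioned above (the class and method names below are invented for this sketch and are not Flink's actual MemoryManager API), a reservation call only tracks a budget instead of handing out memory segments:

    public class ReservationSketch {

        /** Tracks a single budget from which consumers reserve and release bytes. */
        static final class SimpleMemoryBudget {
            private final long capacityBytes;
            private long reservedBytes;

            SimpleMemoryBudget(long capacityBytes) {
                this.capacityBytes = capacityBytes;
            }

            synchronized void reserve(long bytes) {
                if (reservedBytes + bytes > capacityBytes) {
                    throw new IllegalStateException("Not enough managed memory left");
                }
                reservedBytes += bytes;
            }

            synchronized void release(long bytes) {
                reservedBytes = Math.max(0, reservedBytes - bytes);
            }
        }

        public static void main(String[] args) {
            SimpleMemoryBudget managedMemory = new SimpleMemoryBudget(256L * 1024 * 1024);

            // A state backend (e.g. RocksDB) would reserve its share up front and
            // release it on shutdown, without segments ever being handed out.
            managedMemory.reserve(128L * 1024 * 1024);
            managedMemory.release(128L * 1024 * 1024);
            System.out.println("reservation round-trip completed");
        }
    }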
On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring to set a very large value, in case there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error. I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client-side checking, because for standalone clusters TaskManagers on different machines may have different configurations and the client does not see that. What do you think?

Thank you~

Xintong Song

On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal.
> > > >> > >> >> > > > > After > > > >> > >> >> > > > > > > all > > > >> > >> >> > > > > > > > > the > > > >> > >> >> > > > > > > > > > > > memory > > > >> > >> >> > > > > > > > > > > > > > > > > > > configuration > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > are > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it > > will > > > be > > > >> > more > > > >> > >> >> > > powerful > > > >> > >> >> > > > to > > > >> > >> >> > > > > > > > control > > > >> > >> >> > > > > > > > > > the > > > >> > >> >> > > > > > > > > > > > > flink > > > >> > >> >> > > > > > > > > > > > > > > > > memory > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > just have few > > > >> questions > > > >> > >> about > > > >> > >> >> it. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native and > > > Direct > > > >> > >> Memory > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not > > > >> differentiate > > > >> > >> user > > > >> > >> >> > direct > > > >> > >> >> > > > > > memory > > > >> > >> >> > > > > > > > and > > > >> > >> >> > > > > > > > > > > native > > > >> > >> >> > > > > > > > > > > > > > > memory. > > > >> > >> >> > > > > > > > > > > > > > > > > > They > > > >> > >> >> > > > > > > > > > > > > > > > > > > > are > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > all > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > included in task > > > >> off-heap > > > >> > >> >> memory. > > > >> > >> >> > > > > Right? > > > >> > >> >> > > > > > > So i > > > >> > >> >> > > > > > > > > > don’t > > > >> > >> >> > > > > > > > > > > > > think > > > >> > >> >> > > > > > > > > > > > > > > we > > > >> > >> >> > > > > > > > > > > > > > > > > > could > > > >> > >> >> > > > > > > > > > > > > > > > > > > > not > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > set > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > the > > > >> > -XX:MaxDirectMemorySize > > > >> > >> >> > > > properly. I > > > >> > >> >> > > > > > > > prefer > > > >> > >> >> > > > > > > > > > > > leaving > > > >> > >> >> > > > > > > > > > > > > > it a > > > >> > >> >> > > > > > > > > > > > > > > > > very > > > >> > >> >> > > > > > > > > > > > > > > > > > > > large > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > value. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory > > > >> Calculation > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum of > and > > > >> > >> fine-grained > > > >> > >> >> > > > > > > memory(network > > > >> > >> >> > > > > > > > > > > memory, > > > >> > >> >> > > > > > > > > > > > > > > managed > > > >> > >> >> > > > > > > > > > > > > > > > > > > memory, > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > etc.) 
> > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > is larger than > > total > > > >> > >> process > > > >> > >> >> > > memory, > > > >> > >> >> > > > > how > > > >> > >> >> > > > > > do > > > >> > >> >> > > > > > > > we > > > >> > >> >> > > > > > > > > > deal > > > >> > >> >> > > > > > > > > > > > > with > > > >> > >> >> > > > > > > > > > > > > > > this > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > situation? > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Do > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > we need to check > > the > > > >> > memory > > > >> > >> >> > > > > configuration > > > >> > >> >> > > > > > > in > > > >> > >> >> > > > > > > > > > > client? > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > > > >> > >> >> > > [hidden email]> > > > >> > >> >> > > > > > > > > > 于2019年8月7日周三 > > > >> > >> >> > > > > > > > > > > > > > > 下午10:14写道: > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would like > to > > > >> start > > > >> > a > > > >> > >> >> > > discussion > > > >> > >> >> > > > > > > thread > > > >> > >> >> > > > > > > > on > > > >> > >> >> > > > > > > > > > > > > "FLIP-49: > > > >> > >> >> > > > > > > > > > > > > > > > > Unified > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Memory > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Configuration > > for > > > >> > >> >> > > > TaskExecutors"[1], > > > >> > >> >> > > > > > > where > > > >> > >> >> > > > > > > > we > > > >> > >> >> > > > > > > > > > > > > describe > > > >> > >> >> > > > > > > > > > > > > > > how > > > >> > >> >> > > > > > > > > > > > > > > > to > > > >> > >> >> > > > > > > > > > > > > > > > > > > > improve > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > > memory > > > >> > >> >> > > configurations. > > > >> > >> >> > > > > The > > > >> > >> >> > > > > > > > FLIP > > > >> > >> >> > > > > > > > > > > > document > > > >> > >> >> > > > > > > > > > > > > > is > > > >> > >> >> > > > > > > > > > > > > > > > > mostly > > > >> > >> >> > > > > > > > > > > > > > > > > > > > based > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > on > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > an > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > early design > > > "Memory > > > >> > >> >> Management > > > >> > >> >> > > and > > > >> > >> >> > > > > > > > > > Configuration > > > >> > >> >> > > > > > > > > > > > > > > > > Reloaded"[2] > > > >> > >> >> > > > > > > > > > > > > > > > > > by > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > with updates > > from > > > >> > >> follow-up > > > >> > >> >> > > > > discussions > > > >> > >> >> > > > > > > > both > > > >> > >> >> > > > > > > > > > > online > > > >> > >> >> > > > > > > > > > > > > and > > > >> > >> >> > > > > > > > > > > > > > > > > > offline. 
> > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP > > > addresses > > > >> > >> several > > > >> > >> >> > > > > > shortcomings > > > >> > >> >> > > > > > > of > > > >> > >> >> > > > > > > > > > > current > > > >> > >> >> > > > > > > > > > > > > > > (Flink > > > >> > >> >> > > > > > > > > > > > > > > > > 1.9) > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > > memory > > > >> > >> >> > > configuration. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Different > > > >> > >> configuration > > > >> > >> >> > for > > > >> > >> >> > > > > > > Streaming > > > >> > >> >> > > > > > > > > and > > > >> > >> >> > > > > > > > > > > > Batch. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complex > and > > > >> > >> difficult > > > >> > >> >> > > > > > configuration > > > >> > >> >> > > > > > > of > > > >> > >> >> > > > > > > > > > > RocksDB > > > >> > >> >> > > > > > > > > > > > > in > > > >> > >> >> > > > > > > > > > > > > > > > > > Streaming. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > Complicated, > > > >> > >> uncertain > > > >> > >> >> and > > > >> > >> >> > > > hard > > > >> > >> >> > > > > to > > > >> > >> >> > > > > > > > > > > understand. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key changes to > > > solve > > > >> > the > > > >> > >> >> > problems > > > >> > >> >> > > > can > > > >> > >> >> > > > > > be > > > >> > >> >> > > > > > > > > > > summarized > > > >> > >> >> > > > > > > > > > > > > as > > > >> > >> >> > > > > > > > > > > > > > > > > follows. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Extend > > memory > > > >> > >> manager > > > >> > >> >> to > > > >> > >> >> > > also > > > >> > >> >> > > > > > > account > > > >> > >> >> > > > > > > > > for > > > >> > >> >> > > > > > > > > > > > memory > > > >> > >> >> > > > > > > > > > > > > > > usage > > > >> > >> >> > > > > > > > > > > > > > > > > by > > > >> > >> >> > > > > > > > > > > > > > > > > > > > state > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > backends. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Modify > how > > > >> > >> TaskExecutor > > > >> > >> >> > > memory > > > >> > >> >> > > > > is > > > >> > >> >> > > > > > > > > > > partitioned > > > >> > >> >> > > > > > > > > > > > > > > > accounted > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > individual > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > memory > > > >> reservations > > > >> > >> and > > > >> > >> >> > pools. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Simplify > > > memory > > > >> > >> >> > > configuration > > > >> > >> >> > > > > > > options > > > >> > >> >> > > > > > > > > and > > > >> > >> >> > > > > > > > > > > > > > > calculations > > > >> > >> >> > > > > > > > > > > > > > > > > > > logics. 
> > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please find > more > > > >> > details > > > >> > >> in > > > >> > >> >> the > > > >> > >> >> > > > FLIP > > > >> > >> >> > > > > > wiki > > > >> > >> >> > > > > > > > > > > document > > > >> > >> >> > > > > > > > > > > > > [1]. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please note > > that > > > >> the > > > >> > >> early > > > >> > >> >> > > design > > > >> > >> >> > > > > doc > > > >> > >> >> > > > > > > [2] > > > >> > >> >> > > > > > > > is > > > >> > >> >> > > > > > > > > > out > > > >> > >> >> > > > > > > > > > > > of > > > >> > >> >> > > > > > > > > > > > > > > sync, > > > >> > >> >> > > > > > > > > > > > > > > > > and > > > >> > >> >> > > > > > > > > > > > > > > > > > it > > > >> > >> >> > > > > > > > > > > > > > > > > > > > is > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > appreciated to > > > have > > > >> the > > > >> > >> >> > > discussion > > > >> > >> >> > > > in > > > >> > >> >> > > > > > > this > > > >> > >> >> > > > > > > > > > > mailing > > > >> > >> >> > > > > > > > > > > > > list > > > >> > >> >> > > > > > > > > > > > > > > > > > thread.) > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking > forward > > to > > > >> your > > > >> > >> >> > > feedbacks. > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > 
> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> > > > >> > > > > >> > > > > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> >> > > > >> > >> > > > > >> > >> > > > >> > > > > > >> > > > > >> > > > > > > > > > > |
Hi Xintong,
True, there would be no regression if only one type of memory is configured. This can be a problem only for old jobs running in a newly configured cluster.

About the pool type precedence: in general, it should not matter to the users which type the segments have. The first implementation can simply pull from any pool, e.g. empty one pool first and then the other, or use some other random pulling. This might be a problem if we mix segment allocations and reservation of memory chunks from the same memory manager; the reservation will usually be for a certain type of memory, and then the task will probably also have to decide from which pool to allocate the segments. I would suggest we create a memory manager per slot and give it the memory limit of the slot; then we do not have this kind of mixed operation, because DataSet/Batch jobs need only segment allocations and streaming jobs need only memory chunks for state backends, as I understand the current plan. I would suggest we look at it if we do have mixed operations at some point and it becomes a problem.

Thanks,
Andrey

On Fri, Sep 13, 2019 at 5:24 PM Andrey Zagrebin <[hidden email]> wrote:
> ---------- Forwarded message ---------
> From: Xintong Song <[hidden email]>
> Date: Thu, Sep 12, 2019 at 4:21 AM
> Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors
> To: dev <[hidden email]>
>
> Hi Andrey,
>
> Thanks for bringing this up.
>
> If I understand correctly, this issue only occurs where the cluster is configured with both on-heap and off-heap memory. There should be no regression for clusters configured in the old way (either all on-heap or all off-heap).
>
> I also agree that it would be good if the DataSet API jobs can use both memory types. The only question I can see is: from which pool (heap / off-heap) should we allocate memory for DataSet API operators? Do we always prioritize one pool over the other? Or do we always prioritize the pool with more available memory left?
>
> Thank you~
> Xintong Song
>
> On Tue, Sep 10, 2019 at 8:15 PM Andrey Zagrebin <[hidden email]> wrote:
> > Hi All,
> >
> > While looking more into the implementation details of Step 4, we realised during some offline discussions with @Till that there can be a performance degradation for the batch DataSet API if we simply continue to pull memory from the pool according to the legacy option taskmanager.memory.off-heap.
> >
> > The reason is that if the cluster is newly configured to statically split heap/off-heap (not, as previously, either heap or off-heap), then the batch DataSet API jobs will be able to use only one type of memory, although it does not really matter where the memory segments come from and potentially batch jobs can use both. Also, currently the DataSet API does not result in absolute resource requirements and its batch jobs will always get a default share of TM resources.
> >
> > The suggestion is that we let the batch tasks of the DataSet API pull from both pools according to their fair slot share of each memory type. For that we can have a special wrapping view of both pools which will pull segments (can be randomly) according to the slot limits. The view can wrap the TM-level memory pools and be given to the Task.
> >
> > Best,
> > Andrey
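A minimal sketch of the "wrapping view over both pools" idea quoted above, with hypothetical class names and a plain queue per pool rather than the actual MemoryManager API; the fairness / randomness of pulling is left out:

    import java.util.Queue;

    // Hypothetical per-slot view over two TaskManager-level pools. A task pulls
    // segments through this view and does not care whether they are heap or
    // off-heap; the view enforces the slot's share of each memory type.
    final class SlotMemoryPoolView {
        private final Queue<MemorySegment> heapPool;
        private final Queue<MemorySegment> offHeapPool;
        private long remainingHeap;     // slot's share of heap managed memory, in segments
        private long remainingOffHeap;  // slot's share of off-heap managed memory, in segments

        SlotMemoryPoolView(Queue<MemorySegment> heapPool, Queue<MemorySegment> offHeapPool,
                           long heapSegments, long offHeapSegments) {
            this.heapPool = heapPool;
            this.offHeapPool = offHeapPool;
            this.remainingHeap = heapSegments;
            this.remainingOffHeap = offHeapSegments;
        }

        // Pulls from whichever pool still has budget and segments; a real
        // implementation might randomize or balance between the two pools.
        synchronized MemorySegment allocate() {
            if (remainingHeap > 0 && !heapPool.isEmpty()) {
                remainingHeap--;
                return heapPool.poll();
            }
            if (remainingOffHeap > 0 && !offHeapPool.isEmpty()) {
                remainingOffHeap--;
                return offHeapPool.poll();
            }
            throw new IllegalStateException("slot memory budget exhausted");
        }
    }

    // Placeholder so the sketch is self-contained; stands in for the real segment type.
    final class MemorySegment {}

A per-slot instance of such a view could be handed to the Task, so DataSet operators draw their fair slot share from either pool without caring about the segment type.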
> > On Mon, Sep 2, 2019 at 1:35 PM Xintong Song <[hidden email]> wrote:
> > > Thanks for your comments, Andrey.
> > >
> > > - Regarding Task Off-Heap Memory, I think you're right that the user needs to make sure that the direct memory and native memory together used by the user code (external libs) do not exceed the configured value. As far as I can think of, there is nothing we can do about it.
> > >
> > > I addressed the rest of your comments in the wiki page [1]. Please take a look.
> > >
> > > Thank you~
> > > Xintong Song
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > >
> > > On Mon, Sep 2, 2019 at 6:13 PM Andrey Zagrebin <[hidden email]> wrote:
> > > > EDIT: sorry for the confusion, I meant taskmanager.memory.off-heap instead of setting taskmanager.memory.preallocate
> > > >
> > > > On Mon, Sep 2, 2019 at 11:29 AM Andrey Zagrebin <[hidden email]> wrote:
> > > > > Hi All,
> > > > >
> > > > > @Xintong thanks a lot for driving the discussion.
> > > > >
> > > > > I also reviewed the FLIP and it looks quite good to me. Here are some comments:
> > > > >
> > > > > - One thing I wanted to discuss is the backwards-compatibility with the previous user setups. We could list which options we plan to deprecate. At first glance it looks possible to provide the same/similar behaviour for the setups relying on the deprecated options. E.g. setting taskmanager.memory.preallocate to true could override the new taskmanager.memory.managed.offheap-fraction to 1, etc. At the moment the FLIP just states that in some cases it may require re-configuring of the cluster if migrated from prior versions. My suggestion is that we try to keep it backwards-compatible unless there is a good reason, like some major complication for the implementation.
> > > > >
> > > > > Also a couple of smaller things:
> > > > >
> > > > > - I suggest we remove TaskExecutorSpecifics from the FLIP and leave some general wording atm, like 'data structure to store' or 'utility classes'. When the classes are implemented, we put in the concrete class names. This way we can avoid confusion and stale documents.
> > > > >
> > > > > - As I understand, if a user task uses native memory (not direct memory, but e.g. unsafe.allocate or from an external lib), there will be no explicit guard against exceeding 'task off-heap memory'. The user should then still explicitly make sure that her/his direct buffer allocation plus any other memory usage does not exceed the value announced as 'task off-heap'. I guess there is not so much that can be done about it except mentioning it in the docs, similar to controlling the heap state backend.
> > > > >
> > > > > Thanks,
> > > > > Andrey
> > > > >
> > > > > On Mon, Sep 2, 2019 at 10:07 AM Yang Wang <[hidden email]> wrote:
> > > > > > I also agree that all the configuration should be calculated outside of the TaskManager, so a full configuration should be generated before the TaskManager is started. Overriding the calculated configurations through -D now seems better.
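For illustration, a small sketch of how calculated memory options could be passed as -Dkey=value dynamic properties and merged over the loaded configuration instead of rewriting flink-conf.yaml; the parsing code and the option keys shown here are examples only, not Flink's actual implementation:

    import java.util.HashMap;
    import java.util.Map;

    final class DynamicPropertiesExample {

        // Merges "-Dkey=value" arguments over the base configuration; dynamic values win.
        static Map<String, String> applyDynamicProperties(Map<String, String> baseConfig, String[] args) {
            Map<String, String> merged = new HashMap<>(baseConfig);
            for (String arg : args) {
                if (arg.startsWith("-D") && arg.contains("=")) {
                    String[] kv = arg.substring(2).split("=", 2);
                    merged.put(kv[0], kv[1]);
                }
            }
            return merged;
        }

        public static void main(String[] unused) {
            Map<String, String> base = new HashMap<>();
            base.put("taskmanager.memory.process.size", "4g"); // from the common flink-conf.yaml

            // Values the startup script calculated for this particular TaskExecutor.
            String[] launchArgs = {
                "-Dtaskmanager.memory.framework.heap.size=128m",
                "-Dtaskmanager.memory.network.max=512m"
            };
            System.out.println(applyDynamicProperties(base, launchArgs));
        }
    }

This keeps a single shared flink-conf.yaml per cluster while each TaskExecutor still receives its own, fully specified memory settings at launch time.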
> > > > >> > > > > >> > > > > >> > > > > >> Best, > > > > >> > > > > >> Yang > > > > >> > > > > >> Xintong Song <[hidden email]> 于2019年9月2日周一 上午11:39写道: > > > > >> > > > > >> > I just updated the FLIP wiki page [1], with the following > changes: > > > > >> > > > > > >> > - Network memory uses JVM direct memory, and is accounted > when > > > > >> setting > > > > >> > JVM max direct memory size parameter. > > > > >> > - Use dynamic configurations (`-Dkey=value`) to pass > calculated > > > > >> memory > > > > >> > configs into TaskExecutors, instead of ENV variables. > > > > >> > - Remove 'supporting memory reservation' from the scope of > this > > > > FLIP. > > > > >> > > > > > >> > @till @stephan, please take another look see if there are any > > other > > > > >> > concerns. > > > > >> > > > > > >> > Thank you~ > > > > >> > > > > > >> > Xintong Song > > > > >> > > > > > >> > > > > > >> > [1] > > > > >> > > > > > >> > > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > >> > > > > > >> > On Mon, Sep 2, 2019 at 11:13 AM Xintong Song < > > [hidden email] > > > > > > > > >> > wrote: > > > > >> > > > > > >> > > Sorry for the late response. > > > > >> > > > > > > >> > > - Regarding the `TaskExecutorSpecifics` naming, let's discuss > > the > > > > >> detail > > > > >> > > in PR. > > > > >> > > - Regarding passing parameters into the `TaskExecutor`, +1 for > > > using > > > > >> > > dynamic configuration at the moment, given that there are more > > > > >> questions > > > > >> > to > > > > >> > > be discussed to have a general framework for overwriting > > > > >> configurations > > > > >> > > with ENV variables. > > > > >> > > - Regarding memory reservation, I double checked with Yu and > he > > > will > > > > >> take > > > > >> > > care of it. > > > > >> > > > > > > >> > > Thank you~ > > > > >> > > > > > > >> > > Xintong Song > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann < > > > [hidden email] > > > > > > > > > >> > > wrote: > > > > >> > > > > > > >> > >> What I forgot to add is that we could tackle specifying the > > > > >> > configuration > > > > >> > >> fully in an incremental way and that the full specification > > > should > > > > be > > > > >> > the > > > > >> > >> desired end state. > > > > >> > >> > > > > >> > >> On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann < > > > > [hidden email]> > > > > >> > >> wrote: > > > > >> > >> > > > > >> > >> > I think our goal should be that the configuration is fully > > > > >> specified > > > > >> > >> when > > > > >> > >> > the process is started. By considering the internal > > calculation > > > > >> step > > > > >> > to > > > > >> > >> be > > > > >> > >> > rather validate existing values and calculate missing ones, > > > these > > > > >> two > > > > >> > >> > proposal shouldn't even conflict (given determinism). > > > > >> > >> > > > > > >> > >> > Since we don't want to change an existing flink-conf.yaml, > > > > >> specifying > > > > >> > >> the > > > > >> > >> > full configuration would require to pass in the options > > > > >> differently. > > > > >> > >> > > > > > >> > >> > One way could be the ENV variables approach. The reason why > > I'm > > > > >> trying > > > > >> > >> to > > > > >> > >> > exclude this feature from the FLIP is that I believe it > > needs a > > > > bit > > > > >> > more > > > > >> > >> > discussion. 
Just some questions which come to my mind: What > > > would > > > > >> be > > > > >> > the > > > > >> > >> > exact format (FLINK_KEY_NAME)? Would we support a dot > > separator > > > > >> which > > > > >> > is > > > > >> > >> > supported by some systems (FLINK.KEY.NAME)? If we accept > the > > > dot > > > > >> > >> > separator what would be the order of precedence if there > are > > > two > > > > >> ENV > > > > >> > >> > variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? > What > > is > > > > the > > > > >> > >> > precedence of env variable vs. dynamic configuration value > > > > >> specified > > > > >> > >> via -D? > > > > >> > >> > > > > > >> > >> > Another approach could be to pass in the dynamic > > configuration > > > > >> values > > > > >> > >> via > > > > >> > >> > `-Dkey=value` to the Flink process. For that we don't have > to > > > > >> change > > > > >> > >> > anything because the functionality already exists. > > > > >> > >> > > > > > >> > >> > Cheers, > > > > >> > >> > Till > > > > >> > >> > > > > > >> > >> > On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen < > > > [hidden email]> > > > > >> > wrote: > > > > >> > >> > > > > > >> > >> >> I see. Under the assumption of strict determinism that > > should > > > > >> work. > > > > >> > >> >> > > > > >> > >> >> The original proposal had this point "don't compute inside > > the > > > > TM, > > > > >> > >> compute > > > > >> > >> >> outside and supply a full config", because that sounded > more > > > > >> > intuitive. > > > > >> > >> >> > > > > >> > >> >> On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann < > > > > >> [hidden email] > > > > >> > > > > > > >> > >> >> wrote: > > > > >> > >> >> > > > > >> > >> >> > My understanding was that before starting the Flink > > process > > > we > > > > >> > call a > > > > >> > >> >> > utility which calculates these values. I assume that > this > > > > >> utility > > > > >> > >> will > > > > >> > >> >> do > > > > >> > >> >> > the calculation based on a set of configured values > > (process > > > > >> > memory, > > > > >> > >> >> flink > > > > >> > >> >> > memory, network memory etc.). Assuming that these values > > > don't > > > > >> > differ > > > > >> > >> >> from > > > > >> > >> >> > the values with which the JVM is started, it should be > > > > possible > > > > >> to > > > > >> > >> >> > recompute them in the Flink process in order to set the > > > > values. > > > > >> > >> >> > > > > > >> > >> >> > > > > > >> > >> >> > > > > > >> > >> >> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen < > > > > [hidden email] > > > > >> > > > > > >> > >> wrote: > > > > >> > >> >> > > > > > >> > >> >> > > When computing the values in the JVM process after it > > > > started, > > > > >> > how > > > > >> > >> >> would > > > > >> > >> >> > > you deal with values like Max Direct Memory, Metaspace > > > size. > > > > >> > native > > > > >> > >> >> > memory > > > > >> > >> >> > > reservation (reduce heap size), etc? All the values > that > > > are > > > > >> > >> >> parameters > > > > >> > >> >> > to > > > > >> > >> >> > > the JVM process and that need to be supplied at > process > > > > >> startup? > > > > >> > >> >> > > > > > > >> > >> >> > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann < > > > > >> > >> [hidden email]> > > > > >> > >> >> > > wrote: > > > > >> > >> >> > > > > > > >> > >> >> > > > Thanks for the clarification. 
> I have some more comments:
>
> - I would actually split the logic to compute the process memory requirements and the storing of the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.
>
> - Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier. However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily. The reasons why I believe it is unnecessary are the following: for Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.
>
> - Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on, if I'm not mistaken).
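To make the "compute outside, supply a full configuration" idea concrete, here is a rough sketch of such a utility. The memory breakdown is a simplification of the FLIP-49 model, the class name only echoes the TaskExecutorProcessUtility suggestion above, and whether managed memory is also counted into -XX:MaxDirectMemorySize depends on the allocation option discussed later in the thread:

    // A sketch only: derive JVM parameters and remaining budgets from a handful of
    // configured sizes before the TaskExecutor process is started.
    final class TaskExecutorProcessUtilityExample {

        public static void main(String[] args) {
            long mb = 1L << 20;

            // Example inputs (would normally come from the loaded configuration).
            long frameworkHeap = 128 * mb;
            long taskHeap      = 1024 * mb;
            long taskOffHeap   = 128 * mb;
            long networkMemory = 512 * mb;   // kept as direct memory in the current proposal
            long managedMemory = 1024 * mb;  // off-heap managed memory
            long jvmMetaspace  = 256 * mb;

            long heapSize = frameworkHeap + taskHeap;
            // In this sketch the direct memory limit covers task off-heap plus network buffers.
            long maxDirect = taskOffHeap + networkMemory;

            System.out.println("-Xmx" + (heapSize / mb) + "m"
                + " -Xms" + (heapSize / mb) + "m"
                + " -XX:MaxDirectMemorySize=" + (maxDirect / mb) + "m"
                + " -XX:MaxMetaspaceSize=" + (jvmMetaspace / mb) + "m");
            // Managed memory is not a JVM parameter; it would be handed to the process
            // e.g. as a dynamic property such as -Dtaskmanager.memory.managed.size=1024m.
        }
    }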
> > > > >> > >> >> > > > > > > > >> > >> >> > > > Cheers, > > > > >> > >> >> > > > Till > > > > >> > >> >> > > > > > > > >> > >> >> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang < > > > > >> > >> [hidden email]> > > > > >> > >> >> > > wrote: > > > > >> > >> >> > > > > > > > >> > >> >> > > > > Just add my 2 cents. > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > Using environment variables to override the > > > > configuration > > > > >> for > > > > >> > >> >> > different > > > > >> > >> >> > > > > taskmanagers is better. > > > > >> > >> >> > > > > We do not need to generate dedicated > flink-conf.yaml > > > for > > > > >> all > > > > >> > >> >> > > > taskmanagers. > > > > >> > >> >> > > > > A common flink-conf.yam and different environment > > > > >> variables > > > > >> > are > > > > >> > >> >> > enough. > > > > >> > >> >> > > > > By reducing the distributed cached files, it could > > > make > > > > >> > >> launching > > > > >> > >> >> a > > > > >> > >> >> > > > > taskmanager faster. > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > Stephan gives a good suggestion that we could move > > the > > > > >> logic > > > > >> > >> into > > > > >> > >> >> > > > > "GlobalConfiguration.loadConfig()" method. > > > > >> > >> >> > > > > Maybe the client could also benefit from this. > > > Different > > > > >> > users > > > > >> > >> do > > > > >> > >> >> not > > > > >> > >> >> > > > have > > > > >> > >> >> > > > > to export FLINK_CONF_DIR to update few config > > options. > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > Best, > > > > >> > >> >> > > > > Yang > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > Stephan Ewen <[hidden email]> 于2019年8月28日周三 > > > 上午1:21写道: > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > One note on the Environment Variables and > > > > Configuration > > > > >> > >> >> discussion. > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > My understanding is that passed ENV variables > are > > > > added > > > > >> to > > > > >> > >> the > > > > >> > >> >> > > > > > configuration in the > > > > "GlobalConfiguration.loadConfig()" > > > > >> > >> method > > > > >> > >> >> (or > > > > >> > >> >> > > > > > similar). > > > > >> > >> >> > > > > > For all the code inside Flink, it looks like the > > > data > > > > >> was > > > > >> > in > > > > >> > >> the > > > > >> > >> >> > > config > > > > >> > >> >> > > > > to > > > > >> > >> >> > > > > > start with, just that the scripts that compute > the > > > > >> > variables > > > > >> > >> can > > > > >> > >> >> > pass > > > > >> > >> >> > > > the > > > > >> > >> >> > > > > > values to the process without actually needing > to > > > > write > > > > >> a > > > > >> > >> file. > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > For example the > "GlobalConfiguration.loadConfig()" > > > > >> method > > > > >> > >> would > > > > >> > >> >> > take > > > > >> > >> >> > > > any > > > > >> > >> >> > > > > > ENV variable prefixed with "flink" and add it > as a > > > > >> config > > > > >> > >> key. > > > > >> > >> >> > > > > > "flink_taskmanager_memory_size=2g" would become > > > > >> > >> >> > > > "taskmanager.memory.size: > > > > >> > >> >> > > > > > 2g". > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song < > > > > >> > >> >> > [hidden email]> > > > > >> > >> >> > > > > > wrote: > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > Thanks for the comments, Till. 
> > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > I've also seen your comments on the wiki page, > > but > > > > >> let's > > > > >> > >> keep > > > > >> > >> >> the > > > > >> > >> >> > > > > > > discussion here. > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > - Regarding 'TaskExecutorSpecifics', how do > you > > > > think > > > > >> > about > > > > >> > >> >> > naming > > > > >> > >> >> > > it > > > > >> > >> >> > > > > > > 'TaskExecutorResourceSpecifics'. > > > > >> > >> >> > > > > > > - Regarding passing memory configurations into > > > task > > > > >> > >> executors, > > > > >> > >> >> > I'm > > > > >> > >> >> > > in > > > > >> > >> >> > > > > > favor > > > > >> > >> >> > > > > > > of do it via environment variables rather than > > > > >> > >> configurations, > > > > >> > >> >> > with > > > > >> > >> >> > > > the > > > > >> > >> >> > > > > > > following two reasons. > > > > >> > >> >> > > > > > > - It is easier to keep the memory options > once > > > > >> > calculate > > > > >> > >> >> not to > > > > >> > >> >> > > be > > > > >> > >> >> > > > > > > changed with environment variables rather than > > > > >> > >> configurations. > > > > >> > >> >> > > > > > > - I'm not sure whether we should write the > > > > >> > configuration > > > > >> > >> in > > > > >> > >> >> > > startup > > > > >> > >> >> > > > > > > scripts. Writing changes into the > configuration > > > > files > > > > >> > when > > > > >> > >> >> > running > > > > >> > >> >> > > > the > > > > >> > >> >> > > > > > > startup scripts does not sounds right to me. > Or > > we > > > > >> could > > > > >> > >> make > > > > >> > >> >> a > > > > >> > >> >> > > copy > > > > >> > >> >> > > > of > > > > >> > >> >> > > > > > > configuration files per flink cluster, and > make > > > the > > > > >> task > > > > >> > >> >> executor > > > > >> > >> >> > > to > > > > >> > >> >> > > > > load > > > > >> > >> >> > > > > > > from the copy, and clean up the copy after the > > > > >> cluster is > > > > >> > >> >> > shutdown, > > > > >> > >> >> > > > > which > > > > >> > >> >> > > > > > > is complicated. (I think this is also what > > Stephan > > > > >> means > > > > >> > in > > > > >> > >> >> his > > > > >> > >> >> > > > comment > > > > >> > >> >> > > > > > on > > > > >> > >> >> > > > > > > the wiki page?) > > > > >> > >> >> > > > > > > - Regarding reserving memory, I think this > > change > > > > >> should > > > > >> > be > > > > >> > >> >> > > included > > > > >> > >> >> > > > in > > > > >> > >> >> > > > > > > this FLIP. I think a big part of motivations > of > > > this > > > > >> FLIP > > > > >> > >> is > > > > >> > >> >> to > > > > >> > >> >> > > unify > > > > >> > >> >> > > > > > > memory configuration for streaming / batch and > > > make > > > > it > > > > >> > easy > > > > >> > >> >> for > > > > >> > >> >> > > > > > configuring > > > > >> > >> >> > > > > > > rocksdb memory. If we don't support memory > > > > >> reservation, > > > > >> > >> then > > > > >> > >> >> > > > streaming > > > > >> > >> >> > > > > > jobs > > > > >> > >> >> > > > > > > cannot use managed memory (neither on-heap or > > > > >> off-heap), > > > > >> > >> which > > > > >> > >> >> > > makes > > > > >> > >> >> > > > > this > > > > >> > >> >> > > > > > > FLIP incomplete. > > > > >> > >> >> > > > > > > - Regarding network memory, I think you are > > > right. 
I > > > > >> > think > > > > >> > >> we > > > > >> > >> >> > > > probably > > > > >> > >> >> > > > > > > don't need to change network stack from using > > > direct > > > > >> > >> memory to > > > > >> > >> >> > > using > > > > >> > >> >> > > > > > unsafe > > > > >> > >> >> > > > > > > native memory. Network memory size is > > > deterministic, > > > > >> > >> cannot be > > > > >> > >> >> > > > reserved > > > > >> > >> >> > > > > > as > > > > >> > >> >> > > > > > > managed memory does, and cannot be overused. I > > > think > > > > >> it > > > > >> > >> also > > > > >> > >> >> > works > > > > >> > >> >> > > if > > > > >> > >> >> > > > > we > > > > >> > >> >> > > > > > > simply keep using direct memory for network > and > > > > >> include > > > > >> > it > > > > >> > >> in > > > > >> > >> >> jvm > > > > >> > >> >> > > max > > > > >> > >> >> > > > > > > direct memory size. > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > Thank you~ > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > Xintong Song > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann > < > > > > >> > >> >> > > [hidden email]> > > > > >> > >> >> > > > > > > wrote: > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > Hi Xintong, > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > thanks for addressing the comments and > adding > > a > > > > more > > > > >> > >> >> detailed > > > > >> > >> >> > > > > > > > implementation plan. I have a couple of > > comments > > > > >> > >> concerning > > > > >> > >> >> the > > > > >> > >> >> > > > > > > > implementation plan: > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > - The name `TaskExecutorSpecifics` is not > > really > > > > >> > >> >> descriptive. > > > > >> > >> >> > > > > Choosing > > > > >> > >> >> > > > > > a > > > > >> > >> >> > > > > > > > different name could help here. > > > > >> > >> >> > > > > > > > - I'm not sure whether I would pass the > memory > > > > >> > >> >> configuration to > > > > >> > >> >> > > the > > > > >> > >> >> > > > > > > > TaskExecutor via environment variables. I > > think > > > it > > > > >> > would > > > > >> > >> be > > > > >> > >> >> > > better > > > > >> > >> >> > > > to > > > > >> > >> >> > > > > > > write > > > > >> > >> >> > > > > > > > it into the configuration one uses to start > > the > > > TM > > > > >> > >> process. > > > > >> > >> >> > > > > > > > - If possible, I would exclude the memory > > > > >> reservation > > > > >> > >> from > > > > >> > >> >> this > > > > >> > >> >> > > > FLIP > > > > >> > >> >> > > > > > and > > > > >> > >> >> > > > > > > > add this as part of a dedicated FLIP. > > > > >> > >> >> > > > > > > > - If possible, then I would exclude changes > to > > > the > > > > >> > >> network > > > > >> > >> >> > stack > > > > >> > >> >> > > > from > > > > >> > >> >> > > > > > > this > > > > >> > >> >> > > > > > > > FLIP. Maybe we can simply say that the > direct > > > > memory > > > > >> > >> needed > > > > >> > >> >> by > > > > >> > >> >> > > the > > > > >> > >> >> > > > > > > network > > > > >> > >> >> > > > > > > > stack is the framework direct memory > > > requirement. > > > > >> > >> Changing > > > > >> > >> >> how > > > > >> > >> >> > > the > > > > >> > >> >> > > > > > memory > > > > >> > >> >> > > > > > > > is allocated can happen in a second step. 
> This > > > > would > > > > >> > keep > > > > >> > >> >> the > > > > >> > >> >> > > scope > > > > >> > >> >> > > > > of > > > > >> > >> >> > > > > > > this > > > > >> > >> >> > > > > > > > FLIP smaller. > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > Cheers, > > > > >> > >> >> > > > > > > > Till > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong > Song < > > > > >> > >> >> > > > [hidden email]> > > > > >> > >> >> > > > > > > > wrote: > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > Hi everyone, > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > I just updated the FLIP document on wiki > > [1], > > > > with > > > > >> > the > > > > >> > >> >> > > following > > > > >> > >> >> > > > > > > changes. > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > - Removed open question regarding > > > > MemorySegment > > > > >> > >> >> > allocation. > > > > >> > >> >> > > As > > > > >> > >> >> > > > > > > > > discussed, we exclude this topic from > the > > > > >> scope of > > > > >> > >> this > > > > >> > >> >> > > FLIP. > > > > >> > >> >> > > > > > > > > - Updated content about JVM direct > memory > > > > >> > parameter > > > > >> > >> >> > > according > > > > >> > >> >> > > > to > > > > >> > >> >> > > > > > > > recent > > > > >> > >> >> > > > > > > > > discussions, and moved the other > options > > to > > > > >> > >> "Rejected > > > > >> > >> >> > > > > > Alternatives" > > > > >> > >> >> > > > > > > > for > > > > >> > >> >> > > > > > > > > the > > > > >> > >> >> > > > > > > > > moment. > > > > >> > >> >> > > > > > > > > - Added implementation steps. > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > Thank you~ > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > Xintong Song > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > [1] > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> > > > > >> > > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan > > Ewen < > > > > >> > >> >> > [hidden email] > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > wrote: > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > @Xintong: Concerning "wait for memory > > users > > > > >> before > > > > >> > >> task > > > > >> > >> >> > > dispose > > > > >> > >> >> > > > > and > > > > >> > >> >> > > > > > > > > memory > > > > >> > >> >> > > > > > > > > > release": I agree, that's how it should > > be. > > > > >> Let's > > > > >> > >> try it > > > > >> > >> >> > out. 
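One possible shape of "wait for memory users before task dispose and memory release" is plain reference counting, sketched below with made-up names (not the actual MemoryManager API): a segment is only returned to its pool once every registered user has signalled completion.

    import java.util.concurrent.atomic.AtomicInteger;

    final class RefCountedSegment {
        private final Runnable returnToPool;                       // callback that gives the memory back
        private final AtomicInteger users = new AtomicInteger(1);  // 1 = the owning task

        RefCountedSegment(Runnable returnToPool) {
            this.returnToPool = returnToPool;
        }

        // Called by spilling / sorting threads before they start using the segment.
        void retain() {
            users.incrementAndGet();
        }

        // Called by each user (including the owning task) when it is done.
        void release() {
            if (users.decrementAndGet() == 0) {
                returnToPool.run();   // only now is it safe to reuse or free the memory
            }
        }
    }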
> > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > @Xintong @Jingsong: Concerning " JVM > does > > > not > > > > >> wait > > > > >> > >> for > > > > >> > >> >> GC > > > > >> > >> >> > > when > > > > >> > >> >> > > > > > > > allocating > > > > >> > >> >> > > > > > > > > > direct memory buffer": There seems to be > > > > pretty > > > > >> > >> >> elaborate > > > > >> > >> >> > > logic > > > > >> > >> >> > > > > to > > > > >> > >> >> > > > > > > free > > > > >> > >> >> > > > > > > > > > buffers when allocating new ones. See > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> > > > > >> > > > > > >> > > > > > > > > > > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > @Till: Maybe. If we assume that the JVM > > > > default > > > > >> > works > > > > >> > >> >> (like > > > > >> > >> >> > > > going > > > > >> > >> >> > > > > > > with > > > > >> > >> >> > > > > > > > > > option 2 and not setting > > > > >> "-XX:MaxDirectMemorySize" > > > > >> > at > > > > >> > >> >> all), > > > > >> > >> >> > > > then > > > > >> > >> >> > > > > I > > > > >> > >> >> > > > > > > > think > > > > >> > >> >> > > > > > > > > it > > > > >> > >> >> > > > > > > > > > should be okay to set > > > > "-XX:MaxDirectMemorySize" > > > > >> to > > > > >> > >> >> > > > > > > > > > "off_heap_managed_memory + > direct_memory" > > > even > > > > >> if > > > > >> > we > > > > >> > >> use > > > > >> > >> >> > > > RocksDB. > > > > >> > >> >> > > > > > > That > > > > >> > >> >> > > > > > > > > is a > > > > >> > >> >> > > > > > > > > > big if, though, I honestly have no idea > :D > > > > >> Would be > > > > >> > >> >> good to > > > > >> > >> >> > > > > > > understand > > > > >> > >> >> > > > > > > > > > this, though, because this would affect > > > option > > > > >> (2) > > > > >> > >> and > > > > >> > >> >> > option > > > > >> > >> >> > > > > > (1.2). > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong > > > Song < > > > > >> > >> >> > > > > > [hidden email]> > > > > >> > >> >> > > > > > > > > > wrote: > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Thanks for the inputs, Jingsong. > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Let me try to summarize your points. > > > Please > > > > >> > correct > > > > >> > >> >> me if > > > > >> > >> >> > > I'm > > > > >> > >> >> > > > > > > wrong. > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > - Memory consumers should always > > avoid > > > > >> > returning > > > > >> > >> >> > memory > > > > >> > >> >> > > > > > segments > > > > >> > >> >> > > > > > > > to > > > > >> > >> >> > > > > > > > > > > memory manager while there are > still > > > > >> > un-cleaned > > > > >> > >> >> > > > structures / > > > > >> > >> >> > > > > > > > threads > > > > >> > >> >> > > > > > > > > > > that > > > > >> > >> >> > > > > > > > > > > may use the memory. 
Otherwise, it > > would > > > > >> cause > > > > >> > >> >> serious > > > > >> > >> >> > > > > problems > > > > >> > >> >> > > > > > > by > > > > >> > >> >> > > > > > > > > > having > > > > >> > >> >> > > > > > > > > > > multiple consumers trying to use > the > > > same > > > > >> > memory > > > > >> > >> >> > > segment. > > > > >> > >> >> > > > > > > > > > > - JVM does not wait for GC when > > > > allocating > > > > >> > >> direct > > > > >> > >> >> > memory > > > > >> > >> >> > > > > > buffer. > > > > >> > >> >> > > > > > > > > > > Therefore even we set proper max > > direct > > > > >> memory > > > > >> > >> size > > > > >> > >> >> > > limit, > > > > >> > >> >> > > > > we > > > > >> > >> >> > > > > > > may > > > > >> > >> >> > > > > > > > > > still > > > > >> > >> >> > > > > > > > > > > encounter direct memory oom if the > GC > > > > >> cleaning > > > > >> > >> >> memory > > > > >> > >> >> > > > slower > > > > >> > >> >> > > > > > > than > > > > >> > >> >> > > > > > > > > the > > > > >> > >> >> > > > > > > > > > > direct memory allocation. > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Am I understanding this correctly? > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Thank you~ > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Xintong Song > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM > > > JingsongLee > > > > < > > > > >> > >> >> > > > > > > [hidden email] > > > > >> > >> >> > > > > > > > > > > .invalid> > > > > >> > >> >> > > > > > > > > > > wrote: > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > Hi stephan: > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > About option 2: > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > if additional threads not cleanly > shut > > > > down > > > > >> > >> before > > > > >> > >> >> we > > > > >> > >> >> > can > > > > >> > >> >> > > > > exit > > > > >> > >> >> > > > > > > the > > > > >> > >> >> > > > > > > > > > task: > > > > >> > >> >> > > > > > > > > > > > In the current case of memory reuse, > > it > > > > has > > > > >> > >> freed up > > > > >> > >> >> > the > > > > >> > >> >> > > > > memory > > > > >> > >> >> > > > > > > it > > > > >> > >> >> > > > > > > > > > > > uses. If this memory is used by > other > > > > tasks > > > > >> > and > > > > >> > >> >> > > > asynchronous > > > > >> > >> >> > > > > > > > threads > > > > >> > >> >> > > > > > > > > > > > of exited task may still be > writing, > > > > there > > > > >> > will > > > > >> > >> be > > > > >> > >> >> > > > > concurrent > > > > >> > >> >> > > > > > > > > security > > > > >> > >> >> > > > > > > > > > > > problems, and even lead to errors > in > > > user > > > > >> > >> computing > > > > >> > >> >> > > > results. > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > So I think this is a serious and > > > > intolerable > > > > >> > >> bug, No > > > > >> > >> >> > > matter > > > > >> > >> >> > > > > > what > > > > >> > >> >> > > > > > > > the > > > > >> > >> >> > > > > > > > > > > > option is, it should be avoided. 
> > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > About direct memory cleaned by GC: > > > > >> > >> >> > > > > > > > > > > > I don't think it is a good idea, > I've > > > > >> > >> encountered so > > > > >> > >> >> > many > > > > >> > >> >> > > > > > > > situations > > > > >> > >> >> > > > > > > > > > > > that it's too late for GC to cause > > > > >> > DirectMemory > > > > >> > >> >> OOM. > > > > >> > >> >> > > > Release > > > > >> > >> >> > > > > > and > > > > >> > >> >> > > > > > > > > > > > allocate DirectMemory depend on the > > > type > > > > of > > > > >> > user > > > > >> > >> >> job, > > > > >> > >> >> > > > which > > > > >> > >> >> > > > > is > > > > >> > >> >> > > > > > > > > > > > often beyond our control. > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > Best, > > > > >> > >> >> > > > > > > > > > > > Jingsong Lee > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > >> ------------------------------------------------------------------ > > > > >> > >> >> > > > > > > > > > > > From:Stephan Ewen <[hidden email] > > > > > > >> > >> >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 > > > > >> > >> >> > > > > > > > > > > > To:dev <[hidden email]> > > > > >> > >> >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: > Unified > > > > >> Memory > > > > >> > >> >> > > Configuration > > > > >> > >> >> > > > > for > > > > >> > >> >> > > > > > > > > > > > TaskExecutors > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > My main concern with option 2 > > (manually > > > > >> release > > > > >> > >> >> memory) > > > > >> > >> >> > > is > > > > >> > >> >> > > > > that > > > > >> > >> >> > > > > > > > > > segfaults > > > > >> > >> >> > > > > > > > > > > > in the JVM send off all sorts of > > alarms > > > on > > > > >> user > > > > >> > >> >> ends. > > > > >> > >> >> > So > > > > >> > >> >> > > we > > > > >> > >> >> > > > > > need > > > > >> > >> >> > > > > > > to > > > > >> > >> >> > > > > > > > > > > > guarantee that this never happens. > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > The trickyness is in tasks that uses > > > data > > > > >> > >> >> structures / > > > > >> > >> >> > > > > > algorithms > > > > >> > >> >> > > > > > > > > with > > > > >> > >> >> > > > > > > > > > > > additional threads, like hash table > > > > >> spill/read > > > > >> > >> and > > > > >> > >> >> > > sorting > > > > >> > >> >> > > > > > > threads. > > > > >> > >> >> > > > > > > > > We > > > > >> > >> >> > > > > > > > > > > need > > > > >> > >> >> > > > > > > > > > > > to ensure that these cleanly shut > down > > > > >> before > > > > >> > we > > > > >> > >> can > > > > >> > >> >> > exit > > > > >> > >> >> > > > the > > > > >> > >> >> > > > > > > task. > > > > >> > >> >> > > > > > > > > > > > I am not sure that we have that > > > guaranteed > > > > >> > >> already, > > > > >> > >> >> > > that's > > > > >> > >> >> > > > > why > > > > >> > >> >> > > > > > > > option > > > > >> > >> >> > > > > > > > > > 1.1 > > > > >> > >> >> > > > > > > > > > > > seemed simpler to me. 
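A minimal sketch of the shutdown ordering described above, with made-up names: before a task exits and its memory is released, it interrupts and joins the helper threads (spilling, sorting, and so on) that it started, so no thread can keep writing into memory that is about to be reused.

    import java.util.ArrayList;
    import java.util.List;

    final class TaskShutdownExample {
        private final List<Thread> helperThreads = new ArrayList<>();

        Thread startHelper(Runnable work, String name) {
            Thread t = new Thread(work, name);
            helperThreads.add(t);
            t.start();
            return t;
        }

        // Must run before the task's memory segments are returned to the memory manager.
        void shutDownHelpersAndWait() throws InterruptedException {
            for (Thread t : helperThreads) {
                t.interrupt();
            }
            for (Thread t : helperThreads) {
                t.join();   // block until the helper has really stopped using the memory
            }
        }
    }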
On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarizing it this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep the memory manager segfault safe, as long as we always de-allocate the memory segment when it is released by the memory consumers. A segfault can only happen if a memory consumer continues using the buffer of a memory segment after releasing it, in which case we do want the job to fail so that we detect the memory leak early.

For option 1.2, I don't think this is a good idea. Not only may the assumption (regular GC is enough to clean direct buffers) not always hold, it also makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries.
If the library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if it doesn't hit the JVM default max direct memory limit, we cannot get a direct memory OOM, and it becomes super hard to understand which part of the configuration needs to be updated.

For option 1.1, it has a similar problem as 1.2 if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~

Xintong Song


On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently:

We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
  - The "-XX:MaxDirectMemorySize" limit (option 1.1) is one way to do this.
  - Another way could be to have dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.

(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC by some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety-margin cutoff value) or we will exceed the container memory.

If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

- Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).

- Option 2. It is actually quite simple to implement; we could try how segfault safe we are at the moment.

- Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan
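To make the two approaches above concrete, here is a minimal Java sketch of the difference between GC-managed direct buffers (option 1) and manual allocation via Unsafe (option 2). This is not code from the FLIP or from Flink; the class and method names are invented for illustration only.

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;

    // Illustrative only: contrasts the two allocation styles discussed in the thread.
    public final class AllocationSketch {

        // Option 1: direct buffers are counted against -XX:MaxDirectMemorySize and
        // their memory is only returned once the garbage collector actually runs.
        static ByteBuffer allocateGcManaged(int size) {
            return ByteBuffer.allocateDirect(size);
        }

        // Option 2: memory is allocated and freed manually via Unsafe. It is not
        // limited by -XX:MaxDirectMemorySize and is returned immediately on
        // freeMemory(), but using the address after freeing it can segfault the JVM.
        static long allocateManually(long size) throws Exception {
            return unsafe().allocateMemory(size);
        }

        static void releaseManually(long address) throws Exception {
            unsafe().freeMemory(address);
        }

        private static sun.misc.Unsafe unsafe() throws Exception {
            Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (sun.misc.Unsafe) f.get(null);
        }
    }

The trade-off debated in the thread follows directly from the sketch: the GC-managed variant is safe but only frees memory when a collection happens, while the manual variant frees memory deterministically but can crash the whole process on use-after-free.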
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song


On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification, Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then:

- Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
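As a back-of-the-envelope illustration of the 1 GB example above, the two alternatives differ only in the value chosen for the same JVM flag. The numbers are taken from the scenario in the mail; the class name and printed strings are invented for illustration.

    // Illustrative only: the arithmetic behind the 1 GB example above.
    public final class MaxDirectMemoryExample {

        public static void main(String[] args) {
            long totalProcessMemoryMb = 1024;          // total process memory: 1 GB
            long jvmDirectMemoryMb = 200;              // task off-heap memory + JVM overhead
            long otherMemoryMb = totalProcessMemoryMb - jvmDirectMemoryMb; // 800 MB

            // Alternative 2: cap direct memory exactly at the configured budget.
            String alternative2 = "-XX:MaxDirectMemorySize=" + jvmDirectMemoryMb + "m";

            // Alternative 3: use a cap so large it is effectively unlimited (1 TB here),
            // relying on the container limit rather than a direct memory OOM.
            String alternative3 = "-XX:MaxDirectMemorySize=" + (1024L * 1024L) + "m";

            System.out.println("Other memory pools: " + otherMemoryMb + " MB");
            System.out.println("Alternative 2: " + alternative2); // ...=200m
            System.out.println("Alternative 3: " + alternative3); // ...=1048576m
        }
    }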
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 22:07, Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of the risk of overusing memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that overall container memory usage exceeds the budget.

Thank you~

Xintong Song


On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP, Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up?
Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
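For context, a purely hypothetical sketch of what such a "memory reservation call" could look like is shown below. This is not the actual MemoryManager API and not part of the FLIP; all names are invented, and it only illustrates an interaction that does not hand out memory segments.

    // Hypothetical sketch only, names invented for illustration.
    interface ManagedMemoryReservations {

        // Reserve `size` bytes of managed memory on behalf of `owner`;
        // fails if the managed memory budget is exhausted.
        void reserveMemory(Object owner, long size);

        // Return a previous reservation of `size` bytes to the pool.
        void releaseMemory(Object owner, long size);
    }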
> > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Concerning the > > second > > > > open > > > > >> > >> >> question > > > > >> > >> >> > > about > > > > >> > >> >> > > > > > > setting > > > > >> > >> >> > > > > > > > > or > > > > >> > >> >> > > > > > > > > > > not > > > > >> > >> >> > > > > > > > > > > > > > > setting > > > > >> > >> >> > > > > > > > > > > > > > > > a > > > > >> > >> >> > > > > > > > > > > > > > > > > > max > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > direct memory > > limit, I > > > > >> would > > > > >> > >> also > > > > >> > >> >> be > > > > >> > >> >> > > > > > interested > > > > >> > >> >> > > > > > > > why > > > > >> > >> >> > > > > > > > > > > Yang > > > > >> > >> >> > > > > > > > > > > > > Wang > > > > >> > >> >> > > > > > > > > > > > > > > > > thinks > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > leaving it open > > would > > > be > > > > >> > best. > > > > >> > >> My > > > > >> > >> >> > > concern > > > > >> > >> >> > > > > > about > > > > >> > >> >> > > > > > > > > this > > > > >> > >> >> > > > > > > > > > > > would > > > > >> > >> >> > > > > > > > > > > > > be > > > > >> > >> >> > > > > > > > > > > > > > > > that > > > > >> > >> >> > > > > > > > > > > > > > > > > we > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > would > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > be in a similar > > > > situation > > > > >> as > > > > >> > we > > > > >> > >> >> are > > > > >> > >> >> > now > > > > >> > >> >> > > > > with > > > > >> > >> >> > > > > > > the > > > > >> > >> >> > > > > > > > > > > > > > > > > RocksDBStateBackend. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > If > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > the different > memory > > > > pools > > > > >> > are > > > > >> > >> not > > > > >> > >> >> > > > clearly > > > > >> > >> >> > > > > > > > > separated > > > > >> > >> >> > > > > > > > > > > and > > > > >> > >> >> > > > > > > > > > > > > can > > > > >> > >> >> > > > > > > > > > > > > > > > spill > > > > >> > >> >> > > > > > > > > > > > > > > > > > over > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > to > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > a different pool, > > then > > > > it > > > > >> is > > > > >> > >> quite > > > > >> > >> >> > hard > > > > >> > >> >> > > > to > > > > >> > >> >> > > > > > > > > understand > > > > >> > >> >> > > > > > > > > > > > what > > > > >> > >> >> > > > > > > > > > > > > > > > exactly > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > causes a > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > process to get > > killed > > > > for > > > > >> > using > > > > >> > >> >> too > > > > >> > >> >> > > much > > > > >> > >> >> > > > > > > memory. > > > > >> > >> >> > > > > > > > > This > > > > >> > >> >> > > > > > > > > > > > could > > > > >> > >> >> > > > > > > > > > > > > > > then > > > > >> > >> >> > > > > > > > > > > > > > > > > > easily > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > lead to a similar > > > > >> situation > > > > >> > >> what > > > > >> > >> >> we > > > > >> > >> >> > > have > > > > >> > >> >> > > > > with > > > > >> > >> >> > > > > > > the > > > > >> > >> >> > > > > > > > > > > > > > cutoff-ratio. 
So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3 where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some down sides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another down side is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.
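To make the second down side concrete, here is a minimal, non-Flink sketch (plain JDK behavior only; nothing here is Flink code) of why native memory behind direct buffers is only reclaimed through heap GC activity:

    import java.nio.ByteBuffer;

    // Minimal sketch of the JVM behavior described above (not Flink code).
    // The native memory behind a direct ByteBuffer is only released after the
    // tiny heap-side ByteBuffer object has been garbage-collected. With a small
    // -XX:MaxDirectMemorySize the JVM forces a GC (and eventually throws
    // "OutOfMemoryError: Direct buffer memory") when the limit is reached; with
    // a huge limit neither happens, so a workload with little heap churn can keep
    // growing its native footprint until the container is killed externally.
    public class DirectMemoryPressure {
        public static void main(String[] args) {
            for (int i = 0; i < 1_000_000; i++) {
                // ~1 MB of native memory per iteration, only a few bytes on the heap.
                // The reference is dropped immediately, but reclamation still waits
                // for a GC that the low heap usage may never trigger.
                ByteBuffer unused = ByteBuffer.allocateDirect(1 << 20);
            }
        }
    }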
*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error. I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely on the client side checking only, because for standalone clusters TaskManagers on different machines may have different configurations and the client does not see them. What do you think?

Thank you~

Xintong Song
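As a rough illustration of the kind of check being discussed, a consistency validation might look like the following sketch (all names and the pool breakdown are invented for illustration; this is not the actual Flink configuration API):

    // Hypothetical sketch of a fail-fast consistency check between an explicitly
    // configured total process memory and explicitly configured fine-grained pools.
    public final class MemoryConfigCheck {

        static void validate(
                long totalProcessBytes,
                long heapBytes,
                long managedBytes,
                long networkBytes,
                long taskOffHeapBytes,
                long metaspaceBytes,
                long jvmOverheadBytes) {

            long sumOfPools = heapBytes + managedBytes + networkBytes
                    + taskOffHeapBytes + metaspaceBytes + jvmOverheadBytes;

            if (sumOfPools > totalProcessBytes) {
                // Conflicting explicit settings: report them instead of silently shrinking a pool.
                throw new IllegalArgumentException(
                        "Configured memory pools sum to " + sumOfPools
                                + " bytes, which exceeds the configured total process memory of "
                                + totalProcessBytes + " bytes.");
            }
        }
    }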
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configurations are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we can set -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation

If the sum of the fine-grained memory (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

Xintong Song <[hidden email]> wrote on Wed, Aug 7, 2019 at 10:14 PM:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.
This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted for, with individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1]. (Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3.
But after giving it a second thought, I think even for alternative 3 using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song
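For readers less familiar with the two allocation paths being contrasted here, the following minimal sketch (plain JDK code, not Flink's memory classes; the thread's Unsafe.allocate() corresponds to Unsafe.allocateMemory()) shows why Unsafe-allocated memory does not count against -XX:MaxDirectMemorySize while direct ByteBuffers do:

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;
    import sun.misc.Unsafe;

    // Sketch of the two allocation paths discussed above (not Flink code).
    public class AllocationPaths {
        public static void main(String[] args) throws Exception {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            Unsafe unsafe = (Unsafe) field.get(null);

            // Native memory: not tracked by the JVM's direct-memory accounting,
            // i.e. invisible to -XX:MaxDirectMemorySize.
            long address = unsafe.allocateMemory(64L << 20);
            unsafe.freeMemory(address);

            // Direct memory: counted against -XX:MaxDirectMemorySize, so the flag
            // only needs to cover this kind of usage (task off-heap memory and
            // JVM overhead in the terminology of this thread).
            ByteBuffer direct = ByteBuffer.allocateDirect(64 << 20);
            System.out.println("allocated a direct buffer of " + direct.capacity() + " bytes");
        }
    }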
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: The user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till.
Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then:

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.
Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
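The arithmetic of the example can be summarized in a few lines (a plain illustration of the numbers above, not Flink code; 1GB is treated as 1000MB to match the example's round figures):

    // Budget arithmetic for the example above. The key point: under alternative 2,
    // raising the direct-memory budget to avoid OOMs directly shrinks the other pools,
    // while alternative 3 leaves the pools untouched and relies on an effectively
    // unlimited -XX:MaxDirectMemorySize.
    public class BudgetExample {
        public static void main(String[] args) {
            final long totalProcessMb = 1000;

            long directBudgetMb = 200;   // Task Off-Heap Memory + JVM Overhead
            long otherPoolsMb = totalProcessMb - directBudgetMb;   // 800 MB
            System.out.println("alternative 2: -XX:MaxDirectMemorySize=" + directBudgetMb
                    + "m, other pools " + otherPoolsMb + "m");

            // The user runs into direct OOMs and raises the budget to 250 MB.
            directBudgetMb = 250;
            otherPoolsMb = totalProcessMb - directBudgetMb;   // 750 MB
            System.out.println("alternative 2 (bumped): -XX:MaxDirectMemorySize=" + directBudgetMb
                    + "m, other pools " + otherPoolsMb + "m");

            // Alternative 3: pools stay at 800 MB, the flag is set to e.g. 1 TB.
            System.out.println("alternative 3: -XX:MaxDirectMemorySize=1t, other pools 800m");
        }
    }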
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 w.r.t. memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory with a fixed value.
> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

Xintong Song <[hidden email]> wrote on Tue, Aug 13, 2019 at 22:07:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement on how memory consumers use it.
About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of the risk of over-using memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow-up?
Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
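To illustrate the distinction being drawn here, the following is a purely hypothetical sketch (the interface and method names are invented for illustration and are not the actual MemoryManager API) of how segment allocation and plain memory reservation could live side by side:

    import java.nio.ByteBuffer;
    import java.util.List;

    // Hypothetical interface, for illustration only (not Flink's MemoryManager).
    public interface ManagedMemoryAccounting {

        // Existing style of interaction: hand out actual memory segments,
        // used mainly by batch operators.
        List<ByteBuffer> allocatePages(Object owner, int numberOfPages);

        // New style of interaction: only book-keep a reservation; the consumer
        // (e.g. a RocksDB state backend in a streaming job) allocates the memory
        // natively on its own and stays within the reserved amount.
        void reserveMemory(Object owner, long sizeInBytes);

        // Return a previous reservation.
        void releaseMemory(Object owner, long sizeInBytes);
    }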
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation as what we have with the cutoff-ratio.
> > > > >> > >> >> > > > > > > > > > > > > > > > So > > > > >> > >> >> > > > > > > > > > > > > > > > > > why > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > not > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > setting a sane > > default > > > > >> value > > > > >> > >> for > > > > >> > >> >> max > > > > >> > >> >> > > > direct > > > > >> > >> >> > > > > > > > memory > > > > >> > >> >> > > > > > > > > > and > > > > >> > >> >> > > > > > > > > > > > > giving > > > > >> > >> >> > > > > > > > > > > > > > > the > > > > >> > >> >> > > > > > > > > > > > > > > > > > user > > > > >> > >> >> > > > > > > > > > > > > > > > > > > an > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > option to increase > > it > > > if > > > > >> he > > > > >> > >> runs > > > > >> > >> >> into > > > > >> > >> >> > > an > > > > >> > >> >> > > > > OOM. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > @Xintong, how > would > > > > >> > >> alternative 2 > > > > >> > >> >> > lead > > > > >> > >> >> > > to > > > > >> > >> >> > > > > > lower > > > > >> > >> >> > > > > > > > > > memory > > > > >> > >> >> > > > > > > > > > > > > > > > utilization > > > > >> > >> >> > > > > > > > > > > > > > > > > > than > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > alternative 3 > where > > we > > > > set > > > > >> > the > > > > >> > >> >> direct > > > > >> > >> >> > > > > memory > > > > >> > >> >> > > > > > > to a > > > > >> > >> >> > > > > > > > > > > higher > > > > >> > >> >> > > > > > > > > > > > > > value? > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Cheers, > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Till > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, > 2019 > > at > > > > >> 9:12 > > > > >> > AM > > > > >> > >> >> > Xintong > > > > >> > >> >> > > > > Song < > > > > >> > >> >> > > > > > > > > > > > > > > > [hidden email] > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Thanks for the > > > > feedback, > > > > >> > >> Yang. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Regarding your > > > > comments: > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > *Native and > Direct > > > > >> Memory* > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > I think setting > a > > > very > > > > >> > large > > > > >> > >> max > > > > >> > >> >> > > direct > > > > >> > >> >> > > > > > > memory > > > > >> > >> >> > > > > > > > > size > > > > >> > >> >> > > > > > > > > > > > > > > definitely > > > > >> > >> >> > > > > > > > > > > > > > > > > has > > > > >> > >> >> > > > > > > > > > > > > > > > > > > some > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > good sides. 
> E.g., > > we > > > > do > > > > >> not > > > > >> > >> >> worry > > > > >> > >> >> > > about > > > > >> > >> >> > > > > > > direct > > > > >> > >> >> > > > > > > > > OOM, > > > > >> > >> >> > > > > > > > > > > and > > > > >> > >> >> > > > > > > > > > > > > we > > > > >> > >> >> > > > > > > > > > > > > > > > don't > > > > >> > >> >> > > > > > > > > > > > > > > > > > even > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > need > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > to allocate > > managed > > > / > > > > >> > network > > > > >> > >> >> > memory > > > > >> > >> >> > > > with > > > > >> > >> >> > > > > > > > > > > > > > Unsafe.allocate() . > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > However, there > are > > > > also > > > > >> > some > > > > >> > >> >> down > > > > >> > >> >> > > sides > > > > >> > >> >> > > > > of > > > > >> > >> >> > > > > > > > doing > > > > >> > >> >> > > > > > > > > > > this. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > - One thing I > > can > > > > >> think > > > > >> > >> of is > > > > >> > >> >> > that > > > > >> > >> >> > > > if > > > > >> > >> >> > > > > a > > > > >> > >> >> > > > > > > task > > > > >> > >> >> > > > > > > > > > > > executor > > > > >> > >> >> > > > > > > > > > > > > > > > > container > > > > >> > >> >> > > > > > > > > > > > > > > > > > is > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > killed due to > > > > >> overusing > > > > >> > >> >> memory, > > > > >> > >> >> > it > > > > >> > >> >> > > > > could > > > > >> > >> >> > > > > > > be > > > > >> > >> >> > > > > > > > > hard > > > > >> > >> >> > > > > > > > > > > for > > > > >> > >> >> > > > > > > > > > > > > use > > > > >> > >> >> > > > > > > > > > > > > > > to > > > > >> > >> >> > > > > > > > > > > > > > > > > know > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > which > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > part > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > of the memory > > is > > > > >> > overused. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > - Another > down > > > side > > > > >> is > > > > >> > >> that > > > > >> > >> >> the > > > > >> > >> >> > > JVM > > > > >> > >> >> > > > > > never > > > > >> > >> >> > > > > > > > > > trigger > > > > >> > >> >> > > > > > > > > > > GC > > > > >> > >> >> > > > > > > > > > > > > due > > > > >> > >> >> > > > > > > > > > > > > > > to > > > > >> > >> >> > > > > > > > > > > > > > > > > > > reaching > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > max > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > direct memory > > > > limit, > > > > >> > >> because > > > > >> > >> >> the > > > > >> > >> >> > > > limit > > > > >> > >> >> > > > > > is > > > > >> > >> >> > > > > > > > too > > > > >> > >> >> > > > > > > > > > high > > > > >> > >> >> > > > > > > > > > > > to > > > > >> > >> >> > > > > > > > > > > > > be > > > > >> > >> >> > > > > > > > > > > > > > > > > > reached. 
> > > > >> > >> >> > > > > > > > > > > > > > > > > > > > That > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > means we kind > > of > > > > >> relay > > > > >> > on > > > > >> > >> >> heap > > > > >> > >> >> > > > memory > > > > >> > >> >> > > > > to > > > > >> > >> >> > > > > > > > > trigger > > > > >> > >> >> > > > > > > > > > > GC > > > > >> > >> >> > > > > > > > > > > > > and > > > > >> > >> >> > > > > > > > > > > > > > > > > release > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > direct > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory. That > > > could > > > > >> be a > > > > >> > >> >> problem > > > > >> > >> >> > in > > > > >> > >> >> > > > > cases > > > > >> > >> >> > > > > > > > where > > > > >> > >> >> > > > > > > > > > we > > > > >> > >> >> > > > > > > > > > > > have > > > > >> > >> >> > > > > > > > > > > > > > > more > > > > >> > >> >> > > > > > > > > > > > > > > > > > direct > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > usage but not > > > > enough > > > > >> > heap > > > > >> > >> >> > activity > > > > >> > >> >> > > > to > > > > >> > >> >> > > > > > > > trigger > > > > >> > >> >> > > > > > > > > > the > > > > >> > >> >> > > > > > > > > > > > GC. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Maybe you can > > share > > > > your > > > > >> > >> reasons > > > > >> > >> >> > for > > > > >> > >> >> > > > > > > preferring > > > > >> > >> >> > > > > > > > > > > > setting a > > > > >> > >> >> > > > > > > > > > > > > > > very > > > > >> > >> >> > > > > > > > > > > > > > > > > > large > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > value, > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > if there are > > > anything > > > > >> else > > > > >> > I > > > > >> > >> >> > > > overlooked. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > *Memory > > Calculation* > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > If there is any > > > > conflict > > > > >> > >> between > > > > >> > >> >> > > > multiple > > > > >> > >> >> > > > > > > > > > > configuration > > > > >> > >> >> > > > > > > > > > > > > > that > > > > >> > >> >> > > > > > > > > > > > > > > > user > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > explicitly > > > specified, > > > > I > > > > >> > >> think we > > > > >> > >> >> > > should > > > > >> > >> >> > > > > > throw > > > > >> > >> >> > > > > > > > an > > > > >> > >> >> > > > > > > > > > > error. 
> > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > I think doing > > > checking > > > > >> on > > > > >> > the > > > > >> > >> >> > client > > > > >> > >> >> > > > side > > > > >> > >> >> > > > > > is > > > > >> > >> >> > > > > > > a > > > > >> > >> >> > > > > > > > > good > > > > >> > >> >> > > > > > > > > > > > idea, > > > > >> > >> >> > > > > > > > > > > > > > so > > > > >> > >> >> > > > > > > > > > > > > > > > that > > > > >> > >> >> > > > > > > > > > > > > > > > > > on > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Yarn / > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > K8s we can > > discover > > > > the > > > > >> > >> problem > > > > >> > >> >> > > before > > > > >> > >> >> > > > > > > > submitting > > > > >> > >> >> > > > > > > > > > the > > > > >> > >> >> > > > > > > > > > > > > Flink > > > > >> > >> >> > > > > > > > > > > > > > > > > > cluster, > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > which > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > is always a good > > > > thing. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > But we can not > > only > > > > >> rely on > > > > >> > >> the > > > > >> > >> >> > > client > > > > >> > >> >> > > > > side > > > > >> > >> >> > > > > > > > > > checking, > > > > >> > >> >> > > > > > > > > > > > > > because > > > > >> > >> >> > > > > > > > > > > > > > > > for > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > standalone > cluster > > > > >> > >> TaskManagers > > > > >> > >> >> on > > > > >> > >> >> > > > > > different > > > > >> > >> >> > > > > > > > > > machines > > > > >> > >> >> > > > > > > > > > > > may > > > > >> > >> >> > > > > > > > > > > > > > > have > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > different > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > configurations > and > > > the > > > > >> > client > > > > >> > >> >> does > > > > >> > >> >> > > see > > > > >> > >> >> > > > > > that. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > What do you > think? > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, > > 2019 > > > at > > > > >> 5:09 > > > > >> > >> PM > > > > >> > >> >> Yang > > > > >> > >> >> > > > Wang > > > > >> > >> >> > > > > < > > > > >> > >> >> > > > > > > > > > > > > > > > [hidden email]> > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for > your > > > > >> detailed > > > > >> > >> >> > proposal. 
> > > > >> > >> >> > > > > After > > > > >> > >> >> > > > > > > all > > > > >> > >> >> > > > > > > > > the > > > > >> > >> >> > > > > > > > > > > > memory > > > > >> > >> >> > > > > > > > > > > > > > > > > > > configuration > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > are > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it > > > will > > > > be > > > > >> > more > > > > >> > >> >> > > powerful > > > > >> > >> >> > > > to > > > > >> > >> >> > > > > > > > control > > > > >> > >> >> > > > > > > > > > the > > > > >> > >> >> > > > > > > > > > > > > flink > > > > >> > >> >> > > > > > > > > > > > > > > > > memory > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > just have few > > > > >> questions > > > > >> > >> about > > > > >> > >> >> it. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native > and > > > > Direct > > > > >> > >> Memory > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not > > > > >> differentiate > > > > >> > >> user > > > > >> > >> >> > direct > > > > >> > >> >> > > > > > memory > > > > >> > >> >> > > > > > > > and > > > > >> > >> >> > > > > > > > > > > native > > > > >> > >> >> > > > > > > > > > > > > > > memory. > > > > >> > >> >> > > > > > > > > > > > > > > > > > They > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > are > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > all > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > included in > task > > > > >> off-heap > > > > >> > >> >> memory. > > > > >> > >> >> > > > > Right? > > > > >> > >> >> > > > > > > So i > > > > >> > >> >> > > > > > > > > > don’t > > > > >> > >> >> > > > > > > > > > > > > think > > > > >> > >> >> > > > > > > > > > > > > > > we > > > > >> > >> >> > > > > > > > > > > > > > > > > > could > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > not > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > set > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > the > > > > >> > -XX:MaxDirectMemorySize > > > > >> > >> >> > > > properly. I > > > > >> > >> >> > > > > > > > prefer > > > > >> > >> >> > > > > > > > > > > > leaving > > > > >> > >> >> > > > > > > > > > > > > > it a > > > > >> > >> >> > > > > > > > > > > > > > > > > very > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > large > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > value. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory > > > > >> Calculation > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum of > > and > > > > >> > >> fine-grained > > > > >> > >> >> > > > > > > memory(network > > > > >> > >> >> > > > > > > > > > > memory, > > > > >> > >> >> > > > > > > > > > > > > > > managed > > > > >> > >> >> > > > > > > > > > > > > > > > > > > memory, > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > etc.) 
> > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > is larger than > > > total > > > > >> > >> process > > > > >> > >> >> > > memory, > > > > >> > >> >> > > > > how > > > > >> > >> >> > > > > > do > > > > >> > >> >> > > > > > > > we > > > > >> > >> >> > > > > > > > > > deal > > > > >> > >> >> > > > > > > > > > > > > with > > > > >> > >> >> > > > > > > > > > > > > > > this > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > situation? > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Do > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > we need to > check > > > the > > > > >> > memory > > > > >> > >> >> > > > > configuration > > > > >> > >> >> > > > > > > in > > > > >> > >> >> > > > > > > > > > > client? > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > > > > >> > >> >> > > [hidden email]> > > > > >> > >> >> > > > > > > > > > 于2019年8月7日周三 > > > > >> > >> >> > > > > > > > > > > > > > > 下午10:14写道: > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would > like > > to > > > > >> start > > > > >> > a > > > > >> > >> >> > > discussion > > > > >> > >> >> > > > > > > thread > > > > >> > >> >> > > > > > > > on > > > > >> > >> >> > > > > > > > > > > > > "FLIP-49: > > > > >> > >> >> > > > > > > > > > > > > > > > > Unified > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Memory > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > Configuration > > > for > > > > >> > >> >> > > > TaskExecutors"[1], > > > > >> > >> >> > > > > > > where > > > > >> > >> >> > > > > > > > we > > > > >> > >> >> > > > > > > > > > > > > describe > > > > >> > >> >> > > > > > > > > > > > > > > how > > > > >> > >> >> > > > > > > > > > > > > > > > to > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > improve > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > > > memory > > > > >> > >> >> > > configurations. > > > > >> > >> >> > > > > The > > > > >> > >> >> > > > > > > > FLIP > > > > >> > >> >> > > > > > > > > > > > document > > > > >> > >> >> > > > > > > > > > > > > > is > > > > >> > >> >> > > > > > > > > > > > > > > > > mostly > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > based > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > on > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > an > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > early design > > > > "Memory > > > > >> > >> >> Management > > > > >> > >> >> > > and > > > > >> > >> >> > > > > > > > > > Configuration > > > > >> > >> >> > > > > > > > > > > > > > > > > Reloaded"[2] > > > > >> > >> >> > > > > > > > > > > > > > > > > > by > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > with updates > > > from > > > > >> > >> follow-up > > > > >> > >> >> > > > > discussions > > > > >> > >> >> > > > > > > > both > > > > >> > >> >> > > > > > > > > > > online > > > > >> > >> >> > > > > > > > > > > > > and > > > > >> > >> >> > > > > > > > > > > > > > > > > > offline. 
> > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP > > > > addresses > > > > >> > >> several > > > > >> > >> >> > > > > > shortcomings > > > > >> > >> >> > > > > > > of > > > > >> > >> >> > > > > > > > > > > current > > > > >> > >> >> > > > > > > > > > > > > > > (Flink > > > > >> > >> >> > > > > > > > > > > > > > > > > 1.9) > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > > > memory > > > > >> > >> >> > > configuration. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > Different > > > > >> > >> configuration > > > > >> > >> >> > for > > > > >> > >> >> > > > > > > Streaming > > > > >> > >> >> > > > > > > > > and > > > > >> > >> >> > > > > > > > > > > > Batch. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complex > > and > > > > >> > >> difficult > > > > >> > >> >> > > > > > configuration > > > > >> > >> >> > > > > > > of > > > > >> > >> >> > > > > > > > > > > RocksDB > > > > >> > >> >> > > > > > > > > > > > > in > > > > >> > >> >> > > > > > > > > > > > > > > > > > Streaming. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > > Complicated, > > > > >> > >> uncertain > > > > >> > >> >> and > > > > >> > >> >> > > > hard > > > > >> > >> >> > > > > to > > > > >> > >> >> > > > > > > > > > > understand. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key changes > to > > > > solve > > > > >> > the > > > > >> > >> >> > problems > > > > >> > >> >> > > > can > > > > >> > >> >> > > > > > be > > > > >> > >> >> > > > > > > > > > > summarized > > > > >> > >> >> > > > > > > > > > > > > as > > > > >> > >> >> > > > > > > > > > > > > > > > > follows. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Extend > > > memory > > > > >> > >> manager > > > > >> > >> >> to > > > > >> > >> >> > > also > > > > >> > >> >> > > > > > > account > > > > >> > >> >> > > > > > > > > for > > > > >> > >> >> > > > > > > > > > > > memory > > > > >> > >> >> > > > > > > > > > > > > > > usage > > > > >> > >> >> > > > > > > > > > > > > > > > > by > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > state > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > backends. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Modify > > how > > > > >> > >> TaskExecutor > > > > >> > >> >> > > memory > > > > >> > >> >> > > > > is > > > > >> > >> >> > > > > > > > > > > partitioned > > > > >> > >> >> > > > > > > > > > > > > > > > accounted > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > individual > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > memory > > > > >> reservations > > > > >> > >> and > > > > >> > >> >> > pools. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > Simplify > > > > memory > > > > >> > >> >> > > configuration > > > > >> > >> >> > > > > > > options > > > > >> > >> >> > > > > > > > > and > > > > >> > >> >> > > > > > > > > > > > > > > calculations > > > > >> > >> >> > > > > > > > > > > > > > > > > > > logics. 
> > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please find > > more > > > > >> > details > > > > >> > >> in > > > > >> > >> >> the > > > > >> > >> >> > > > FLIP > > > > >> > >> >> > > > > > wiki > > > > >> > >> >> > > > > > > > > > > document > > > > >> > >> >> > > > > > > > > > > > > [1]. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please note > > > that > > > > >> the > > > > >> > >> early > > > > >> > >> >> > > design > > > > >> > >> >> > > > > doc > > > > >> > >> >> > > > > > > [2] > > > > >> > >> >> > > > > > > > is > > > > >> > >> >> > > > > > > > > > out > > > > >> > >> >> > > > > > > > > > > > of > > > > >> > >> >> > > > > > > > > > > > > > > sync, > > > > >> > >> >> > > > > > > > > > > > > > > > > and > > > > >> > >> >> > > > > > > > > > > > > > > > > > it > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > is > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > appreciated > to > > > > have > > > > >> the > > > > >> > >> >> > > discussion > > > > >> > >> >> > > > in > > > > >> > >> >> > > > > > > this > > > > >> > >> >> > > > > > > > > > > mailing > > > > >> > >> >> > > > > > > > > > > > > list > > > > >> > >> >> > > > > > > > > > > > > > > > > > thread.) > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking > > forward > > > to > > > > >> your > > > > >> > >> >> > > feedbacks. > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> > > > > >> > > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > >> > >> >> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> > > > > >> > > > > > >> > > > > > > > > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> >> > > > > >> > >> > > > > > >> > >> > > > > >> > > > > > > >> > > > > > >> > > > > > > > > > > > > > > > |
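As a side note to the quoted exchange above about -XX:MaxDirectMemorySize: the JVM behaviour under discussion (direct ByteBuffers count against the limit, and their native memory is only returned once the buffer objects are garbage collected, possibly via a GC the JDK triggers itself when the limit is hit) can be reproduced with a few lines of plain Java. This is only an illustration of the JVM mechanism, not Flink code; the class name and chunk size are made up.

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class DirectMemoryDemo {

    public static void main(String[] args) {
        final int chunkSize = 16 * 1024 * 1024; // 16 MB per buffer
        final List<ByteBuffer> buffers = new ArrayList<>();
        try {
            while (true) {
                // Every direct allocation is accounted against the max direct memory limit.
                buffers.add(ByteBuffer.allocateDirect(chunkSize));
                System.out.println("allocated direct chunks: " + buffers.size());
            }
        } catch (OutOfMemoryError e) {
            // The buffers are still strongly referenced, so no GC can free them and the
            // allocation fails once the configured limit is exhausted.
            System.out.println("direct memory limit reached after "
                    + buffers.size() + " chunks: " + e.getMessage());
        }
    }
}

Running it with a small limit, e.g. java -XX:MaxDirectMemorySize=64m DirectMemoryDemo, fails after roughly limit/chunkSize allocations; without the flag the limit defaults to roughly the maximum heap size, which is the "too high to be reached" case described above.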
@Andrey,
If we assume dynamic slot allocation, then DataSet tasks should always request slots with unknown resource profiles. The allocated slots should then have 1/n of the on-heap managed memory and 1/n of the off-heap managed memory of the task executor, where n is the configured 'numberOfSlot'. Given that DataSet operators only allocate segments (no reservation) and do not care whether the segments are on-heap or off-heap, I think it makes sense to wrap both pools for DataSet operators, with respect to the on-heap/off-heap quota of the allocated slot.

I'm not sure whether we want to create one memory manager per slot. I think we can limit the quota of each slot with one memory manager per task executor. Given that dynamic slot allocation dynamically creates and destroys slots, I think having one memory manager per task executor may save the overhead of frequently creating and destroying memory managers, and provide the chance to reuse memory segments across slots.

Thank you~
Xintong Song

On Fri, Sep 13, 2019 at 11:47 PM Andrey Zagrebin <[hidden email]> wrote:
> Hi Xintong,
>
> True, there would be no regression if only one type of memory is configured. This can be a problem only for the old jobs running in a newly configured cluster.
>
> About the pool type precedence, in general it should not matter for the users which type the segments have. The first implementation can be just to pull from any pool, e.g. empty one pool first and then the other, or some other random pulling. This might be a problem if we mix segment allocations and reservation of memory chunks from the same memory manager. The reservation will usually be for a certain type of memory, so the task will probably also have to decide from which pool to allocate the segments.
>
> I would suggest we create a memory manager per slot and give it the memory limit of the slot; then we do not have this kind of mixed operation, because DataSet/Batch jobs need only segment memory allocations and streaming jobs need only memory chunks for state backends, as I understand the current plan. I would suggest we look at it if we have the mixed operations at some point and it becomes a problem.
>
> Thanks,
> Andrey
>
> On Fri, Sep 13, 2019 at 5:24 PM Andrey Zagrebin <[hidden email]> wrote:
> >
> > ---------- Forwarded message ---------
> > From: Xintong Song <[hidden email]>
> > Date: Thu, Sep 12, 2019 at 4:21 AM
> > Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors
> > To: dev <[hidden email]>
> >
> > Hi Andrey,
> >
> > Thanks for bringing this up.
> >
> > If I understand correctly, this issue only occurs where the cluster is configured with both on-heap and off-heap memory. There should be no regression for clusters configured in the old way (either all on-heap or all off-heap).
> >
> > I also agree that it would be good if the DataSet API jobs can use both memory types. The only question I can see is that, from which pool (heap / off-heap) should we allocate memory for DataSet API operators? Do we always prioritize one pool over the other? Or do we always prioritize the pool with more available memory left?
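To make the idea of wrapping both pools per slot, mentioned in the reply above, a bit more concrete, here is a rough sketch. The names (SlotMemoryView, SegmentPool) and the byte[] stand-in for memory segments are illustrative assumptions only; this is not Flink's actual MemoryManager API, just one possible shape of a view that enforces a slot's 1/n share of each memory type while letting a task draw segments from either pool.

import java.util.Optional;

final class SlotMemoryView {

    /** Simplified stand-in for a TM-wide pool of fixed-size memory segments. */
    interface SegmentPool {
        Optional<byte[]> tryAllocateSegment();
    }

    private final SegmentPool onHeapPool;
    private final SegmentPool offHeapPool;

    private final int onHeapQuota;   // slot's share: total on-heap segments / numberOfSlots
    private final int offHeapQuota;  // slot's share: total off-heap segments / numberOfSlots

    private int onHeapAllocated;
    private int offHeapAllocated;

    SlotMemoryView(SegmentPool onHeapPool, int onHeapQuota,
                   SegmentPool offHeapPool, int offHeapQuota) {
        this.onHeapPool = onHeapPool;
        this.onHeapQuota = onHeapQuota;
        this.offHeapPool = offHeapPool;
        this.offHeapQuota = offHeapQuota;
    }

    /** Allocates from whichever pool still has quota left; the caller does not care which. */
    synchronized Optional<byte[]> allocateSegment() {
        if (onHeapAllocated < onHeapQuota) {
            Optional<byte[]> segment = onHeapPool.tryAllocateSegment();
            if (segment.isPresent()) {
                onHeapAllocated++;
                return segment;
            }
        }
        if (offHeapAllocated < offHeapQuota) {
            Optional<byte[]> segment = offHeapPool.tryAllocateSegment();
            if (segment.isPresent()) {
                offHeapAllocated++;
                return segment;
            }
        }
        return Optional.empty(); // the slot has used up its share of both memory types
    }
}

Such a view could be backed either by per-slot memory managers or by a single task-executor-wide memory manager that only tracks per-slot counters, which is the trade-off discussed in the messages above and below.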
> > Thank you~
> > Xintong Song
> >
> > On Tue, Sep 10, 2019 at 8:15 PM Andrey Zagrebin <[hidden email]> wrote:
> > > Hi All,
> > >
> > > While looking more into the implementation details of Step 4, we realized during some offline discussions with @Till that there can be a performance degradation for the batch DataSet API if we simply continue to pull memory from the pool according to the legacy option taskmanager.memory.off-heap.
> > >
> > > The reason is that if the cluster is newly configured to statically split heap/off-heap (not, like previously, either heap or off-heap), then the batch DataSet API jobs will be able to use only one type of memory, although it does not really matter where the memory segments come from and potentially batch jobs can use both. Also, currently the DataSet API does not result in absolute resource requirements, and its batch jobs will always get a default share of TM resources.
> > >
> > > The suggestion is that we let the batch tasks of the DataSet API pull from both pools according to their fair slot share of each memory type. For that we can have a special wrapping view of both pools which will pull segments (can be randomly) according to the slot limits. The view can wrap the TM-level memory pools and be given to the Task.
> > >
> > > Best,
> > > Andrey
> > >
> > > On Mon, Sep 2, 2019 at 1:35 PM Xintong Song <[hidden email]> wrote:
> > > > Thanks for your comments, Andrey.
> > > >
> > > > - Regarding Task Off-Heap Memory, I think you're right that the user needs to make sure that direct memory and native memory together used by the user code (external libs) do not exceed the configured value. As far as I can think of, there is nothing we can do about it.
> > > >
> > > > I addressed the rest of your comments in the wiki page [1]. Please take a look.
> > > >
> > > > Thank you~
> > > > Xintong Song
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > > >
> > > > On Mon, Sep 2, 2019 at 6:13 PM Andrey Zagrebin <[hidden email]> wrote:
> > > > > EDIT: sorry for the confusion, I meant taskmanager.memory.off-heap instead of taskmanager.memory.preallocate.
> > > > >
> > > > > On Mon, Sep 2, 2019 at 11:29 AM Andrey Zagrebin <[hidden email]> wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > @Xintong thanks a lot for driving the discussion.
> > > > > >
> > > > > > I also reviewed the FLIP and it looks quite good to me. Here are some comments:
> > > > > >
> > > > > > - One thing I wanted to discuss is the backwards compatibility with the previous user setups. We could list which options we plan to deprecate. From the first glance it looks possible to provide the same/similar behaviour for the setups relying on the deprecated options. E.g. setting taskmanager.memory.preallocate to true could override the new taskmanager.memory.managed.offheap-fraction to 1, etc. At the moment the FLIP just states that in some cases it may require re-configuring the cluster if migrated from prior versions. My suggestion is that we try to keep it backwards-compatible unless there is a good reason, like some major complication for the implementation.
> > > > > >
> > > > > > Also, a couple of smaller things:
> > > > > >
> > > > > > - I suggest we remove TaskExecutorSpecifics from the FLIP and leave some general wording atm, like 'data structure to store' or 'utility classes'. When the classes are implemented, we put in the concrete class names. This way we can avoid confusion and stale documents.
> > > > > >
> > > > > > - As I understand, if a user task uses native memory (not direct memory, but e.g. unsafe.allocate or from an external lib), there will be no explicit guard against exceeding 'task off-heap memory'. Then the user should still explicitly make sure that her/his direct buffer allocation plus any other memory usages do not exceed the value announced as 'task off-heap'. I guess there is not so much that can be done about it except mentioning it in the docs, similar to controlling the heap state backend.
> > > > > >
> > > > > > Thanks,
> > > > > > Andrey
> > > > > >
> > > > > > On Mon, Sep 2, 2019 at 10:07 AM Yang Wang <[hidden email]> wrote:
> > > > > > > I also agree that all the configuration should be calculated outside of the TaskManager. So a full configuration should be generated before the TaskManager is started. Overriding the calculated configurations through -D now seems better.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yang
> > > > > > >
> > > > > > > Xintong Song <[hidden email]> 于2019年9月2日周一 上午11:39写道:
> > > > > > > > I just updated the FLIP wiki page [1], with the following changes:
> > > > > > > > - Network memory uses JVM direct memory, and is accounted for when setting the JVM max direct memory size parameter.
> > > > > > > > - Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
> > > > > > > > - Remove 'supporting memory reservation' from the scope of this FLIP.
> > > > > > > > @till @stephan, please take another look and see if there are any other concerns.
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > > > > > > >
> > > > > > > > On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > Sorry for the late response.
> > > > > > > > >
> > > > > > > > > - Regarding the `TaskExecutorSpecifics` naming, let's discuss the details in the PR.
> > > > > > > > > - Regarding passing parameters into the `TaskExecutor`, +1 for using dynamic configuration at the moment, given that there are more questions to be discussed before we have a general framework for overriding configurations with ENV variables.
> > > > > > > > > - Regarding memory reservation, I double-checked with Yu and he will take care of it.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > > On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > What I forgot to add is that we could tackle specifying the configuration fully in an incremental way and that the full specification should be the desired end state.
> > > > > > > > > >
> > > > > > > > > > On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > > I think our goal should be that the configuration is fully specified when the process is started. By considering the internal calculation step to rather validate existing values and calculate missing ones, these two proposals shouldn't even conflict (given determinism).
> > > > > > > > > > >
> > > > > > > > > > > Since we don't want to change an existing flink-conf.yaml, specifying the full configuration would require passing in the options differently.
> > > > > > > > > > >
> > > > > > > > > > > One way could be the ENV variables approach. The reason why I'm trying to exclude this feature from the FLIP is that I believe it needs a bit more discussion. Just some questions which come to my mind: What would be the exact format (FLINK_KEY_NAME)? Would we support a dot separator which is supported by some systems (FLINK.KEY.NAME)? If we accept the dot separator, what would be the order of precedence if there are two ENV variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the precedence of an env variable vs. a dynamic configuration value specified via -D?
> > > > > > > > > > >
> > > > > > > > > > > Another approach could be to pass in the dynamic configuration values via `-Dkey=value` to the Flink process. For that we don't have to change anything because the functionality already exists.
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Till
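As an aside, the "calculate once outside the process, recompute deterministically inside" idea from the exchange above and below can be sketched in a few lines. The class name, the fractions and the chosen flags are illustrative assumptions only; this is not the TaskExecutorProcessUtility mentioned later in the thread, nor Flink's actual calculation.

import java.util.Locale;

final class MemoryBudgetSketch {

    static final class Budget {
        final long heapBytes;
        final long directBytes;          // e.g. network buffers + task off-heap
        final long managedOffHeapBytes;

        Budget(long heapBytes, long directBytes, long managedOffHeapBytes) {
            this.heapBytes = heapBytes;
            this.directBytes = directBytes;
            this.managedOffHeapBytes = managedOffHeapBytes;
        }
    }

    /** Deterministically splits the total process memory using example fractions. */
    static Budget derive(long totalProcessBytes) {
        long direct = (long) (totalProcessBytes * 0.10);          // assumed direct/network share
        long managedOffHeap = (long) (totalProcessBytes * 0.30);  // assumed managed share
        long heap = totalProcessBytes - direct - managedOffHeap;
        return new Budget(heap, direct, managedOffHeap);
    }

    /** The values that must be fixed at process startup end up as JVM flags. */
    static String jvmArgs(Budget b) {
        return String.format(Locale.ROOT,
                "-Xmx%d -Xms%d -XX:MaxDirectMemorySize=%d",
                b.heapBytes, b.heapBytes, b.directBytes);
    }

    public static void main(String[] args) {
        Budget budget = derive(4L * 1024 * 1024 * 1024); // 4 GB total process memory
        System.out.println(jvmArgs(budget));
        // Inside the started process, running derive() on the same configured input
        // reproduces the same numbers, so the remaining configuration values can be
        // recomputed (and validated) instead of having to be passed in explicitly.
    }
}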
> > > > > > > > > > > On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > I see. Under the assumption of strict determinism that should work.
> > > > > > > > > > > >
> > > > > > > > > > > > The original proposal had this point "don't compute inside the TM, compute outside and supply a full config", because that sounded more intuitive.
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > > > > My understanding was that before starting the Flink process we call a utility which calculates these values. I assume that this utility will do the calculation based on a set of configured values (process memory, flink memory, network memory etc.). Assuming that these values don't differ from the values with which the JVM is started, it should be possible to recompute them in the Flink process in order to set the values.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > When computing the values in the JVM process after it has started, how would you deal with values like Max Direct Memory, Metaspace size, native memory reservation (reduced heap size), etc.? All the values that are parameters to the JVM process and that need to be supplied at process startup?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > > > > > > Thanks for the clarification. I have some more comments:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - I would actually split computing the process memory requirements and storing the values into two things. E.g. one could name the former TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But we can discuss this on the PR since it's just a naming detail.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - Generally, I'm not opposed to making configuration values overridable by ENV variables. I think this is a very good idea and makes the configurability of Flink processes easier.
> > > > > > > > > > > > > > > However, I think that adding this functionality should not be part of this FLIP because it would simply widen the scope unnecessarily.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The reasons why I believe it is unnecessary are the following: For Yarn we already write a flink-conf.yaml which could be populated with the memory settings. For the other processes it should not make a difference whether the loaded Configuration is populated with the memory settings from ENV variables or by using TaskExecutorProcessUtility to compute the missing values from the loaded configuration. If the latter were not possible (wrong or missing configuration values), then we should not have been able to actually start the process in the first place.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - Concerning the memory reservation: I agree with you that we need the memory reservation functionality to make streaming jobs work with "managed" memory. However, w/o this functionality the whole FLIP would already bring a good amount of improvements to our users when running batch jobs. Moreover, by keeping the scope smaller we can complete the FLIP faster. Hence, I would propose to address the memory reservation functionality as a follow-up FLIP (which Yu is working on, if I'm not mistaken).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <[hidden email]> wrote:
> > > > > > > > > > > > > > > > Just to add my 2 cents.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Using environment variables to override the configuration for different taskmanagers is better. We do not need to generate a dedicated flink-conf.yaml for all taskmanagers. A common flink-conf.yaml and different environment variables are enough.
> > > > > >> > >> >> > > > > By reducing the distributed cached files, it > could > > > > make > > > > > >> > >> launching > > > > > >> > >> >> a > > > > > >> > >> >> > > > > taskmanager faster. > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > Stephan gives a good suggestion that we could > move > > > the > > > > > >> logic > > > > > >> > >> into > > > > > >> > >> >> > > > > "GlobalConfiguration.loadConfig()" method. > > > > > >> > >> >> > > > > Maybe the client could also benefit from this. > > > > Different > > > > > >> > users > > > > > >> > >> do > > > > > >> > >> >> not > > > > > >> > >> >> > > > have > > > > > >> > >> >> > > > > to export FLINK_CONF_DIR to update few config > > > options. > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > Best, > > > > > >> > >> >> > > > > Yang > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > Stephan Ewen <[hidden email]> 于2019年8月28日周三 > > > > 上午1:21写道: > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > One note on the Environment Variables and > > > > > Configuration > > > > > >> > >> >> discussion. > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > My understanding is that passed ENV variables > > are > > > > > added > > > > > >> to > > > > > >> > >> the > > > > > >> > >> >> > > > > > configuration in the > > > > > "GlobalConfiguration.loadConfig()" > > > > > >> > >> method > > > > > >> > >> >> (or > > > > > >> > >> >> > > > > > similar). > > > > > >> > >> >> > > > > > For all the code inside Flink, it looks like > the > > > > data > > > > > >> was > > > > > >> > in > > > > > >> > >> the > > > > > >> > >> >> > > config > > > > > >> > >> >> > > > > to > > > > > >> > >> >> > > > > > start with, just that the scripts that compute > > the > > > > > >> > variables > > > > > >> > >> can > > > > > >> > >> >> > pass > > > > > >> > >> >> > > > the > > > > > >> > >> >> > > > > > values to the process without actually needing > > to > > > > > write > > > > > >> a > > > > > >> > >> file. > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > For example the > > "GlobalConfiguration.loadConfig()" > > > > > >> method > > > > > >> > >> would > > > > > >> > >> >> > take > > > > > >> > >> >> > > > any > > > > > >> > >> >> > > > > > ENV variable prefixed with "flink" and add it > > as a > > > > > >> config > > > > > >> > >> key. > > > > > >> > >> >> > > > > > "flink_taskmanager_memory_size=2g" would > become > > > > > >> > >> >> > > > "taskmanager.memory.size: > > > > > >> > >> >> > > > > > 2g". > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song < > > > > > >> > >> >> > [hidden email]> > > > > > >> > >> >> > > > > > wrote: > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > Thanks for the comments, Till. > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > I've also seen your comments on the wiki > page, > > > but > > > > > >> let's > > > > > >> > >> keep > > > > > >> > >> >> the > > > > > >> > >> >> > > > > > > discussion here. > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > - Regarding 'TaskExecutorSpecifics', how do > > you > > > > > think > > > > > >> > about > > > > > >> > >> >> > naming > > > > > >> > >> >> > > it > > > > > >> > >> >> > > > > > > 'TaskExecutorResourceSpecifics'. 
> > > > > >> > >> >> > > > > > > - Regarding passing memory configurations > into > > > > task > > > > > >> > >> executors, > > > > > >> > >> >> > I'm > > > > > >> > >> >> > > in > > > > > >> > >> >> > > > > > favor > > > > > >> > >> >> > > > > > > of do it via environment variables rather > than > > > > > >> > >> configurations, > > > > > >> > >> >> > with > > > > > >> > >> >> > > > the > > > > > >> > >> >> > > > > > > following two reasons. > > > > > >> > >> >> > > > > > > - It is easier to keep the memory options > > once > > > > > >> > calculate > > > > > >> > >> >> not to > > > > > >> > >> >> > > be > > > > > >> > >> >> > > > > > > changed with environment variables rather > than > > > > > >> > >> configurations. > > > > > >> > >> >> > > > > > > - I'm not sure whether we should write the > > > > > >> > configuration > > > > > >> > >> in > > > > > >> > >> >> > > startup > > > > > >> > >> >> > > > > > > scripts. Writing changes into the > > configuration > > > > > files > > > > > >> > when > > > > > >> > >> >> > running > > > > > >> > >> >> > > > the > > > > > >> > >> >> > > > > > > startup scripts does not sounds right to me. > > Or > > > we > > > > > >> could > > > > > >> > >> make > > > > > >> > >> >> a > > > > > >> > >> >> > > copy > > > > > >> > >> >> > > > of > > > > > >> > >> >> > > > > > > configuration files per flink cluster, and > > make > > > > the > > > > > >> task > > > > > >> > >> >> executor > > > > > >> > >> >> > > to > > > > > >> > >> >> > > > > load > > > > > >> > >> >> > > > > > > from the copy, and clean up the copy after > the > > > > > >> cluster is > > > > > >> > >> >> > shutdown, > > > > > >> > >> >> > > > > which > > > > > >> > >> >> > > > > > > is complicated. (I think this is also what > > > Stephan > > > > > >> means > > > > > >> > in > > > > > >> > >> >> his > > > > > >> > >> >> > > > comment > > > > > >> > >> >> > > > > > on > > > > > >> > >> >> > > > > > > the wiki page?) > > > > > >> > >> >> > > > > > > - Regarding reserving memory, I think this > > > change > > > > > >> should > > > > > >> > be > > > > > >> > >> >> > > included > > > > > >> > >> >> > > > in > > > > > >> > >> >> > > > > > > this FLIP. I think a big part of motivations > > of > > > > this > > > > > >> FLIP > > > > > >> > >> is > > > > > >> > >> >> to > > > > > >> > >> >> > > unify > > > > > >> > >> >> > > > > > > memory configuration for streaming / batch > and > > > > make > > > > > it > > > > > >> > easy > > > > > >> > >> >> for > > > > > >> > >> >> > > > > > configuring > > > > > >> > >> >> > > > > > > rocksdb memory. If we don't support memory > > > > > >> reservation, > > > > > >> > >> then > > > > > >> > >> >> > > > streaming > > > > > >> > >> >> > > > > > jobs > > > > > >> > >> >> > > > > > > cannot use managed memory (neither on-heap > or > > > > > >> off-heap), > > > > > >> > >> which > > > > > >> > >> >> > > makes > > > > > >> > >> >> > > > > this > > > > > >> > >> >> > > > > > > FLIP incomplete. > > > > > >> > >> >> > > > > > > - Regarding network memory, I think you are > > > > right. I > > > > > >> > think > > > > > >> > >> we > > > > > >> > >> >> > > > probably > > > > > >> > >> >> > > > > > > don't need to change network stack from > using > > > > direct > > > > > >> > >> memory to > > > > > >> > >> >> > > using > > > > > >> > >> >> > > > > > unsafe > > > > > >> > >> >> > > > > > > native memory. 
Network memory size is > > > > deterministic, > > > > > >> > >> cannot be > > > > > >> > >> >> > > > reserved > > > > > >> > >> >> > > > > > as > > > > > >> > >> >> > > > > > > managed memory does, and cannot be > overused. I > > > > think > > > > > >> it > > > > > >> > >> also > > > > > >> > >> >> > works > > > > > >> > >> >> > > if > > > > > >> > >> >> > > > > we > > > > > >> > >> >> > > > > > > simply keep using direct memory for network > > and > > > > > >> include > > > > > >> > it > > > > > >> > >> in > > > > > >> > >> >> jvm > > > > > >> > >> >> > > max > > > > > >> > >> >> > > > > > > direct memory size. > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > Thank you~ > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > Xintong Song > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till > Rohrmann > > < > > > > > >> > >> >> > > [hidden email]> > > > > > >> > >> >> > > > > > > wrote: > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > Hi Xintong, > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > thanks for addressing the comments and > > adding > > > a > > > > > more > > > > > >> > >> >> detailed > > > > > >> > >> >> > > > > > > > implementation plan. I have a couple of > > > comments > > > > > >> > >> concerning > > > > > >> > >> >> the > > > > > >> > >> >> > > > > > > > implementation plan: > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > - The name `TaskExecutorSpecifics` is not > > > really > > > > > >> > >> >> descriptive. > > > > > >> > >> >> > > > > Choosing > > > > > >> > >> >> > > > > > a > > > > > >> > >> >> > > > > > > > different name could help here. > > > > > >> > >> >> > > > > > > > - I'm not sure whether I would pass the > > memory > > > > > >> > >> >> configuration to > > > > > >> > >> >> > > the > > > > > >> > >> >> > > > > > > > TaskExecutor via environment variables. I > > > think > > > > it > > > > > >> > would > > > > > >> > >> be > > > > > >> > >> >> > > better > > > > > >> > >> >> > > > to > > > > > >> > >> >> > > > > > > write > > > > > >> > >> >> > > > > > > > it into the configuration one uses to > start > > > the > > > > TM > > > > > >> > >> process. > > > > > >> > >> >> > > > > > > > - If possible, I would exclude the memory > > > > > >> reservation > > > > > >> > >> from > > > > > >> > >> >> this > > > > > >> > >> >> > > > FLIP > > > > > >> > >> >> > > > > > and > > > > > >> > >> >> > > > > > > > add this as part of a dedicated FLIP. > > > > > >> > >> >> > > > > > > > - If possible, then I would exclude > changes > > to > > > > the > > > > > >> > >> network > > > > > >> > >> >> > stack > > > > > >> > >> >> > > > from > > > > > >> > >> >> > > > > > > this > > > > > >> > >> >> > > > > > > > FLIP. Maybe we can simply say that the > > direct > > > > > memory > > > > > >> > >> needed > > > > > >> > >> >> by > > > > > >> > >> >> > > the > > > > > >> > >> >> > > > > > > network > > > > > >> > >> >> > > > > > > > stack is the framework direct memory > > > > requirement. > > > > > >> > >> Changing > > > > > >> > >> >> how > > > > > >> > >> >> > > the > > > > > >> > >> >> > > > > > memory > > > > > >> > >> >> > > > > > > > is allocated can happen in a second step. 
On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

I just updated the FLIP document on wiki [1], with the following changes.

- Removed the open question regarding MemorySegment allocation. As discussed, we exclude this topic from the scope of this FLIP.
- Updated the content about the JVM direct memory parameter according to recent discussions, and moved the other options to "Rejected Alternatives" for the moment.
- Added implementation steps.

Thank you~
Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <[hidden email]> wrote:

@Xintong: Concerning "wait for memory users before task dispose and memory release": I agree, that's how it should be. Let's try it out.
@Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating direct memory buffer": There seems to be pretty elaborate logic to free buffers when allocating new ones. See
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643

@Till: Maybe. If we assume that the JVM default works (like going with option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think it should be okay to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" even if we use RocksDB. That is a big if, though, I honestly have no idea :D Would be good to understand this, though, because this would affect option (2) and option (1.2).
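As a rough, stand-alone illustration of the Bits.java behaviour referenced above (this is not Flink code; the sizes are chosen only to fit just under a 64 MB limit): dropping the references to direct buffers is usually enough for a later allocation to succeed, because reserveMemory triggers reference processing / System.gc() and retries before throwing an OutOfMemoryError. Run it with, e.g., -XX:MaxDirectMemorySize=64m:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    public class DirectMemoryReclaimDemo {
        public static void main(String[] args) {
            List<ByteBuffer> buffers = new ArrayList<>();
            for (int i = 0; i < 6; i++) {
                // 6 x 10 MB, staying below a 64 MB -XX:MaxDirectMemorySize limit.
                buffers.add(ByteBuffer.allocateDirect(10 * 1024 * 1024));
            }
            // Drop all references; the native memory is not freed immediately.
            buffers.clear();
            // This would exceed the 64 MB limit if the old buffers were still
            // reserved. It usually succeeds because the JVM frees the
            // unreferenced buffers (reference processing + GC) and retries
            // before giving up with an OutOfMemoryError.
            ByteBuffer again = ByteBuffer.allocateDirect(30 * 1024 * 1024);
            System.out.println("re-allocation succeeded: " + again.capacity() + " bytes");
        }
    }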
On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <[hidden email]> wrote:

Thanks for the inputs, Jingsong. Let me try to summarize your points. Please correct me if I'm wrong.

- Memory consumers should always avoid returning memory segments to the memory manager while there are still un-cleaned structures / threads that may use the memory. Otherwise, it would cause serious problems by having multiple consumers trying to use the same memory segment.
- The JVM does not wait for GC when allocating a direct memory buffer. Therefore, even if we set a proper max direct memory size limit, we may still encounter a direct memory OOM if the GC cleans memory more slowly than the direct memory is allocated.

Am I understanding this correctly?

Thank you~
Xintong Song
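A minimal sketch of the first point, i.e. waiting for all asynchronous memory users to quiesce before segments are handed back for reuse. The names here (MemoryReleaser, MemoryPool) are hypothetical and only illustrate the ordering constraint, not Flink's actual interfaces:

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.TimeUnit;

    class MemoryReleaser {

        interface MemoryPool {
            void release(byte[] segment);
        }

        /** Releases segments only after all spill/sort threads have terminated. */
        void releaseAfterQuiescence(ExecutorService spillThreads,
                                    List<byte[]> segments,
                                    MemoryPool pool) throws InterruptedException {
            spillThreads.shutdown();
            if (!spillThreads.awaitTermination(30, TimeUnit.SECONDS)) {
                spillThreads.shutdownNow();
                spillThreads.awaitTermination(30, TimeUnit.SECONDS);
            }
            // Only now can no asynchronous writer touch the segments anymore,
            // so it is safe to make them visible to other consumers.
            segments.forEach(pool::release);
        }
    }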
On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <[hidden email].invalid> wrote:

Hi Stephan:

About option 2:
If additional threads are not cleanly shut down before we exit the task: in the current case of memory reuse, the task has already freed up the memory it uses. If this memory is then used by other tasks while asynchronous threads of the exited task may still be writing, there will be concurrency safety problems, and even errors in user computing results. So I think this is a serious and intolerable bug; no matter which option we pick, it should be avoided.

About direct memory cleaned by GC:
I don't think it is a good idea. I've encountered many situations where GC came too late and caused a DirectMemory OOM. Releasing and allocating DirectMemory depends on the type of user job, which is often beyond our control.

Best,
Jingsong Lee

------------------------------------------------------------------
From: Stephan Ewen <[hidden email]>
Send Time: Monday, Aug 19, 2019, 15:56
To: dev <[hidden email]>
Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

My main concern with option 2 (manually releasing memory) is that segfaults in the JVM send off all sorts of alarms on user ends. So we need to guarantee that this never happens.
The trickiness is in tasks that use data structures / algorithms with additional threads, like hash table spill/read and sorting threads. We need to ensure that these shut down cleanly before we can exit the task. I am not sure that we have that guaranteed already; that's why option 1.1 seemed simpler to me.

On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <[hidden email]> wrote:

Thanks for the comments, Stephan. Summarizing it this way really makes things easier to understand.

I'm in favor of option 2, at least for the moment. I think it is not that difficult to keep the memory manager segfault safe, as long as we always de-allocate the memory segment when it is released by the memory consumers. Only if a memory consumer continues using the buffer of a memory segment after releasing it do we run into trouble, and in that case we do want the job to fail so we detect the memory leak early.
For option 1.2, I don't think it is a good idea. Not only may the assumption (that regular GC is enough to clean direct buffers) not always hold, it also makes it harder to find problems in cases of memory overuse. E.g., the user configured some direct memory for the user libraries. If a library actually uses more direct memory than configured, which cannot be cleaned by GC because it is still in use, this may lead to overuse of the total container memory. In that case, if it didn't hit the JVM default max direct memory limit, we cannot get a direct memory OOM and it becomes very hard to understand which part of the configuration needs to be updated.

Option 1.1 has a similar problem to 1.2 if the exceeded direct memory does not reach the max direct memory limit specified by the dedicated parameter. I think it is slightly better than 1.2, only because we can tune the parameter.

Thank you~
Xintong Song
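To illustrate why option 2 can at least be made fail-fast for sequential misuse, here is a minimal sketch (not Flink's actual MemorySegment): the native memory is freed eagerly on release and the segment is invalidated, so a later access from the same thread fails with a Java exception instead of a segfault. As noted above, a concurrent writer racing with the release can still crash the JVM, which is exactly why the extra threads must be shut down first:

    class UnsafeSegment {

        private static final sun.misc.Unsafe UNSAFE = getUnsafe();

        private final long size;
        private long address; // 0 once released

        UnsafeSegment(long size) {
            this.size = size;
            this.address = UNSAFE.allocateMemory(size);
        }

        void putLong(long offset, long value) {
            checkValid(offset, 8);
            UNSAFE.putLong(address + offset, value);
        }

        /** Frees the native memory eagerly and invalidates the segment. */
        void free() {
            UNSAFE.freeMemory(address);
            address = 0;
        }

        private void checkValid(long offset, int len) {
            if (address == 0) {
                // Fail fast on use-after-release (catches single-threaded misuse
                // only; a concurrent writer could still segfault).
                throw new IllegalStateException("segment has already been released");
            }
            if (offset < 0 || offset + len > size) {
                throw new IndexOutOfBoundsException();
            }
        }

        private static sun.misc.Unsafe getUnsafe() {
            try {
                java.lang.reflect.Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                return (sun.misc.Unsafe) f.get(null);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }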
On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <[hidden email]> wrote:

About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it a bit differently. We have the following two options:

(1) We let MemorySegments be de-allocated by the GC. That makes it segfault safe. But then we need a way to trigger GC in case de-allocation and re-allocation of a bunch of segments happens quickly, which is often the case during batch scheduling or task restart.
  - The "-XX:MaxDirectMemorySize" parameter (option 1.1) is one way to do this.
  - Another way could be dedicated bookkeeping in the MemoryManager (option 1.2), so that this is a number independent of the "-XX:MaxDirectMemorySize" parameter.
(2) We manually allocate and de-allocate the memory for the MemorySegments (option 2). That way we need not worry about triggering GC via some threshold or bookkeeping, but it is harder to prevent segfaults. We need to be very careful about when we release the memory segments (only in the cleanup phase of the main thread).

If we go with option 1.1, we probably need to set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory" and have "direct_memory" as a separate reserved memory pool. Because if we just set "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + jvm_overhead", then there will be times when that entire memory is allocated by direct buffers and we have nothing left for the JVM overhead. So we either need a way to compensate for that (again some safety-margin cutoff value) or we will exceed the container memory.
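Spelled out as code, the budgeting rule for option 1.1 is just the sum below; the class, method, and parameter names are made up for illustration and are not the FLIP's option names:

    class DirectMemoryFlag {

        /** -XX:MaxDirectMemorySize = off-heap managed memory + dedicated direct memory pool. */
        static String maxDirectMemorySizeArg(long offHeapManagedBytes, long directMemoryBytes) {
            return "-XX:MaxDirectMemorySize=" + (offHeapManagedBytes + directMemoryBytes);
        }

        public static void main(String[] args) {
            // e.g. 512 MB off-heap managed memory + 64 MB dedicated direct memory
            System.out.println(maxDirectMemorySizeArg(512L << 20, 64L << 20));
            // prints -XX:MaxDirectMemorySize=603979776
        }
    }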
If we go with option 1.2, we need to be aware that it takes elaborate logic to push recycling of direct buffers without always triggering a full GC.

My first guess is that the options will be easiest to do in the following order:

- Option 1.1 with a dedicated direct_memory parameter, as discussed above. We would need to find a way to set the direct_memory parameter by default. We could start with 64 MB and see how it goes in practice. One danger I see is that setting this too low can cause a bunch of additional GCs compared to before (we need to watch this carefully).
- Option 2. It is actually quite simple to implement; we could try how segfault safe we are at the moment.
- Option 1.2: We would not touch the "-XX:MaxDirectMemorySize" parameter at all and assume that all the direct memory allocations that the JVM and Netty do are infrequent enough to be cleaned up fast enough through regular GC. I am not sure if that is a valid assumption, though.

Best,
Stephan

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we could avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.
Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which are exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~
Xintong Song
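The distinction drawn here can be illustrated with a small stand-alone snippet (not Flink code): memory obtained via ByteBuffer.allocateDirect is accounted against -XX:MaxDirectMemorySize, while memory obtained via Unsafe.allocateMemory is raw native memory that the JVM does not track, so it is only bounded by the container / process limit and must be freed manually:

    import java.nio.ByteBuffer;

    public class DirectVsNative {
        public static void main(String[] args) throws Exception {
            // Counted against the JVM's direct memory limit.
            ByteBuffer tracked = ByteBuffer.allocateDirect(16 << 20);

            // Not counted; this is how managed and network memory behave in the proposal.
            java.lang.reflect.Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
            long address = unsafe.allocateMemory(16 << 20);
            try {
                unsafe.putLong(address, 42L);
            } finally {
                unsafe.freeMemory(address); // native memory must be freed explicitly
            }
            System.out.println("tracked direct buffer capacity: " + tracked.capacity());
        }
    }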
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till. Let's say we have the following scenario:
Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then:
- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this reduces the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM.
  There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
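Writing the arithmetic of the example out (using the round numbers from the discussion, with 1 GB counted as 1000 MB; nothing here is a recommended default):

    public class BudgetExample {
        public static void main(String[] args) {
            long totalProcessMb = 1000;  // total process memory, as in the example
            long directBudgetMb = 200;   // task off-heap + JVM overhead

            // Alternatives 2 and 3 before any tuning: 800 MB for the other pools.
            System.out.println("other pools: " + (totalProcessMb - directBudgetMb) + " MB");

            // Alternative 2 after the user bumps the direct budget to avoid OOMs:
            long bumpedDirectMb = 250;
            System.out.println("other pools after bump: " + (totalProcessMb - bumpedDirectMb) + " MB");

            // Alternative 3 keeps the other pools at 800 MB, but may exceed the
            // container limit if actual direct usage goes above 200 MB while the
            // reserved native pools are also fully used.
        }
    }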
Thank you~
Xintong Song

On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternatives 2 and 3 w.r.t. memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory plus JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3?
If alternative 3 strictly sets a higher max direct memory size and we only use a little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is about setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

> Memory Calculation

I agree with Xintong.
For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 22:07 Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP. This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimal involvement in how memory consumers use them.
About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking over-use of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get a direct OOM, so they may not configure the two options aggressively high.
But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good. Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up.
Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend. If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation as what we have with the cutoff-ratio.
So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides.
E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some down sides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another down side is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
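(For illustration only: the small Java sketch below is not Flink code and makes no claim about the FLIP's implementation. It just contrasts the two allocation paths discussed here, assuming a JDK where sun.misc.Unsafe is accessible, e.g. Java 8: direct ByteBuffers are accounted against -XX:MaxDirectMemorySize and only released via GC, whereas Unsafe-allocated native memory is neither limited by that flag nor collected, so it must be freed explicitly.)

import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import sun.misc.Unsafe;

// Sketch: direct buffers vs. raw native memory.
public class DirectVsNativeMemory {

    public static void main(String[] args) throws Exception {
        // Counted against -XX:MaxDirectMemorySize. If the limit would be exceeded,
        // the JVM first tries a GC and eventually throws
        // "OutOfMemoryError: Direct buffer memory".
        ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        System.out.println("direct buffer capacity: " + direct.capacity());

        // Plain native memory: invisible to -XX:MaxDirectMemorySize and to the GC,
        // so it has to be released explicitly with freeMemory().
        Unsafe unsafe = getUnsafe();
        long address = unsafe.allocateMemory(64L * 1024 * 1024);
        try {
            unsafe.setMemory(address, 64L * 1024 * 1024, (byte) 0);
        } finally {
            unsafe.freeMemory(address);
        }
    }

    // sun.misc.Unsafe is not publicly constructible; grab the singleton via reflection.
    private static Unsafe getUnsafe() throws Exception {
        Field field = Unsafe.class.getDeclaredField("theUnsafe");
        field.setAccessible(true);
        return (Unsafe) field.get(null);
    }
}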
Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client side checking, because for a standalone cluster the TaskManagers on different machines may have different configurations and the client does not see that.

What do you think?

Thank you~

Xintong Song
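(For illustration only: the sketch below is hypothetical code, not part of the FLIP; all names and the set of pools are assumptions. It shows the kind of client-side sanity check being discussed, i.e. summing the explicitly configured pools and failing fast if they cannot fit into the total process memory.)

// Hypothetical client-side sanity check; not Flink's actual validation logic.
public final class MemoryConfigCheck {

    public static void checkConsistency(
            long totalProcessMemoryBytes,
            long heapBytes,
            long managedBytes,
            long networkBytes,
            long taskOffHeapBytes,
            long metaspaceBytes,
            long jvmOverheadBytes) {

        long sum = heapBytes + managedBytes + networkBytes
                + taskOffHeapBytes + metaspaceBytes + jvmOverheadBytes;

        if (sum > totalProcessMemoryBytes) {
            // Fail before deploying anything, instead of failing in the flink master.
            throw new IllegalArgumentException("Sum of configured memory pools ("
                    + sum + " bytes) exceeds the total process memory ("
                    + totalProcessMemoryBytes + " bytes).");
        }
    }

    private MemoryConfigCheck() {
    }
}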
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi xintong,

Thanks for your detailed proposal. After all the memory configurations are introduced, it will be more powerful to control the flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I think we could not set -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.
- Memory Calculation

If the sum of the fine-grained memory (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

Xintong Song <[hidden email]> 于2019年8月7日周三 下午10:14写道:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations.
The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve the problems can be summarized as follows.
- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned into individually accounted memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.
Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory.
The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case.
Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then:

- Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
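(For illustration only: the tiny Java snippet below is hypothetical and not Flink's startup logic; in particular the 100MB/100MB split of the 200MB direct budget is an assumption. It just spells out the arithmetic of the example and how each alternative would translate into a JVM flag.)

// Back-of-the-envelope version of the 1GB example above.
public class MaxDirectMemoryExample {

    public static void main(String[] args) {
        long totalProcessMb = 1024;
        long taskOffHeapMb = 100;  // assumed split of the 200MB direct budget
        long jvmOverheadMb = 100;

        long directBudgetMb = taskOffHeapMb + jvmOverheadMb;   // 200MB
        long otherPoolsMb = totalProcessMb - directBudgetMb;   // 800MB for heap, metaspace, managed, network

        // Alternative 2: cap direct memory exactly at the configured budget.
        String alternative2 = "-XX:MaxDirectMemorySize=" + directBudgetMb + "m";

        // Alternative 3: leave the cap effectively unlimited, e.g. 1TB.
        String alternative3 = "-XX:MaxDirectMemorySize=1024g";

        System.out.println("other memory pools: " + otherPoolsMb + "m");
        System.out.println("alternative 2: " + alternative2);
        System.out.println("alternative 3: " + alternative3);
    }
}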
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.
> > > > If > > > > > >> > >> >> alternative 3 > > > > > >> > >> >> > > > > > strictly > > > > > >> > >> >> > > > > > > > > sets a > > > > > >> > >> >> > > > > > > > > > > > > higher > > > > > >> > >> >> > > > > > > > > > > > > > > max > > > > > >> > >> >> > > > > > > > > > > > > > > > > > direct memory size and > > we > > > > use > > > > > >> only > > > > > >> > >> >> little, > > > > > >> > >> >> > > > then I > > > > > >> > >> >> > > > > > > would > > > > > >> > >> >> > > > > > > > > > > expect > > > > > >> > >> >> > > > > > > > > > > > > that > > > > > >> > >> >> > > > > > > > > > > > > > > > > > alternative 3 results > in > > > > > memory > > > > > >> > under > > > > > >> > >> >> > > > > utilization. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > Cheers, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > Till > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 > at > > > 4:19 > > > > > PM > > > > > >> > Yang > > > > > >> > >> >> Wang < > > > > > >> > >> >> > > > > > > > > > > > [hidden email] > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > wrote: > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > Hi xintong,till > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > Native and Direct > > > Memory > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > My point is setting > a > > > very > > > > > >> large > > > > > >> > >> max > > > > > >> > >> >> > direct > > > > > >> > >> >> > > > > > memory > > > > > >> > >> >> > > > > > > > size > > > > > >> > >> >> > > > > > > > > > > when > > > > > >> > >> >> > > > > > > > > > > > we > > > > > >> > >> >> > > > > > > > > > > > > > do > > > > > >> > >> >> > > > > > > > > > > > > > > > not > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > differentiate direct > > and > > > > > >> native > > > > > >> > >> >> memory. > > > > > >> > >> >> > If > > > > > >> > >> >> > > > the > > > > > >> > >> >> > > > > > > direct > > > > > >> > >> >> > > > > > > > > > > > > > > > memory,including > > > > > >> > >> >> > > > > > > > > > > > > > > > > > user > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > direct memory and > > > > framework > > > > > >> > direct > > > > > >> > >> >> > > > memory,could > > > > > >> > >> >> > > > > > be > > > > > >> > >> >> > > > > > > > > > > calculated > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > correctly,then > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > i am in favor of > > setting > > > > > >> direct > > > > > >> > >> memory > > > > > >> > >> >> > with > > > > > >> > >> >> > > > > fixed > > > > > >> > >> >> > > > > > > > > value. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > Memory Calculation > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > I agree with > xintong. 
> > > For > > > > > Yarn > > > > > >> > and > > > > > >> > >> >> k8s,we > > > > > >> > >> >> > > > need > > > > > >> > >> >> > > > > to > > > > > >> > >> >> > > > > > > > check > > > > > >> > >> >> > > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > > memory > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > configurations in > > client > > > > to > > > > > >> avoid > > > > > >> > >> >> > > submitting > > > > > >> > >> >> > > > > > > > > successfully > > > > > >> > >> >> > > > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > > > > failing > > > > > >> > >> >> > > > > > > > > > > > > > > > > in > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > the flink master. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > Best, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > Yang > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > Xintong Song < > > > > > >> > >> [hidden email] > > > > > >> > >> >> > > > > >于2019年8月13日 > > > > > >> > >> >> > > > > > > > > > 周二22:07写道: > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > Thanks for > replying, > > > > Till. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > About > > MemorySegment, I > > > > > think > > > > > >> > you > > > > > >> > >> are > > > > > >> > >> >> > > right > > > > > >> > >> >> > > > > that > > > > > >> > >> >> > > > > > > we > > > > > >> > >> >> > > > > > > > > > should > > > > > >> > >> >> > > > > > > > > > > > not > > > > > >> > >> >> > > > > > > > > > > > > > > > include > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > this > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > issue in the scope > > of > > > > this > > > > > >> > FLIP. > > > > > >> > >> >> This > > > > > >> > >> >> > > FLIP > > > > > >> > >> >> > > > > > should > > > > > >> > >> >> > > > > > > > > > > > concentrate > > > > > >> > >> >> > > > > > > > > > > > > > on > > > > > >> > >> >> > > > > > > > > > > > > > > > how > > > > > >> > >> >> > > > > > > > > > > > > > > > > to > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > configure memory > > pools > > > > for > > > > > >> > >> >> > TaskExecutors, > > > > > >> > >> >> > > > > with > > > > > >> > >> >> > > > > > > > > minimum > > > > > >> > >> >> > > > > > > > > > > > > > > involvement > > > > > >> > >> >> > > > > > > > > > > > > > > > on > > > > > >> > >> >> > > > > > > > > > > > > > > > > > how > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > memory consumers > use > > > it. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > About direct > > memory, I > > > > > think > > > > > >> > >> >> > alternative > > > > > >> > >> >> > > 3 > > > > > >> > >> >> > > > > may > > > > > >> > >> >> > > > > > > not > > > > > >> > >> >> > > > > > > > > > having > > > > > >> > >> >> > > > > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > > > same > > > > > >> > >> >> > > > > > > > > > > > > > > > > over > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > reservation issue > > that > > > > > >> > >> alternative 2 > > > > > >> > >> >> > > does, > > > > > >> > >> >> > > > > but > > > > > >> > >> >> > > > > > at > > > > > >> > >> >> > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > cost > > > > > >> > >> >> > > > > > > > > > > > of > > > > > >> > >> >> > > > > > > > > > > > > > > risk > > > > > >> > >> >> > > > > > > > > > > > > > > > of > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > over > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > using memory at > the > > > > > >> container > > > > > >> > >> level, > > > > > >> > >> >> > > which > > > > > >> > >> >> > > > is > > > > > >> > >> >> > > > > > not > > > > > >> > >> >> > > > > > > > > good. > > > > > >> > >> >> > > > > > > > > > > My > > > > > >> > >> >> > > > > > > > > > > > > > point > > > > > >> > >> >> > > > > > > > > > > > > > > is > > > > > >> > >> >> > > > > > > > > > > > > > > > > > that > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > both "Task > Off-Heap > > > > > Memory" > > > > > >> and > > > > > >> > >> "JVM > > > > > >> > >> >> > > > > Overhead" > > > > > >> > >> >> > > > > > > are > > > > > >> > >> >> > > > > > > > > not > > > > > >> > >> >> > > > > > > > > > > easy > > > > > >> > >> >> > > > > > > > > > > > > to > > > > > >> > >> >> > > > > > > > > > > > > > > > > config. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > For > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > alternative 2, > users > > > > might > > > > > >> > >> configure > > > > > >> > >> >> > them > > > > > >> > >> >> > > > > > higher > > > > > >> > >> >> > > > > > > > than > > > > > >> > >> >> > > > > > > > > > > what > > > > > >> > >> >> > > > > > > > > > > > > > > actually > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > needed, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > just to avoid > > getting > > > a > > > > > >> direct > > > > > >> > >> OOM. > > > > > >> > >> >> For > > > > > >> > >> >> > > > > > > alternative > > > > > >> > >> >> > > > > > > > > 3, > > > > > >> > >> >> > > > > > > > > > > > users > > > > > >> > >> >> > > > > > > > > > > > > do > > > > > >> > >> >> > > > > > > > > > > > > > > not > > > > > >> > >> >> > > > > > > > > > > > > > > > > get > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > direct OOM, so > they > > > may > > > > > not > > > > > >> > >> config > > > > > >> > >> >> the > > > > > >> > >> >> > > two > > > > > >> > >> >> > > > > > > options > > > > > >> > >> >> > > > > > > > > > > > > aggressively > > > > > >> > >> >> > > > > > > > > > > > > > > > high. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > > > But > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > the consequences > are > > > > risks > > > > > >> of > > > > > >> > >> >> overall > > > > > >> > >> >> > > > > container > > > > > >> > >> >> > > > > > > > > memory > > > > > >> > >> >> > > > > > > > > > > > usage > > > > > >> > >> >> > > > > > > > > > > > > > > > exceeds > > > > > >> > >> >> > > > > > > > > > > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > budget. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, > 2019 > > > at > > > > > >> 9:39 AM > > > > > >> > >> Till > > > > > >> > >> >> > > > > Rohrmann < > > > > > >> > >> >> > > > > > > > > > > > > > > > [hidden email]> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Thanks for > > proposing > > > > > this > > > > > >> > FLIP > > > > > >> > >> >> > Xintong. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > All in all I > think > > > it > > > > > >> already > > > > > >> > >> >> looks > > > > > >> > >> >> > > quite > > > > > >> > >> >> > > > > > good. > > > > > >> > >> >> > > > > > > > > > > > Concerning > > > > > >> > >> >> > > > > > > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > > > > > first > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > open > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > question about > > > > > allocating > > > > > >> > >> memory > > > > > >> > >> >> > > > segments, > > > > > >> > >> >> > > > > I > > > > > >> > >> >> > > > > > > was > > > > > >> > >> >> > > > > > > > > > > > wondering > > > > > >> > >> >> > > > > > > > > > > > > > > > whether > > > > > >> > >> >> > > > > > > > > > > > > > > > > > this > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > is > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > strictly > necessary > > > to > > > > do > > > > > >> in > > > > > >> > the > > > > > >> > >> >> > context > > > > > >> > >> >> > > > of > > > > > >> > >> >> > > > > > this > > > > > >> > >> >> > > > > > > > > FLIP > > > > > >> > >> >> > > > > > > > > > or > > > > > >> > >> >> > > > > > > > > > > > > > whether > > > > > >> > >> >> > > > > > > > > > > > > > > > > this > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > could > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > be done as a > > follow > > > > up? 
> > > > > >> > Without > > > > > >> > >> >> > knowing > > > > > >> > >> >> > > > all > > > > > >> > >> >> > > > > > > > > details, > > > > > >> > >> >> > > > > > > > > > I > > > > > >> > >> >> > > > > > > > > > > > > would > > > > > >> > >> >> > > > > > > > > > > > > > be > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > concerned > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > that we would > > widen > > > > the > > > > > >> scope > > > > > >> > >> of > > > > > >> > >> >> this > > > > > >> > >> >> > > > FLIP > > > > > >> > >> >> > > > > > too > > > > > >> > >> >> > > > > > > > much > > > > > >> > >> >> > > > > > > > > > > > because > > > > > >> > >> >> > > > > > > > > > > > > > we > > > > > >> > >> >> > > > > > > > > > > > > > > > > would > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > have > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > to touch all the > > > > > existing > > > > > >> > call > > > > > >> > >> >> sites > > > > > >> > >> >> > of > > > > > >> > >> >> > > > the > > > > > >> > >> >> > > > > > > > > > > MemoryManager > > > > > >> > >> >> > > > > > > > > > > > > > where > > > > > >> > >> >> > > > > > > > > > > > > > > > we > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > allocate > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > memory segments > > > (this > > > > > >> should > > > > > >> > >> >> mainly > > > > > >> > >> >> > be > > > > > >> > >> >> > > > > batch > > > > > >> > >> >> > > > > > > > > > > operators). > > > > > >> > >> >> > > > > > > > > > > > > The > > > > > >> > >> >> > > > > > > > > > > > > > > > > addition > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > of > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > the memory > > > reservation > > > > > >> call > > > > > >> > to > > > > > >> > >> the > > > > > >> > >> >> > > > > > > MemoryManager > > > > > >> > >> >> > > > > > > > > > should > > > > > >> > >> >> > > > > > > > > > > > not > > > > > >> > >> >> > > > > > > > > > > > > > be > > > > > >> > >> >> > > > > > > > > > > > > > > > > > affected > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > by > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > this and I would > > > hope > > > > > that > > > > > >> > >> this is > > > > > >> > >> >> > the > > > > > >> > >> >> > > > only > > > > > >> > >> >> > > > > > > point > > > > > >> > >> >> > > > > > > > > of > > > > > >> > >> >> > > > > > > > > > > > > > > interaction > > > > > >> > >> >> > > > > > > > > > > > > > > > a > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > streaming job > > would > > > > have > > > > > >> with > > > > > >> > >> the > > > > > >> > >> >> > > > > > > MemoryManager. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Concerning the > > > second > > > > > open > > > > > >> > >> >> question > > > > > >> > >> >> > > about > > > > > >> > >> >> > > > > > > setting > > > > > >> > >> >> > > > > > > > > or > > > > > >> > >> >> > > > > > > > > > > not > > > > > >> > >> >> > > > > > > > > > > > > > > setting > > > > > >> > >> >> > > > > > > > > > > > > > > > a > > > > > >> > >> >> > > > > > > > > > > > > > > > > > max > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > direct memory > > > limit, I > > > > > >> would > > > > > >> > >> also > > > > > >> > >> >> be > > > > > >> > >> >> > > > > > interested > > > > > >> > >> >> > > > > > > > why > > > > > >> > >> >> > > > > > > > > > > Yang > > > > > >> > >> >> > > > > > > > > > > > > Wang > > > > > >> > >> >> > > > > > > > > > > > > > > > > thinks > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > leaving it open > > > would > > > > be > > > > > >> > best. > > > > > >> > >> My > > > > > >> > >> >> > > concern > > > > > >> > >> >> > > > > > about > > > > > >> > >> >> > > > > > > > > this > > > > > >> > >> >> > > > > > > > > > > > would > > > > > >> > >> >> > > > > > > > > > > > > be > > > > > >> > >> >> > > > > > > > > > > > > > > > that > > > > > >> > >> >> > > > > > > > > > > > > > > > > we > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > would > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > be in a similar > > > > > situation > > > > > >> as > > > > > >> > we > > > > > >> > >> >> are > > > > > >> > >> >> > now > > > > > >> > >> >> > > > > with > > > > > >> > >> >> > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > > > > > RocksDBStateBackend. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > If > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > the different > > memory > > > > > pools > > > > > >> > are > > > > > >> > >> not > > > > > >> > >> >> > > > clearly > > > > > >> > >> >> > > > > > > > > separated > > > > > >> > >> >> > > > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > > can > > > > > >> > >> >> > > > > > > > > > > > > > > > spill > > > > > >> > >> >> > > > > > > > > > > > > > > > > > over > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > to > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > a different > pool, > > > then > > > > > it > > > > > >> is > > > > > >> > >> quite > > > > > >> > >> >> > hard > > > > > >> > >> >> > > > to > > > > > >> > >> >> > > > > > > > > understand > > > > > >> > >> >> > > > > > > > > > > > what > > > > > >> > >> >> > > > > > > > > > > > > > > > exactly > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > causes a > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > process to get > > > killed > > > > > for > > > > > >> > using > > > > > >> > >> >> too > > > > > >> > >> >> > > much > > > > > >> > >> >> > > > > > > memory. > > > > > >> > >> >> > > > > > > > > This > > > > > >> > >> >> > > > > > > > > > > > could > > > > > >> > >> >> > > > > > > > > > > > > > > then > > > > > >> > >> >> > > > > > > > > > > > > > > > > > easily > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > lead to a > similar > > > > > >> situation > > > > > >> > >> what > > > > > >> > >> >> we > > > > > >> > >> >> > > have > > > > > >> > >> >> > > > > with > > > > > >> > >> >> > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > > cutoff-ratio. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > So > > > > > >> > >> >> > > > > > > > > > > > > > > > > > why > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > not > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > setting a sane > > > default > > > > > >> value > > > > > >> > >> for > > > > > >> > >> >> max > > > > > >> > >> >> > > > direct > > > > > >> > >> >> > > > > > > > memory > > > > > >> > >> >> > > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > > giving > > > > > >> > >> >> > > > > > > > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > > > > > > user > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > an > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > option to > increase > > > it > > > > if > > > > > >> he > > > > > >> > >> runs > > > > > >> > >> >> into > > > > > >> > >> >> > > an > > > > > >> > >> >> > > > > OOM. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > @Xintong, how > > would > > > > > >> > >> alternative 2 > > > > > >> > >> >> > lead > > > > > >> > >> >> > > to > > > > > >> > >> >> > > > > > lower > > > > > >> > >> >> > > > > > > > > > memory > > > > > >> > >> >> > > > > > > > > > > > > > > > utilization > > > > > >> > >> >> > > > > > > > > > > > > > > > > > than > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > alternative 3 > > where > > > we > > > > > set > > > > > >> > the > > > > > >> > >> >> direct > > > > > >> > >> >> > > > > memory > > > > > >> > >> >> > > > > > > to a > > > > > >> > >> >> > > > > > > > > > > higher > > > > > >> > >> >> > > > > > > > > > > > > > value? > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Till > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, > > 2019 > > > at > > > > > >> 9:12 > > > > > >> > AM > > > > > >> > >> >> > Xintong > > > > > >> > >> >> > > > > Song < > > > > > >> > >> >> > > > > > > > > > > > > > > > [hidden email] > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Thanks for the > > > > > feedback, > > > > > >> > >> Yang. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Regarding your > > > > > comments: > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > *Native and > > Direct > > > > > >> Memory* > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > I think > setting > > a > > > > very > > > > > >> > large > > > > > >> > >> max > > > > > >> > >> >> > > direct > > > > > >> > >> >> > > > > > > memory > > > > > >> > >> >> > > > > > > > > size > > > > > >> > >> >> > > > > > > > > > > > > > > definitely > > > > > >> > >> >> > > > > > > > > > > > > > > > > has > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > some > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > good sides. 
> > E.g., > > > we > > > > > do > > > > > >> not > > > > > >> > >> >> worry > > > > > >> > >> >> > > about > > > > > >> > >> >> > > > > > > direct > > > > > >> > >> >> > > > > > > > > OOM, > > > > > >> > >> >> > > > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > > we > > > > > >> > >> >> > > > > > > > > > > > > > > > don't > > > > > >> > >> >> > > > > > > > > > > > > > > > > > even > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > need > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > to allocate > > > managed > > > > / > > > > > >> > network > > > > > >> > >> >> > memory > > > > > >> > >> >> > > > with > > > > > >> > >> >> > > > > > > > > > > > > > Unsafe.allocate() . > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > However, there > > are > > > > > also > > > > > >> > some > > > > > >> > >> >> down > > > > > >> > >> >> > > sides > > > > > >> > >> >> > > > > of > > > > > >> > >> >> > > > > > > > doing > > > > > >> > >> >> > > > > > > > > > > this. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > - One > thing I > > > can > > > > > >> think > > > > > >> > >> of is > > > > > >> > >> >> > that > > > > > >> > >> >> > > > if > > > > > >> > >> >> > > > > a > > > > > >> > >> >> > > > > > > task > > > > > >> > >> >> > > > > > > > > > > > executor > > > > > >> > >> >> > > > > > > > > > > > > > > > > container > > > > > >> > >> >> > > > > > > > > > > > > > > > > > is > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > killed due > to > > > > > >> overusing > > > > > >> > >> >> memory, > > > > > >> > >> >> > it > > > > > >> > >> >> > > > > could > > > > > >> > >> >> > > > > > > be > > > > > >> > >> >> > > > > > > > > hard > > > > > >> > >> >> > > > > > > > > > > for > > > > > >> > >> >> > > > > > > > > > > > > use > > > > > >> > >> >> > > > > > > > > > > > > > > to > > > > > >> > >> >> > > > > > > > > > > > > > > > > know > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > which > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > part > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > of the > memory > > > is > > > > > >> > overused. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > - Another > > down > > > > side > > > > > >> is > > > > > >> > >> that > > > > > >> > >> >> the > > > > > >> > >> >> > > JVM > > > > > >> > >> >> > > > > > never > > > > > >> > >> >> > > > > > > > > > trigger > > > > > >> > >> >> > > > > > > > > > > GC > > > > > >> > >> >> > > > > > > > > > > > > due > > > > > >> > >> >> > > > > > > > > > > > > > > to > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > reaching > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > max > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > direct > memory > > > > > limit, > > > > > >> > >> because > > > > > >> > >> >> the > > > > > >> > >> >> > > > limit > > > > > >> > >> >> > > > > > is > > > > > >> > >> >> > > > > > > > too > > > > > >> > >> >> > > > > > > > > > high > > > > > >> > >> >> > > > > > > > > > > > to > > > > > >> > >> >> > > > > > > > > > > > > be > > > > > >> > >> >> > > > > > > > > > > > > > > > > > reached. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > That > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > means we > kind > > > of > > > > > >> relay > > > > > >> > on > > > > > >> > >> >> heap > > > > > >> > >> >> > > > memory > > > > > >> > >> >> > > > > to > > > > > >> > >> >> > > > > > > > > trigger > > > > > >> > >> >> > > > > > > > > > > GC > > > > > >> > >> >> > > > > > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > > > > > > release > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > direct > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory. > That > > > > could > > > > > >> be a > > > > > >> > >> >> problem > > > > > >> > >> >> > in > > > > > >> > >> >> > > > > cases > > > > > >> > >> >> > > > > > > > where > > > > > >> > >> >> > > > > > > > > > we > > > > > >> > >> >> > > > > > > > > > > > have > > > > > >> > >> >> > > > > > > > > > > > > > > more > > > > > >> > >> >> > > > > > > > > > > > > > > > > > direct > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > memory > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > usage but > not > > > > > enough > > > > > >> > heap > > > > > >> > >> >> > activity > > > > > >> > >> >> > > > to > > > > > >> > >> >> > > > > > > > trigger > > > > > >> > >> >> > > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > GC. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Maybe you can > > > share > > > > > your > > > > > >> > >> reasons > > > > > >> > >> >> > for > > > > > >> > >> >> > > > > > > preferring > > > > > >> > >> >> > > > > > > > > > > > setting a > > > > > >> > >> >> > > > > > > > > > > > > > > very > > > > > >> > >> >> > > > > > > > > > > > > > > > > > large > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > value, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > if there are > > > > anything > > > > > >> else > > > > > >> > I > > > > > >> > >> >> > > > overlooked. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > *Memory > > > Calculation* > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > If there is > any > > > > > conflict > > > > > >> > >> between > > > > > >> > >> >> > > > multiple > > > > > >> > >> >> > > > > > > > > > > configuration > > > > > >> > >> >> > > > > > > > > > > > > > that > > > > > >> > >> >> > > > > > > > > > > > > > > > user > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > explicitly > > > > specified, > > > > > I > > > > > >> > >> think we > > > > > >> > >> >> > > should > > > > > >> > >> >> > > > > > throw > > > > > >> > >> >> > > > > > > > an > > > > > >> > >> >> > > > > > > > > > > error. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > I think doing > > > > checking > > > > > >> on > > > > > >> > the > > > > > >> > >> >> > client > > > > > >> > >> >> > > > side > > > > > >> > >> >> > > > > > is > > > > > >> > >> >> > > > > > > a > > > > > >> > >> >> > > > > > > > > good > > > > > >> > >> >> > > > > > > > > > > > idea, > > > > > >> > >> >> > > > > > > > > > > > > > so > > > > > >> > >> >> > > > > > > > > > > > > > > > that > > > > > >> > >> >> > > > > > > > > > > > > > > > > > on > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Yarn / > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > K8s we can > > > discover > > > > > the > > > > > >> > >> problem > > > > > >> > >> >> > > before > > > > > >> > >> >> > > > > > > > submitting > > > > > >> > >> >> > > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > Flink > > > > > >> > >> >> > > > > > > > > > > > > > > > > > cluster, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > which > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > is always a > good > > > > > thing. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > But we can not > > > only > > > > > >> rely on > > > > > >> > >> the > > > > > >> > >> >> > > client > > > > > >> > >> >> > > > > side > > > > > >> > >> >> > > > > > > > > > checking, > > > > > >> > >> >> > > > > > > > > > > > > > because > > > > > >> > >> >> > > > > > > > > > > > > > > > for > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > standalone > > cluster > > > > > >> > >> TaskManagers > > > > > >> > >> >> on > > > > > >> > >> >> > > > > > different > > > > > >> > >> >> > > > > > > > > > machines > > > > > >> > >> >> > > > > > > > > > > > may > > > > > >> > >> >> > > > > > > > > > > > > > > have > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > different > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > configurations > > and > > > > the > > > > > >> > client > > > > > >> > >> >> does > > > > > >> > >> >> > > see > > > > > >> > >> >> > > > > > that. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > What do you > > think? > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, > > > 2019 > > > > at > > > > > >> 5:09 > > > > > >> > >> PM > > > > > >> > >> >> Yang > > > > > >> > >> >> > > > Wang > > > > > >> > >> >> > > > > < > > > > > >> > >> >> > > > > > > > > > > > > > > > [hidden email]> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for > > your > > > > > >> detailed > > > > > >> > >> >> > proposal. 
> > > > > >> > >> >> > > > > After > > > > > >> > >> >> > > > > > > all > > > > > >> > >> >> > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > memory > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > configuration > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > are > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, > it > > > > will > > > > > be > > > > > >> > more > > > > > >> > >> >> > > powerful > > > > > >> > >> >> > > > to > > > > > >> > >> >> > > > > > > > control > > > > > >> > >> >> > > > > > > > > > the > > > > > >> > >> >> > > > > > > > > > > > > flink > > > > > >> > >> >> > > > > > > > > > > > > > > > > memory > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > just have > few > > > > > >> questions > > > > > >> > >> about > > > > > >> > >> >> it. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native > > and > > > > > Direct > > > > > >> > >> Memory > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not > > > > > >> differentiate > > > > > >> > >> user > > > > > >> > >> >> > direct > > > > > >> > >> >> > > > > > memory > > > > > >> > >> >> > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > native > > > > > >> > >> >> > > > > > > > > > > > > > > memory. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > They > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > are > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > all > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > included in > > task > > > > > >> off-heap > > > > > >> > >> >> memory. > > > > > >> > >> >> > > > > Right? > > > > > >> > >> >> > > > > > > So i > > > > > >> > >> >> > > > > > > > > > don’t > > > > > >> > >> >> > > > > > > > > > > > > think > > > > > >> > >> >> > > > > > > > > > > > > > > we > > > > > >> > >> >> > > > > > > > > > > > > > > > > > could > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > not > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > set > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > the > > > > > >> > -XX:MaxDirectMemorySize > > > > > >> > >> >> > > > properly. I > > > > > >> > >> >> > > > > > > > prefer > > > > > >> > >> >> > > > > > > > > > > > leaving > > > > > >> > >> >> > > > > > > > > > > > > > it a > > > > > >> > >> >> > > > > > > > > > > > > > > > > very > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > large > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > value. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory > > > > > >> Calculation > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum > of > > > and > > > > > >> > >> fine-grained > > > > > >> > >> >> > > > > > > memory(network > > > > > >> > >> >> > > > > > > > > > > memory, > > > > > >> > >> >> > > > > > > > > > > > > > > managed > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > memory, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > etc.) > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > is larger > than > > > > total > > > > > >> > >> process > > > > > >> > >> >> > > memory, > > > > > >> > >> >> > > > > how > > > > > >> > >> >> > > > > > do > > > > > >> > >> >> > > > > > > > we > > > > > >> > >> >> > > > > > > > > > deal > > > > > >> > >> >> > > > > > > > > > > > > with > > > > > >> > >> >> > > > > > > > > > > > > > > this > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > situation? > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > Do > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > we need to > > check > > > > the > > > > > >> > memory > > > > > >> > >> >> > > > > configuration > > > > > >> > >> >> > > > > > > in > > > > > >> > >> >> > > > > > > > > > > client? > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong > Song < > > > > > >> > >> >> > > [hidden email]> > > > > > >> > >> >> > > > > > > > > > 于2019年8月7日周三 > > > > > >> > >> >> > > > > > > > > > > > > > > 下午10:14写道: > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi > everyone, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would > > like > > > to > > > > > >> start > > > > > >> > a > > > > > >> > >> >> > > discussion > > > > > >> > >> >> > > > > > > thread > > > > > >> > >> >> > > > > > > > on > > > > > >> > >> >> > > > > > > > > > > > > "FLIP-49: > > > > > >> > >> >> > > > > > > > > > > > > > > > > Unified > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > Memory > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > Configuration > > > > for > > > > > >> > >> >> > > > TaskExecutors"[1], > > > > > >> > >> >> > > > > > > where > > > > > >> > >> >> > > > > > > > we > > > > > >> > >> >> > > > > > > > > > > > > describe > > > > > >> > >> >> > > > > > > > > > > > > > > how > > > > > >> > >> >> > > > > > > > > > > > > > > > to > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > improve > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > > > > memory > > > > > >> > >> >> > > configurations. 
> > > > > >> > >> >> > > > > The > > > > > >> > >> >> > > > > > > > FLIP > > > > > >> > >> >> > > > > > > > > > > > document > > > > > >> > >> >> > > > > > > > > > > > > > is > > > > > >> > >> >> > > > > > > > > > > > > > > > > mostly > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > based > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > on > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > an > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > early > design > > > > > "Memory > > > > > >> > >> >> Management > > > > > >> > >> >> > > and > > > > > >> > >> >> > > > > > > > > > Configuration > > > > > >> > >> >> > > > > > > > > > > > > > > > > Reloaded"[2] > > > > > >> > >> >> > > > > > > > > > > > > > > > > > by > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > with > updates > > > > from > > > > > >> > >> follow-up > > > > > >> > >> >> > > > > discussions > > > > > >> > >> >> > > > > > > > both > > > > > >> > >> >> > > > > > > > > > > online > > > > > >> > >> >> > > > > > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > > > > > > > offline. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP > > > > > addresses > > > > > >> > >> several > > > > > >> > >> >> > > > > > shortcomings > > > > > >> > >> >> > > > > > > of > > > > > >> > >> >> > > > > > > > > > > current > > > > > >> > >> >> > > > > > > > > > > > > > > (Flink > > > > > >> > >> >> > > > > > > > > > > > > > > > > 1.9) > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > > > > memory > > > > > >> > >> >> > > configuration. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > Different > > > > > >> > >> configuration > > > > > >> > >> >> > for > > > > > >> > >> >> > > > > > > Streaming > > > > > >> > >> >> > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > Batch. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > Complex > > > and > > > > > >> > >> difficult > > > > > >> > >> >> > > > > > configuration > > > > > >> > >> >> > > > > > > of > > > > > >> > >> >> > > > > > > > > > > RocksDB > > > > > >> > >> >> > > > > > > > > > > > > in > > > > > >> > >> >> > > > > > > > > > > > > > > > > > Streaming. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > > > Complicated, > > > > > >> > >> uncertain > > > > > >> > >> >> and > > > > > >> > >> >> > > > hard > > > > > >> > >> >> > > > > to > > > > > >> > >> >> > > > > > > > > > > understand. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key > changes > > to > > > > > solve > > > > > >> > the > > > > > >> > >> >> > problems > > > > > >> > >> >> > > > can > > > > > >> > >> >> > > > > > be > > > > > >> > >> >> > > > > > > > > > > summarized > > > > > >> > >> >> > > > > > > > > > > > > as > > > > > >> > >> >> > > > > > > > > > > > > > > > > follows. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > Extend > > > > memory > > > > > >> > >> manager > > > > > >> > >> >> to > > > > > >> > >> >> > > also > > > > > >> > >> >> > > > > > > account > > > > > >> > >> >> > > > > > > > > for > > > > > >> > >> >> > > > > > > > > > > > memory > > > > > >> > >> >> > > > > > > > > > > > > > > usage > > > > > >> > >> >> > > > > > > > > > > > > > > > > by > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > state > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > backends. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > Modify > > > how > > > > > >> > >> TaskExecutor > > > > > >> > >> >> > > memory > > > > > >> > >> >> > > > > is > > > > > >> > >> >> > > > > > > > > > > partitioned > > > > > >> > >> >> > > > > > > > > > > > > > > > accounted > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > individual > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > memory > > > > > >> reservations > > > > > >> > >> and > > > > > >> > >> >> > pools. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > Simplify > > > > > memory > > > > > >> > >> >> > > configuration > > > > > >> > >> >> > > > > > > options > > > > > >> > >> >> > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > > > > calculations > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > logics. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please > find > > > more > > > > > >> > details > > > > > >> > >> in > > > > > >> > >> >> the > > > > > >> > >> >> > > > FLIP > > > > > >> > >> >> > > > > > wiki > > > > > >> > >> >> > > > > > > > > > > document > > > > > >> > >> >> > > > > > > > > > > > > [1]. > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please > note > > > > that > > > > > >> the > > > > > >> > >> early > > > > > >> > >> >> > > design > > > > > >> > >> >> > > > > doc > > > > > >> > >> >> > > > > > > [2] > > > > > >> > >> >> > > > > > > > is > > > > > >> > >> >> > > > > > > > > > out > > > > > >> > >> >> > > > > > > > > > > > of > > > > > >> > >> >> > > > > > > > > > > > > > > sync, > > > > > >> > >> >> > > > > > > > > > > > > > > > > and > > > > > >> > >> >> > > > > > > > > > > > > > > > > > it > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > is > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > appreciated > > to > > > > > have > > > > > >> the > > > > > >> > >> >> > > discussion > > > > > >> > >> >> > > > in > > > > > >> > >> >> > > > > > > this > > > > > >> > >> >> > > > > > > > > > > mailing > > > > > >> > >> >> > > > > > > > > > > > > list > > > > > >> > >> >> > > > > > > > > > > > > > > > > > thread.) > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking > > > forward > > > > to > > > > > >> your > > > > > >> > >> >> > > feedbacks. 
> > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong > Song > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> > > > > > >> > > > > > > >> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> >> > > > > > >> > >> > > > > > >> > > > > > > >> > > > > > > > > > > > > > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > 
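To make the arithmetic in the quoted alternative-2 discussion a bit more concrete, here is a small illustrative sketch; the class name and the concrete sizes are made up for the example and are not Flink code or defaults:

    public class MaxDirectMemoryExample {
        public static void main(String[] args) {
            final long totalProcessMb = 1000; // ~1 GB total process memory, as in the quoted example
            long taskOffHeapMb = 100;         // "Task Off-Heap Memory" (illustrative value)
            long jvmOverheadMb = 100;         // "JVM Overhead" (illustrative value)

            // Alternative 2: -XX:MaxDirectMemorySize covers only these two components.
            long maxDirectMb = taskOffHeapMb + jvmOverheadMb; // 200 MB
            System.out.println("other pools get " + (totalProcessMb - maxDirectMb) + " MB");

            // If the user raises the direct memory budget to 250 MB to stay clear of a
            // direct OOM, every other pool has to shrink, since the process total is fixed.
            maxDirectMb = 250;
            System.out.println("other pools get " + (totalProcessMb - maxDirectMb) + " MB"); // 750 MB

            // Alternative 3 would instead set -XX:MaxDirectMemorySize to a much larger value,
            // trading the direct OOM for the risk of exceeding the container budget.
        }
    }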
@Xintong
Just to understand, what kind of overhead do you mean by creating the memory manager for every dynamic slot? We agreed not to preallocate segments anymore and there was no plan to do any segment caching at the moment, AFAIK. This means the segments are just garbage collected upon release and not reused.

If we decide to go with caching or revoking later and share pools among slots, it would already mean mixed segment/reservation types of memory allocation in general (like batch and streaming). Then I think the memory manager would need some input from tasks, because it is hard for Flink to decide how much to cache and which pool to allocate from eventually. As I understand, tasks do not really need this now.

That is why I think having a memory manager per slot will give more security by isolating slots with relatively small effort. If there is no preallocation/caching, then creating the memory manager is almost a no-op; later we can reconsider it if needed. What do you think?

Thanks,
Andrey

On Sat, Sep 14, 2019 at 8:11 AM Xintong Song <[hidden email]> wrote:

> @Andrey,
>
> If we speak in the condition of dynamic slot allocation, then DataSet tasks should always request slots with unknown resource profiles. Then the allocated slots should have 1/n of the on-heap managed memory and 1/n of the off-heap managed memory of the task executor, where n is the configured 'numberOfSlot'. Given that DataSet operators only allocate segments (no reservation), and do not care whether the segments are on-heap or off-heap, I think it makes sense to wrap both pools for DataSet operators, with respect to the on-heap/off-heap quota of the allocated slot.
>
> I'm not sure whether we want to create one memory manager per slot. I think we can limit the quota of each slot with one memory manager per task executor. Given that dynamic slot allocation dynamically creates and destroys slots, I think having one memory manager per task executor may save the overhead of frequently creating and destroying memory managers, and provide the chance for reusing memory segments across slots.
>
> Thank you~
> Xintong Song
>
> On Fri, Sep 13, 2019 at 11:47 PM Andrey Zagrebin <[hidden email]> wrote:
>
> > Hi Xintong,
> >
> > True, there would be no regression if only one type of memory is configured. This can be a problem only for the old jobs running in a newly configured cluster.
> >
> > About the pool type precedence, in general, it should not matter for the users which type the segments have. The first implementation can be just to pull from any pool, e.g. empty one pool first and then the other, or some other random pulling. This might be a problem if we mix segment allocations and reservation of memory chunks from the same memory manager. The reservation will usually be for a certain type of memory, and then the task will probably have to also decide from which pool to allocate the segments.
> >
> > I would suggest we create a memory manager per slot and give it the memory limit of the slot; then we do not have this kind of mixed operation, because DataSet/Batch jobs need only segment memory allocations and streaming jobs need only memory chunks for state backends, as I understand the current plan. I would suggest we look at it if we have the mixed operations at some point and it becomes a problem.
> >
> > Thanks,
> > Andrey
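As a rough illustration of the 1/n slot share discussed above, a sketch could look like the following; the class and method names are invented here purely for illustration and are not part of Flink:

    public class SlotManagedMemoryShare {
        final long onHeapBytes;
        final long offHeapBytes;

        SlotManagedMemoryShare(long onHeapBytes, long offHeapBytes) {
            this.onHeapBytes = onHeapBytes;
            this.offHeapBytes = offHeapBytes;
        }

        // A slot requested with an unknown resource profile gets 1/n of each managed
        // memory pool of the task executor, where n is the configured number of slots.
        static SlotManagedMemoryShare defaultShare(long teOnHeapBytes, long teOffHeapBytes, int numberOfSlots) {
            return new SlotManagedMemoryShare(teOnHeapBytes / numberOfSlots, teOffHeapBytes / numberOfSlots);
        }
    }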
> > On Fri, Sep 13, 2019 at 5:24 PM Andrey Zagrebin <[hidden email]> wrote:
> >
> > > ---------- Forwarded message ---------
> > > From: Xintong Song <[hidden email]>
> > > Date: Thu, Sep 12, 2019 at 4:21 AM
> > > Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors
> > > To: dev <[hidden email]>
> > >
> > > Hi Andrey,
> > >
> > > Thanks for bringing this up.
> > >
> > > If I understand correctly, this issue only occurs where the cluster is configured with both on-heap and off-heap memory. There should be no regression for clusters configured in the old way (either all on-heap or all off-heap).
> > >
> > > I also agree that it would be good if the DataSet API jobs can use both memory types. The only question I can see is: from which pool (heap / off-heap) should we allocate memory for DataSet API operators? Do we always prioritize one pool over the other? Or do we always prioritize the pool with more available memory left?
> > >
> > > Thank you~
> > > Xintong Song
> > >
> > > On Tue, Sep 10, 2019 at 8:15 PM Andrey Zagrebin <[hidden email]> wrote:
> > >
> > > > Hi All,
> > > >
> > > > While looking more into the implementation details of Step 4, we realized during some offline discussions with @Till that there can be a performance degradation for the batch DataSet API if we simply continue to pull memory from the pool according to the legacy option taskmanager.memory.off-heap.
> > > >
> > > > The reason is that if the cluster is newly configured to statically split heap/off-heap (not, as previously, either heap or off-heap), then the batch DataSet API jobs will be able to use only one type of memory, although it does not really matter where the memory segments come from and potentially batch jobs can use both. Also, currently the DataSet API does not result in absolute resource requirements and its batch jobs will always get a default share of TM resources.
> > > >
> > > > The suggestion is that we let the batch tasks of the DataSet API pull from both pools according to their fair slot share of each memory type. For that we can have a special wrapping view of both pools which will pull segments (can be randomly) according to the slot limits. The view can wrap the TM level memory pools and be given to the Task.
> > > >
> > > > Best,
> > > > Andrey
> > > >
> > > > On Mon, Sep 2, 2019 at 1:35 PM Xintong Song <[hidden email]> wrote:
> > > >
> > > > > Thanks for your comments, Andrey.
> > > > >
> > > > > - Regarding Task Off-Heap Memory, I think you're right that the user needs to make sure that direct memory and native memory together used by the user code (external libs) do not exceed the configured value. As far as I can think of, there is nothing we can do about it.
> > > > >
> > > > > I addressed the rest of your comment in the wiki page [1]. Please take a look.
> > > > >
> > > > > Thank you~
> > > > > Xintong Song
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
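The "special wrapping view" of both pools suggested above could look roughly like this sketch; the class name and the drain-one-pool-first strategy are only illustrative assumptions, not the decided design:

    import java.util.function.Supplier;

    public class DualPoolSegmentView {
        private final Supplier<Object> onHeapPool;   // stands in for the TM-level on-heap segment pool
        private final Supplier<Object> offHeapPool;  // stands in for the TM-level off-heap segment pool
        private long onHeapQuota;                    // slot's remaining share of on-heap segments
        private long offHeapQuota;                   // slot's remaining share of off-heap segments

        DualPoolSegmentView(Supplier<Object> onHeapPool, long onHeapQuota,
                            Supplier<Object> offHeapPool, long offHeapQuota) {
            this.onHeapPool = onHeapPool;
            this.onHeapQuota = onHeapQuota;
            this.offHeapPool = offHeapPool;
            this.offHeapQuota = offHeapQuota;
        }

        // Batch tasks do not care about the segment type, so serve from whichever pool
        // still has quota left for this slot.
        Object nextSegment() {
            if (onHeapQuota > 0) { onHeapQuota--; return onHeapPool.get(); }
            if (offHeapQuota > 0) { offHeapQuota--; return offHeapPool.get(); }
            throw new IllegalStateException("slot's share of both memory pools is exhausted");
        }
    }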
> > > > > On Mon, Sep 2, 2019 at 6:13 PM Andrey Zagrebin <[hidden email]> wrote:
> > > > >
> > > > > > EDIT: sorry for the confusion, I meant taskmanager.memory.off-heap instead of setting taskmanager.memory.preallocate
> > > > > >
> > > > > > On Mon, Sep 2, 2019 at 11:29 AM Andrey Zagrebin <[hidden email]> wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > @Xintong thanks a lot for driving the discussion. I also reviewed the FLIP and it looks quite good to me. Here are some comments:
> > > > > > >
> > > > > > > - One thing I wanted to discuss is the backwards-compatibility with the previous user setups. We could list which options we plan to deprecate. From the first glance it looks possible to provide the same/similar behaviour for the setups relying on the deprecated options. E.g. setting taskmanager.memory.preallocate to true could override the new taskmanager.memory.managed.offheap-fraction to 1 etc. At the moment the FLIP just states that in some cases it may require re-configuring of the cluster if migrated from prior versions. My suggestion is that we try to keep it backwards-compatible unless there is a good reason, like some major complication for the implementation.
> > > > > > >
> > > > > > > Also a couple of smaller things:
> > > > > > >
> > > > > > > - I suggest we remove TaskExecutorSpecifics from the FLIP and leave some general wording atm, like 'data structure to store' or 'utility classes'. When the classes are implemented, we put in the concrete class names. This way we can avoid confusion and stale documents.
> > > > > > >
> > > > > > > - As I understand, if a user task uses native memory (not direct memory, but e.g. unsafe.allocate or from an external lib), there will be no explicit guard against exceeding 'task off-heap memory'. Then the user should still explicitly make sure that her/his direct buffer allocation plus any other memory usages do not exceed the value announced as 'task off-heap'. I guess there is not so much that can be done about it except mentioning it in the docs, similar to controlling the heap state backend.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Andrey
> > > > > > >
> > > > > > > On Mon, Sep 2, 2019 at 10:07 AM Yang Wang <[hidden email]> wrote:
> > > > > > >
> > > > > > > > I also agree that all the configuration should be calculated outside of the TaskManager. So a full configuration should be generated before the TaskManager is started. Overriding the calculated configurations through -D now seems better.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yang
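A minimal sketch of the backwards-compatibility mapping suggested above; the new key name follows the FLIP draft and the fallback rule here is an assumption, not the decided behaviour:

    import java.util.Map;

    public class LegacyOffHeapOption {
        // If only the legacy boolean taskmanager.memory.off-heap is set, derive the new
        // off-heap fraction from it; an explicitly set new option always wins.
        static double managedOffHeapFraction(Map<String, String> conf) {
            String newKey = "taskmanager.memory.managed.offheap-fraction";
            if (conf.containsKey(newKey)) {
                return Double.parseDouble(conf.get(newKey));
            }
            boolean legacyOffHeap = Boolean.parseBoolean(conf.getOrDefault("taskmanager.memory.off-heap", "false"));
            return legacyOffHeap ? 1.0 : 0.0;
        }
    }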
> > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Best, > > > > > > >> > > > > > > >> Yang > > > > > > >> > > > > > > >> Xintong Song <[hidden email]> 于2019年9月2日周一 上午11:39写道: > > > > > > >> > > > > > > >> > I just updated the FLIP wiki page [1], with the following > > > changes: > > > > > > >> > > > > > > > >> > - Network memory uses JVM direct memory, and is accounted > > > when > > > > > > >> setting > > > > > > >> > JVM max direct memory size parameter. > > > > > > >> > - Use dynamic configurations (`-Dkey=value`) to pass > > > calculated > > > > > > >> memory > > > > > > >> > configs into TaskExecutors, instead of ENV variables. > > > > > > >> > - Remove 'supporting memory reservation' from the scope > of > > > this > > > > > > FLIP. > > > > > > >> > > > > > > > >> > @till @stephan, please take another look see if there are > any > > > > other > > > > > > >> > concerns. > > > > > > >> > > > > > > > >> > Thank you~ > > > > > > >> > > > > > > > >> > Xintong Song > > > > > > >> > > > > > > > >> > > > > > > > >> > [1] > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > >> > > > > > > > >> > On Mon, Sep 2, 2019 at 11:13 AM Xintong Song < > > > > [hidden email] > > > > > > > > > > > > >> > wrote: > > > > > > >> > > > > > > > >> > > Sorry for the late response. > > > > > > >> > > > > > > > > >> > > - Regarding the `TaskExecutorSpecifics` naming, let's > > discuss > > > > the > > > > > > >> detail > > > > > > >> > > in PR. > > > > > > >> > > - Regarding passing parameters into the `TaskExecutor`, +1 > > for > > > > > using > > > > > > >> > > dynamic configuration at the moment, given that there are > > more > > > > > > >> questions > > > > > > >> > to > > > > > > >> > > be discussed to have a general framework for overwriting > > > > > > >> configurations > > > > > > >> > > with ENV variables. > > > > > > >> > > - Regarding memory reservation, I double checked with Yu > and > > > he > > > > > will > > > > > > >> take > > > > > > >> > > care of it. > > > > > > >> > > > > > > > > >> > > Thank you~ > > > > > > >> > > > > > > > > >> > > Xintong Song > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann < > > > > > [hidden email] > > > > > > > > > > > > > >> > > wrote: > > > > > > >> > > > > > > > > >> > >> What I forgot to add is that we could tackle specifying > the > > > > > > >> > configuration > > > > > > >> > >> fully in an incremental way and that the full > specification > > > > > should > > > > > > be > > > > > > >> > the > > > > > > >> > >> desired end state. > > > > > > >> > >> > > > > > > >> > >> On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann < > > > > > > [hidden email]> > > > > > > >> > >> wrote: > > > > > > >> > >> > > > > > > >> > >> > I think our goal should be that the configuration is > > fully > > > > > > >> specified > > > > > > >> > >> when > > > > > > >> > >> > the process is started. By considering the internal > > > > calculation > > > > > > >> step > > > > > > >> > to > > > > > > >> > >> be > > > > > > >> > >> > rather validate existing values and calculate missing > > ones, > > > > > these > > > > > > >> two > > > > > > >> > >> > proposal shouldn't even conflict (given determinism). 
> > > > > > >> > >> > > > > > > > >> > >> > Since we don't want to change an existing > > flink-conf.yaml, > > > > > > >> specifying > > > > > > >> > >> the > > > > > > >> > >> > full configuration would require to pass in the options > > > > > > >> differently. > > > > > > >> > >> > > > > > > > >> > >> > One way could be the ENV variables approach. The reason > > why > > > > I'm > > > > > > >> trying > > > > > > >> > >> to > > > > > > >> > >> > exclude this feature from the FLIP is that I believe it > > > > needs a > > > > > > bit > > > > > > >> > more > > > > > > >> > >> > discussion. Just some questions which come to my mind: > > What > > > > > would > > > > > > >> be > > > > > > >> > the > > > > > > >> > >> > exact format (FLINK_KEY_NAME)? Would we support a dot > > > > separator > > > > > > >> which > > > > > > >> > is > > > > > > >> > >> > supported by some systems (FLINK.KEY.NAME)? If we > accept > > > the > > > > > dot > > > > > > >> > >> > separator what would be the order of precedence if > there > > > are > > > > > two > > > > > > >> ENV > > > > > > >> > >> > variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? > > > What > > > > is > > > > > > the > > > > > > >> > >> > precedence of env variable vs. dynamic configuration > > value > > > > > > >> specified > > > > > > >> > >> via -D? > > > > > > >> > >> > > > > > > > >> > >> > Another approach could be to pass in the dynamic > > > > configuration > > > > > > >> values > > > > > > >> > >> via > > > > > > >> > >> > `-Dkey=value` to the Flink process. For that we don't > > have > > > to > > > > > > >> change > > > > > > >> > >> > anything because the functionality already exists. > > > > > > >> > >> > > > > > > > >> > >> > Cheers, > > > > > > >> > >> > Till > > > > > > >> > >> > > > > > > > >> > >> > On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen < > > > > > [hidden email]> > > > > > > >> > wrote: > > > > > > >> > >> > > > > > > > >> > >> >> I see. Under the assumption of strict determinism that > > > > should > > > > > > >> work. > > > > > > >> > >> >> > > > > > > >> > >> >> The original proposal had this point "don't compute > > inside > > > > the > > > > > > TM, > > > > > > >> > >> compute > > > > > > >> > >> >> outside and supply a full config", because that > sounded > > > more > > > > > > >> > intuitive. > > > > > > >> > >> >> > > > > > > >> > >> >> On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann < > > > > > > >> [hidden email] > > > > > > >> > > > > > > > > >> > >> >> wrote: > > > > > > >> > >> >> > > > > > > >> > >> >> > My understanding was that before starting the Flink > > > > process > > > > > we > > > > > > >> > call a > > > > > > >> > >> >> > utility which calculates these values. I assume that > > > this > > > > > > >> utility > > > > > > >> > >> will > > > > > > >> > >> >> do > > > > > > >> > >> >> > the calculation based on a set of configured values > > > > (process > > > > > > >> > memory, > > > > > > >> > >> >> flink > > > > > > >> > >> >> > memory, network memory etc.). Assuming that these > > values > > > > > don't > > > > > > >> > differ > > > > > > >> > >> >> from > > > > > > >> > >> >> > the values with which the JVM is started, it should > be > > > > > > possible > > > > > > >> to > > > > > > >> > >> >> > recompute them in the Flink process in order to set > > the > > > > > > values. 
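[Editor's note] To make the questions Till raises above about env variable format and precedence vs. -D a bit more concrete, here is a rough sketch of one possible scheme (illustration only: the FLINK_ prefix, the underscore-to-dot mapping and the ordering file < env < -D are assumptions for this sketch, not decided behavior):

    import java.util.HashMap;
    import java.util.Map;

    public class ConfigOverrides {

        // FLINK_TASKMANAGER_MEMORY_SIZE -> taskmanager.memory.size
        // (keys that themselves contain '_' become ambiguous, one of the issues raised above)
        static String envToKey(String envName) {
            return envName.substring("FLINK_".length()).toLowerCase().replace('_', '.');
        }

        // Later sources overwrite earlier ones: flink-conf.yaml < env variables < -Dkey=value.
        static Map<String, String> resolve(
                Map<String, String> fromFile,
                Map<String, String> env,
                Map<String, String> dynamicProperties) {
            Map<String, String> config = new HashMap<>(fromFile);
            for (Map.Entry<String, String> e : env.entrySet()) {
                if (e.getKey().startsWith("FLINK_")) {
                    config.put(envToKey(e.getKey()), e.getValue());
                }
            }
            config.putAll(dynamicProperties); // -Dkey=value wins over env and file
            return config;
        }

        public static void main(String[] args) {
            Map<String, String> file = Map.of("taskmanager.memory.size", "1g");
            Map<String, String> env = Map.of("FLINK_TASKMANAGER_MEMORY_SIZE", "2g");
            Map<String, String> dyn = Map.of("taskmanager.memory.size", "3g");
            System.out.println(resolve(file, env, dyn)); // prints {taskmanager.memory.size=3g}
        }
    }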
> > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > > >> > >> >> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen < > > > > > > [hidden email] > > > > > > >> > > > > > > > >> > >> wrote: > > > > > > >> > >> >> > > > > > > > >> > >> >> > > When computing the values in the JVM process after > > it > > > > > > started, > > > > > > >> > how > > > > > > >> > >> >> would > > > > > > >> > >> >> > > you deal with values like Max Direct Memory, > > Metaspace > > > > > size. > > > > > > >> > native > > > > > > >> > >> >> > memory > > > > > > >> > >> >> > > reservation (reduce heap size), etc? All the > values > > > that > > > > > are > > > > > > >> > >> >> parameters > > > > > > >> > >> >> > to > > > > > > >> > >> >> > > the JVM process and that need to be supplied at > > > process > > > > > > >> startup? > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann < > > > > > > >> > >> [hidden email]> > > > > > > >> > >> >> > > wrote: > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > Thanks for the clarification. I have some more > > > > comments: > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > - I would actually split the logic to compute > the > > > > > process > > > > > > >> > memory > > > > > > >> > >> >> > > > requirements and storing the values into two > > things. > > > > > E.g. > > > > > > >> one > > > > > > >> > >> could > > > > > > >> > >> >> > name > > > > > > >> > >> >> > > > the former TaskExecutorProcessUtility and the > > > latter > > > > > > >> > >> >> > > > TaskExecutorProcessMemory. But we can discuss > this > > > on > > > > > the > > > > > > PR > > > > > > >> > >> since > > > > > > >> > >> >> it's > > > > > > >> > >> >> > > > just a naming detail. > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > - Generally, I'm not opposed to making > > configuration > > > > > > values > > > > > > >> > >> >> overridable > > > > > > >> > >> >> > > by > > > > > > >> > >> >> > > > ENV variables. I think this is a very good idea > > and > > > > > makes > > > > > > >> the > > > > > > >> > >> >> > > > configurability of Flink processes easier. > > However, > > > I > > > > > > think > > > > > > >> > that > > > > > > >> > >> >> adding > > > > > > >> > >> >> > > > this functionality should not be part of this > FLIP > > > > > because > > > > > > >> it > > > > > > >> > >> would > > > > > > >> > >> >> > > simply > > > > > > >> > >> >> > > > widen the scope unnecessarily. > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > The reasons why I believe it is unnecessary are > > the > > > > > > >> following: > > > > > > >> > >> For > > > > > > >> > >> >> Yarn > > > > > > >> > >> >> > > we > > > > > > >> > >> >> > > > already create write a flink-conf.yaml which > could > > > be > > > > > > >> populated > > > > > > >> > >> with > > > > > > >> > >> >> > the > > > > > > >> > >> >> > > > memory settings. For the other processes it > should > > > not > > > > > > make > > > > > > >> a > > > > > > >> > >> >> > difference > > > > > > >> > >> >> > > > whether the loaded Configuration is populated > with > > > the > > > > > > >> memory > > > > > > >> > >> >> settings > > > > > > >> > >> >> > > from > > > > > > >> > >> >> > > > ENV variables or by using > > TaskExecutorProcessUtility > > > > to > > > > > > >> compute > > > > > > >> > >> the > > > > > > >> > >> >> > > missing > > > > > > >> > >> >> > > > values from the loaded configuration. 
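[Editor's note] As an illustration of the split suggested above, a simplified sketch of a TaskExecutorProcessUtility computing a TaskExecutorProcessMemory (only the two names come from the mail; the option keys, the 0.4 fraction and the checks below are made-up placeholders, not the FLIP's actual derivation rules):

    import java.util.Map;
    import java.util.Optional;

    public class TaskExecutorProcessUtility {

        // Immutable holder for the derived sizes (all values in bytes).
        public static final class TaskExecutorProcessMemory {
            public final long totalProcess;
            public final long heap;
            public final long managed;
            TaskExecutorProcessMemory(long totalProcess, long heap, long managed) {
                this.totalProcess = totalProcess;
                this.heap = heap;
                this.managed = managed;
            }
        }

        // Validate values that are already configured, derive the ones that are missing.
        public static TaskExecutorProcessMemory computeOrValidate(Map<String, Long> config) {
            Long total = config.get("taskmanager.memory.total-process.size");
            if (total == null) {
                throw new IllegalArgumentException("total process memory must be configured");
            }
            long managed = Optional.ofNullable(config.get("taskmanager.memory.managed.size"))
                    .orElse((long) (total * 0.4)); // placeholder rule
            long heap = Optional.ofNullable(config.get("taskmanager.memory.heap.size"))
                    .orElse(total - managed);
            if (heap + managed > total) {
                throw new IllegalArgumentException("configured sizes exceed the total process memory");
            }
            return new TaskExecutorProcessMemory(total, heap, managed);
        }
    }

Because such a computation is deterministic, running it once in the startup utility and again inside the started JVM yields the same values as long as the inputs are identical, which is what keeps the two approaches discussed above compatible.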
If the > > latter > > > > > would > > > > > > >> not > > > > > > >> > be > > > > > > >> > >> >> > possible > > > > > > >> > >> >> > > > (wrong or missing configuration values), then we > > > > should > > > > > > not > > > > > > >> > have > > > > > > >> > >> >> been > > > > > > >> > >> >> > > able > > > > > > >> > >> >> > > > to actually start the process in the first > place. > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > - Concerning the memory reservation: I agree > with > > > you > > > > > that > > > > > > >> we > > > > > > >> > >> need > > > > > > >> > >> >> the > > > > > > >> > >> >> > > > memory reservation functionality to make > streaming > > > > jobs > > > > > > work > > > > > > >> > with > > > > > > >> > >> >> > > "managed" > > > > > > >> > >> >> > > > memory. However, w/o this functionality the > whole > > > Flip > > > > > > would > > > > > > >> > >> already > > > > > > >> > >> >> > > bring > > > > > > >> > >> >> > > > a good amount of improvements to our users when > > > > running > > > > > > >> batch > > > > > > >> > >> jobs. > > > > > > >> > >> >> > > > Moreover, by keeping the scope smaller we can > > > complete > > > > > the > > > > > > >> FLIP > > > > > > >> > >> >> faster. > > > > > > >> > >> >> > > > Hence, I would propose to address the memory > > > > reservation > > > > > > >> > >> >> functionality > > > > > > >> > >> >> > > as a > > > > > > >> > >> >> > > > follow up FLIP (which Yu is working on if I'm > not > > > > > > mistaken). > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > Cheers, > > > > > > >> > >> >> > > > Till > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang < > > > > > > >> > >> [hidden email]> > > > > > > >> > >> >> > > wrote: > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > Just add my 2 cents. > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > Using environment variables to override the > > > > > > configuration > > > > > > >> for > > > > > > >> > >> >> > different > > > > > > >> > >> >> > > > > taskmanagers is better. > > > > > > >> > >> >> > > > > We do not need to generate dedicated > > > flink-conf.yaml > > > > > for > > > > > > >> all > > > > > > >> > >> >> > > > taskmanagers. > > > > > > >> > >> >> > > > > A common flink-conf.yam and different > > environment > > > > > > >> variables > > > > > > >> > are > > > > > > >> > >> >> > enough. > > > > > > >> > >> >> > > > > By reducing the distributed cached files, it > > could > > > > > make > > > > > > >> > >> launching > > > > > > >> > >> >> a > > > > > > >> > >> >> > > > > taskmanager faster. > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > Stephan gives a good suggestion that we could > > move > > > > the > > > > > > >> logic > > > > > > >> > >> into > > > > > > >> > >> >> > > > > "GlobalConfiguration.loadConfig()" method. > > > > > > >> > >> >> > > > > Maybe the client could also benefit from this. > > > > > Different > > > > > > >> > users > > > > > > >> > >> do > > > > > > >> > >> >> not > > > > > > >> > >> >> > > > have > > > > > > >> > >> >> > > > > to export FLINK_CONF_DIR to update few config > > > > options. 
> > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > Best, > > > > > > >> > >> >> > > > > Yang > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > Stephan Ewen <[hidden email]> 于2019年8月28日周三 > > > > > 上午1:21写道: > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > One note on the Environment Variables and > > > > > > Configuration > > > > > > >> > >> >> discussion. > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > My understanding is that passed ENV > variables > > > are > > > > > > added > > > > > > >> to > > > > > > >> > >> the > > > > > > >> > >> >> > > > > > configuration in the > > > > > > "GlobalConfiguration.loadConfig()" > > > > > > >> > >> method > > > > > > >> > >> >> (or > > > > > > >> > >> >> > > > > > similar). > > > > > > >> > >> >> > > > > > For all the code inside Flink, it looks like > > the > > > > > data > > > > > > >> was > > > > > > >> > in > > > > > > >> > >> the > > > > > > >> > >> >> > > config > > > > > > >> > >> >> > > > > to > > > > > > >> > >> >> > > > > > start with, just that the scripts that > compute > > > the > > > > > > >> > variables > > > > > > >> > >> can > > > > > > >> > >> >> > pass > > > > > > >> > >> >> > > > the > > > > > > >> > >> >> > > > > > values to the process without actually > needing > > > to > > > > > > write > > > > > > >> a > > > > > > >> > >> file. > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > For example the > > > "GlobalConfiguration.loadConfig()" > > > > > > >> method > > > > > > >> > >> would > > > > > > >> > >> >> > take > > > > > > >> > >> >> > > > any > > > > > > >> > >> >> > > > > > ENV variable prefixed with "flink" and add > it > > > as a > > > > > > >> config > > > > > > >> > >> key. > > > > > > >> > >> >> > > > > > "flink_taskmanager_memory_size=2g" would > > become > > > > > > >> > >> >> > > > "taskmanager.memory.size: > > > > > > >> > >> >> > > > > > 2g". > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong > Song < > > > > > > >> > >> >> > [hidden email]> > > > > > > >> > >> >> > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > Thanks for the comments, Till. > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > I've also seen your comments on the wiki > > page, > > > > but > > > > > > >> let's > > > > > > >> > >> keep > > > > > > >> > >> >> the > > > > > > >> > >> >> > > > > > > discussion here. > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > - Regarding 'TaskExecutorSpecifics', how > do > > > you > > > > > > think > > > > > > >> > about > > > > > > >> > >> >> > naming > > > > > > >> > >> >> > > it > > > > > > >> > >> >> > > > > > > 'TaskExecutorResourceSpecifics'. > > > > > > >> > >> >> > > > > > > - Regarding passing memory configurations > > into > > > > > task > > > > > > >> > >> executors, > > > > > > >> > >> >> > I'm > > > > > > >> > >> >> > > in > > > > > > >> > >> >> > > > > > favor > > > > > > >> > >> >> > > > > > > of do it via environment variables rather > > than > > > > > > >> > >> configurations, > > > > > > >> > >> >> > with > > > > > > >> > >> >> > > > the > > > > > > >> > >> >> > > > > > > following two reasons. 
> > > > > > >> > >> >> > > > > > > - It is easier to keep the memory > options > > > once > > > > > > >> > calculate > > > > > > >> > >> >> not to > > > > > > >> > >> >> > > be > > > > > > >> > >> >> > > > > > > changed with environment variables rather > > than > > > > > > >> > >> configurations. > > > > > > >> > >> >> > > > > > > - I'm not sure whether we should write > the > > > > > > >> > configuration > > > > > > >> > >> in > > > > > > >> > >> >> > > startup > > > > > > >> > >> >> > > > > > > scripts. Writing changes into the > > > configuration > > > > > > files > > > > > > >> > when > > > > > > >> > >> >> > running > > > > > > >> > >> >> > > > the > > > > > > >> > >> >> > > > > > > startup scripts does not sounds right to > me. > > > Or > > > > we > > > > > > >> could > > > > > > >> > >> make > > > > > > >> > >> >> a > > > > > > >> > >> >> > > copy > > > > > > >> > >> >> > > > of > > > > > > >> > >> >> > > > > > > configuration files per flink cluster, and > > > make > > > > > the > > > > > > >> task > > > > > > >> > >> >> executor > > > > > > >> > >> >> > > to > > > > > > >> > >> >> > > > > load > > > > > > >> > >> >> > > > > > > from the copy, and clean up the copy after > > the > > > > > > >> cluster is > > > > > > >> > >> >> > shutdown, > > > > > > >> > >> >> > > > > which > > > > > > >> > >> >> > > > > > > is complicated. (I think this is also what > > > > Stephan > > > > > > >> means > > > > > > >> > in > > > > > > >> > >> >> his > > > > > > >> > >> >> > > > comment > > > > > > >> > >> >> > > > > > on > > > > > > >> > >> >> > > > > > > the wiki page?) > > > > > > >> > >> >> > > > > > > - Regarding reserving memory, I think this > > > > change > > > > > > >> should > > > > > > >> > be > > > > > > >> > >> >> > > included > > > > > > >> > >> >> > > > in > > > > > > >> > >> >> > > > > > > this FLIP. I think a big part of > motivations > > > of > > > > > this > > > > > > >> FLIP > > > > > > >> > >> is > > > > > > >> > >> >> to > > > > > > >> > >> >> > > unify > > > > > > >> > >> >> > > > > > > memory configuration for streaming / batch > > and > > > > > make > > > > > > it > > > > > > >> > easy > > > > > > >> > >> >> for > > > > > > >> > >> >> > > > > > configuring > > > > > > >> > >> >> > > > > > > rocksdb memory. If we don't support memory > > > > > > >> reservation, > > > > > > >> > >> then > > > > > > >> > >> >> > > > streaming > > > > > > >> > >> >> > > > > > jobs > > > > > > >> > >> >> > > > > > > cannot use managed memory (neither on-heap > > or > > > > > > >> off-heap), > > > > > > >> > >> which > > > > > > >> > >> >> > > makes > > > > > > >> > >> >> > > > > this > > > > > > >> > >> >> > > > > > > FLIP incomplete. > > > > > > >> > >> >> > > > > > > - Regarding network memory, I think you > are > > > > > right. I > > > > > > >> > think > > > > > > >> > >> we > > > > > > >> > >> >> > > > probably > > > > > > >> > >> >> > > > > > > don't need to change network stack from > > using > > > > > direct > > > > > > >> > >> memory to > > > > > > >> > >> >> > > using > > > > > > >> > >> >> > > > > > unsafe > > > > > > >> > >> >> > > > > > > native memory. Network memory size is > > > > > deterministic, > > > > > > >> > >> cannot be > > > > > > >> > >> >> > > > reserved > > > > > > >> > >> >> > > > > > as > > > > > > >> > >> >> > > > > > > managed memory does, and cannot be > > overused. 
I > > > > > think > > > > > > >> it > > > > > > >> > >> also > > > > > > >> > >> >> > works > > > > > > >> > >> >> > > if > > > > > > >> > >> >> > > > > we > > > > > > >> > >> >> > > > > > > simply keep using direct memory for > network > > > and > > > > > > >> include > > > > > > >> > it > > > > > > >> > >> in > > > > > > >> > >> >> jvm > > > > > > >> > >> >> > > max > > > > > > >> > >> >> > > > > > > direct memory size. > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > Thank you~ > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > Xintong Song > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till > > Rohrmann > > > < > > > > > > >> > >> >> > > [hidden email]> > > > > > > >> > >> >> > > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > Hi Xintong, > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > thanks for addressing the comments and > > > adding > > > > a > > > > > > more > > > > > > >> > >> >> detailed > > > > > > >> > >> >> > > > > > > > implementation plan. I have a couple of > > > > comments > > > > > > >> > >> concerning > > > > > > >> > >> >> the > > > > > > >> > >> >> > > > > > > > implementation plan: > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > - The name `TaskExecutorSpecifics` is > not > > > > really > > > > > > >> > >> >> descriptive. > > > > > > >> > >> >> > > > > Choosing > > > > > > >> > >> >> > > > > > a > > > > > > >> > >> >> > > > > > > > different name could help here. > > > > > > >> > >> >> > > > > > > > - I'm not sure whether I would pass the > > > memory > > > > > > >> > >> >> configuration to > > > > > > >> > >> >> > > the > > > > > > >> > >> >> > > > > > > > TaskExecutor via environment variables. > I > > > > think > > > > > it > > > > > > >> > would > > > > > > >> > >> be > > > > > > >> > >> >> > > better > > > > > > >> > >> >> > > > to > > > > > > >> > >> >> > > > > > > write > > > > > > >> > >> >> > > > > > > > it into the configuration one uses to > > start > > > > the > > > > > TM > > > > > > >> > >> process. > > > > > > >> > >> >> > > > > > > > - If possible, I would exclude the > memory > > > > > > >> reservation > > > > > > >> > >> from > > > > > > >> > >> >> this > > > > > > >> > >> >> > > > FLIP > > > > > > >> > >> >> > > > > > and > > > > > > >> > >> >> > > > > > > > add this as part of a dedicated FLIP. > > > > > > >> > >> >> > > > > > > > - If possible, then I would exclude > > changes > > > to > > > > > the > > > > > > >> > >> network > > > > > > >> > >> >> > stack > > > > > > >> > >> >> > > > from > > > > > > >> > >> >> > > > > > > this > > > > > > >> > >> >> > > > > > > > FLIP. Maybe we can simply say that the > > > direct > > > > > > memory > > > > > > >> > >> needed > > > > > > >> > >> >> by > > > > > > >> > >> >> > > the > > > > > > >> > >> >> > > > > > > network > > > > > > >> > >> >> > > > > > > > stack is the framework direct memory > > > > > requirement. > > > > > > >> > >> Changing > > > > > > >> > >> >> how > > > > > > >> > >> >> > > the > > > > > > >> > >> >> > > > > > memory > > > > > > >> > >> >> > > > > > > > is allocated can happen in a second > step. > > > This > > > > > > would > > > > > > >> > keep > > > > > > >> > >> >> the > > > > > > >> > >> >> > > scope > > > > > > >> > >> >> > > > > of > > > > > > >> > >> >> > > > > > > this > > > > > > >> > >> >> > > > > > > > FLIP smaller. 
> > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > Cheers, > > > > > > >> > >> >> > > > > > > > Till > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong > > > Song < > > > > > > >> > >> >> > > > [hidden email]> > > > > > > >> > >> >> > > > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > Hi everyone, > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > I just updated the FLIP document on > wiki > > > > [1], > > > > > > with > > > > > > >> > the > > > > > > >> > >> >> > > following > > > > > > >> > >> >> > > > > > > changes. > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > - Removed open question regarding > > > > > > MemorySegment > > > > > > >> > >> >> > allocation. > > > > > > >> > >> >> > > As > > > > > > >> > >> >> > > > > > > > > discussed, we exclude this topic > from > > > the > > > > > > >> scope of > > > > > > >> > >> this > > > > > > >> > >> >> > > FLIP. > > > > > > >> > >> >> > > > > > > > > - Updated content about JVM direct > > > memory > > > > > > >> > parameter > > > > > > >> > >> >> > > according > > > > > > >> > >> >> > > > to > > > > > > >> > >> >> > > > > > > > recent > > > > > > >> > >> >> > > > > > > > > discussions, and moved the other > > > options > > > > to > > > > > > >> > >> "Rejected > > > > > > >> > >> >> > > > > > Alternatives" > > > > > > >> > >> >> > > > > > > > for > > > > > > >> > >> >> > > > > > > > > the > > > > > > >> > >> >> > > > > > > > > moment. > > > > > > >> > >> >> > > > > > > > > - Added implementation steps. > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > Thank you~ > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > Xintong Song > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > [1] > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM > Stephan > > > > Ewen < > > > > > > >> > >> >> > [hidden email] > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > @Xintong: Concerning "wait for > memory > > > > users > > > > > > >> before > > > > > > >> > >> task > > > > > > >> > >> >> > > dispose > > > > > > >> > >> >> > > > > and > > > > > > >> > >> >> > > > > > > > > memory > > > > > > >> > >> >> > > > > > > > > > release": I agree, that's how it > > should > > > > be. > > > > > > >> Let's > > > > > > >> > >> try it > > > > > > >> > >> >> > out. 
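[Editor's note] A tiny sketch of the "wait for memory users before task dispose and memory release" behavior agreed on here (hypothetical helper, not actual Task/MemoryManager code): the segments are only handed back after the task's asynchronous spill/sort threads have been joined, so no thread can touch memory that has already been released.

    import java.util.List;

    public class SafeDispose {

        public static void disposeTask(List<Thread> spillAndSortThreads, Runnable releaseSegments)
                throws InterruptedException {
            // 1) Stop and wait for all helper threads; they may still read/write the segments.
            for (Thread t : spillAndSortThreads) {
                t.interrupt();
            }
            for (Thread t : spillAndSortThreads) {
                t.join();
            }
            // 2) Only now return the segments to the memory manager / free the memory.
            releaseSegments.run();
        }
    }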
> > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > @Xintong @Jingsong: Concerning " JVM > > > does > > > > > not > > > > > > >> wait > > > > > > >> > >> for > > > > > > >> > >> >> GC > > > > > > >> > >> >> > > when > > > > > > >> > >> >> > > > > > > > allocating > > > > > > >> > >> >> > > > > > > > > > direct memory buffer": There seems > to > > be > > > > > > pretty > > > > > > >> > >> >> elaborate > > > > > > >> > >> >> > > logic > > > > > > >> > >> >> > > > > to > > > > > > >> > >> >> > > > > > > free > > > > > > >> > >> >> > > > > > > > > > buffers when allocating new ones. > See > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > @Till: Maybe. If we assume that the > > JVM > > > > > > default > > > > > > >> > works > > > > > > >> > >> >> (like > > > > > > >> > >> >> > > > going > > > > > > >> > >> >> > > > > > > with > > > > > > >> > >> >> > > > > > > > > > option 2 and not setting > > > > > > >> "-XX:MaxDirectMemorySize" > > > > > > >> > at > > > > > > >> > >> >> all), > > > > > > >> > >> >> > > > then > > > > > > >> > >> >> > > > > I > > > > > > >> > >> >> > > > > > > > think > > > > > > >> > >> >> > > > > > > > > it > > > > > > >> > >> >> > > > > > > > > > should be okay to set > > > > > > "-XX:MaxDirectMemorySize" > > > > > > >> to > > > > > > >> > >> >> > > > > > > > > > "off_heap_managed_memory + > > > direct_memory" > > > > > even > > > > > > >> if > > > > > > >> > we > > > > > > >> > >> use > > > > > > >> > >> >> > > > RocksDB. > > > > > > >> > >> >> > > > > > > That > > > > > > >> > >> >> > > > > > > > > is a > > > > > > >> > >> >> > > > > > > > > > big if, though, I honestly have no > > idea > > > :D > > > > > > >> Would be > > > > > > >> > >> >> good to > > > > > > >> > >> >> > > > > > > understand > > > > > > >> > >> >> > > > > > > > > > this, though, because this would > > affect > > > > > option > > > > > > >> (2) > > > > > > >> > >> and > > > > > > >> > >> >> > option > > > > > > >> > >> >> > > > > > (1.2). > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM > > Xintong > > > > > Song < > > > > > > >> > >> >> > > > > > [hidden email]> > > > > > > >> > >> >> > > > > > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Thanks for the inputs, Jingsong. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Let me try to summarize your > points. > > > > > Please > > > > > > >> > correct > > > > > > >> > >> >> me if > > > > > > >> > >> >> > > I'm > > > > > > >> > >> >> > > > > > > wrong. 
> > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > - Memory consumers should > always > > > > avoid > > > > > > >> > returning > > > > > > >> > >> >> > memory > > > > > > >> > >> >> > > > > > segments > > > > > > >> > >> >> > > > > > > > to > > > > > > >> > >> >> > > > > > > > > > > memory manager while there are > > > still > > > > > > >> > un-cleaned > > > > > > >> > >> >> > > > structures / > > > > > > >> > >> >> > > > > > > > threads > > > > > > >> > >> >> > > > > > > > > > > that > > > > > > >> > >> >> > > > > > > > > > > may use the memory. Otherwise, > it > > > > would > > > > > > >> cause > > > > > > >> > >> >> serious > > > > > > >> > >> >> > > > > problems > > > > > > >> > >> >> > > > > > > by > > > > > > >> > >> >> > > > > > > > > > having > > > > > > >> > >> >> > > > > > > > > > > multiple consumers trying to > use > > > the > > > > > same > > > > > > >> > memory > > > > > > >> > >> >> > > segment. > > > > > > >> > >> >> > > > > > > > > > > - JVM does not wait for GC when > > > > > > allocating > > > > > > >> > >> direct > > > > > > >> > >> >> > memory > > > > > > >> > >> >> > > > > > buffer. > > > > > > >> > >> >> > > > > > > > > > > Therefore even we set proper > max > > > > direct > > > > > > >> memory > > > > > > >> > >> size > > > > > > >> > >> >> > > limit, > > > > > > >> > >> >> > > > > we > > > > > > >> > >> >> > > > > > > may > > > > > > >> > >> >> > > > > > > > > > still > > > > > > >> > >> >> > > > > > > > > > > encounter direct memory oom if > > the > > > GC > > > > > > >> cleaning > > > > > > >> > >> >> memory > > > > > > >> > >> >> > > > slower > > > > > > >> > >> >> > > > > > > than > > > > > > >> > >> >> > > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > direct memory allocation. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Am I understanding this correctly? > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Thank you~ > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > Xintong Song > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM > > > > > JingsongLee > > > > > > < > > > > > > >> > >> >> > > > > > > [hidden email] > > > > > > >> > >> >> > > > > > > > > > > .invalid> > > > > > > >> > >> >> > > > > > > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > Hi stephan: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > About option 2: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > if additional threads not > cleanly > > > shut > > > > > > down > > > > > > >> > >> before > > > > > > >> > >> >> we > > > > > > >> > >> >> > can > > > > > > >> > >> >> > > > > exit > > > > > > >> > >> >> > > > > > > the > > > > > > >> > >> >> > > > > > > > > > task: > > > > > > >> > >> >> > > > > > > > > > > > In the current case of memory > > reuse, > > > > it > > > > > > has > > > > > > >> > >> freed up > > > > > > >> > >> >> > the > > > > > > >> > >> >> > > > > memory > > > > > > >> > >> >> > > > > > > it > > > > > > >> > >> >> > > > > > > > > > > > uses. 
If this memory is used by > > > other > > > > > > tasks > > > > > > >> > and > > > > > > >> > >> >> > > > asynchronous > > > > > > >> > >> >> > > > > > > > threads > > > > > > >> > >> >> > > > > > > > > > > > of exited task may still be > > > writing, > > > > > > there > > > > > > >> > will > > > > > > >> > >> be > > > > > > >> > >> >> > > > > concurrent > > > > > > >> > >> >> > > > > > > > > security > > > > > > >> > >> >> > > > > > > > > > > > problems, and even lead to > errors > > > in > > > > > user > > > > > > >> > >> computing > > > > > > >> > >> >> > > > results. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > So I think this is a serious and > > > > > > intolerable > > > > > > >> > >> bug, No > > > > > > >> > >> >> > > matter > > > > > > >> > >> >> > > > > > what > > > > > > >> > >> >> > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > > option is, it should be > avoided. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > About direct memory cleaned by > GC: > > > > > > >> > >> >> > > > > > > > > > > > I don't think it is a good idea, > > > I've > > > > > > >> > >> encountered so > > > > > > >> > >> >> > many > > > > > > >> > >> >> > > > > > > > situations > > > > > > >> > >> >> > > > > > > > > > > > that it's too late for GC to > > cause > > > > > > >> > DirectMemory > > > > > > >> > >> >> OOM. > > > > > > >> > >> >> > > > Release > > > > > > >> > >> >> > > > > > and > > > > > > >> > >> >> > > > > > > > > > > > allocate DirectMemory depend on > > the > > > > > type > > > > > > of > > > > > > >> > user > > > > > > >> > >> >> job, > > > > > > >> > >> >> > > > which > > > > > > >> > >> >> > > > > is > > > > > > >> > >> >> > > > > > > > > > > > often beyond our control. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > Best, > > > > > > >> > >> >> > > > > > > > > > > > Jingsong Lee > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > >> > > ------------------------------------------------------------------ > > > > > > >> > >> >> > > > > > > > > > > > From:Stephan Ewen < > > [hidden email] > > > > > > > > > > >> > >> >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 > > > > > > >> > >> >> > > > > > > > > > > > To:dev <[hidden email]> > > > > > > >> > >> >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: > > > Unified > > > > > > >> Memory > > > > > > >> > >> >> > > Configuration > > > > > > >> > >> >> > > > > for > > > > > > >> > >> >> > > > > > > > > > > > TaskExecutors > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > My main concern with option 2 > > > > (manually > > > > > > >> release > > > > > > >> > >> >> memory) > > > > > > >> > >> >> > > is > > > > > > >> > >> >> > > > > that > > > > > > >> > >> >> > > > > > > > > > segfaults > > > > > > >> > >> >> > > > > > > > > > > > in the JVM send off all sorts of > > > > alarms > > > > > on > > > > > > >> user > > > > > > >> > >> >> ends. > > > > > > >> > >> >> > So > > > > > > >> > >> >> > > we > > > > > > >> > >> >> > > > > > need > > > > > > >> > >> >> > > > > > > to > > > > > > >> > >> >> > > > > > > > > > > > guarantee that this never > happens. 
> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > The trickyness is in tasks that > > uses > > > > > data > > > > > > >> > >> >> structures / > > > > > > >> > >> >> > > > > > algorithms > > > > > > >> > >> >> > > > > > > > > with > > > > > > >> > >> >> > > > > > > > > > > > additional threads, like hash > > table > > > > > > >> spill/read > > > > > > >> > >> and > > > > > > >> > >> >> > > sorting > > > > > > >> > >> >> > > > > > > threads. > > > > > > >> > >> >> > > > > > > > > We > > > > > > >> > >> >> > > > > > > > > > > need > > > > > > >> > >> >> > > > > > > > > > > > to ensure that these cleanly > shut > > > down > > > > > > >> before > > > > > > >> > we > > > > > > >> > >> can > > > > > > >> > >> >> > exit > > > > > > >> > >> >> > > > the > > > > > > >> > >> >> > > > > > > task. > > > > > > >> > >> >> > > > > > > > > > > > I am not sure that we have that > > > > > guaranteed > > > > > > >> > >> already, > > > > > > >> > >> >> > > that's > > > > > > >> > >> >> > > > > why > > > > > > >> > >> >> > > > > > > > option > > > > > > >> > >> >> > > > > > > > > > 1.1 > > > > > > >> > >> >> > > > > > > > > > > > seemed simpler to me. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM > > > > Xintong > > > > > > >> Song < > > > > > > >> > >> >> > > > > > > > [hidden email]> > > > > > > >> > >> >> > > > > > > > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > Thanks for the comments, > > Stephan. > > > > > > >> Summarized > > > > > > >> > in > > > > > > >> > >> >> this > > > > > > >> > >> >> > > way > > > > > > >> > >> >> > > > > > really > > > > > > >> > >> >> > > > > > > > > makes > > > > > > >> > >> >> > > > > > > > > > > > > things easier to understand. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > I'm in favor of option 2, at > > least > > > > for > > > > > > the > > > > > > >> > >> >> moment. I > > > > > > >> > >> >> > > > think > > > > > > >> > >> >> > > > > it > > > > > > >> > >> >> > > > > > > is > > > > > > >> > >> >> > > > > > > > > not > > > > > > >> > >> >> > > > > > > > > > > that > > > > > > >> > >> >> > > > > > > > > > > > > difficult to keep it segfault > > safe > > > > for > > > > > > >> memory > > > > > > >> > >> >> > manager, > > > > > > >> > >> >> > > as > > > > > > >> > >> >> > > > > > long > > > > > > >> > >> >> > > > > > > as > > > > > > >> > >> >> > > > > > > > > we > > > > > > >> > >> >> > > > > > > > > > > > always > > > > > > >> > >> >> > > > > > > > > > > > > de-allocate the memory segment > > > when > > > > it > > > > > > is > > > > > > >> > >> released > > > > > > >> > >> >> > from > > > > > > >> > >> >> > > > the > > > > > > >> > >> >> > > > > > > > memory > > > > > > >> > >> >> > > > > > > > > > > > > consumers. 
Only if the memory > > > > consumer > > > > > > >> > continue > > > > > > >> > >> >> using > > > > > > >> > >> >> > > the > > > > > > >> > >> >> > > > > > > buffer > > > > > > >> > >> >> > > > > > > > of > > > > > > >> > >> >> > > > > > > > > > > > memory > > > > > > >> > >> >> > > > > > > > > > > > > segment after releasing it, in > > > which > > > > > > case > > > > > > >> we > > > > > > >> > do > > > > > > >> > >> >> want > > > > > > >> > >> >> > > the > > > > > > >> > >> >> > > > > job > > > > > > >> > >> >> > > > > > to > > > > > > >> > >> >> > > > > > > > > fail > > > > > > >> > >> >> > > > > > > > > > so > > > > > > >> > >> >> > > > > > > > > > > > we > > > > > > >> > >> >> > > > > > > > > > > > > detect the memory leak early. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > For option 1.2, I don't think > > this > > > > is > > > > > a > > > > > > >> good > > > > > > >> > >> idea. > > > > > > >> > >> >> > Not > > > > > > >> > >> >> > > > only > > > > > > >> > >> >> > > > > > > > because > > > > > > >> > >> >> > > > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > > > assumption (regular GC is > enough > > > to > > > > > > clean > > > > > > >> > >> direct > > > > > > >> > >> >> > > buffers) > > > > > > >> > >> >> > > > > may > > > > > > >> > >> >> > > > > > > not > > > > > > >> > >> >> > > > > > > > > > > always > > > > > > >> > >> >> > > > > > > > > > > > be > > > > > > >> > >> >> > > > > > > > > > > > > true, but also it makes harder > > for > > > > > > finding > > > > > > >> > >> >> problems > > > > > > >> > >> >> > in > > > > > > >> > >> >> > > > > cases > > > > > > >> > >> >> > > > > > of > > > > > > >> > >> >> > > > > > > > > > memory > > > > > > >> > >> >> > > > > > > > > > > > > overuse. E.g., user configured > > > some > > > > > > direct > > > > > > >> > >> memory > > > > > > >> > >> >> for > > > > > > >> > >> >> > > the > > > > > > >> > >> >> > > > > > user > > > > > > >> > >> >> > > > > > > > > > > libraries. > > > > > > >> > >> >> > > > > > > > > > > > > If the library actually use > more > > > > > direct > > > > > > >> > memory > > > > > > >> > >> >> then > > > > > > >> > >> >> > > > > > configured, > > > > > > >> > >> >> > > > > > > > > which > > > > > > >> > >> >> > > > > > > > > > > > > cannot be cleaned by GC > because > > > they > > > > > are > > > > > > >> > still > > > > > > >> > >> in > > > > > > >> > >> >> > use, > > > > > > >> > >> >> > > > may > > > > > > >> > >> >> > > > > > lead > > > > > > >> > >> >> > > > > > > > to > > > > > > >> > >> >> > > > > > > > > > > > overuse > > > > > > >> > >> >> > > > > > > > > > > > > of the total container memory. 
> > In > > > > that > > > > > > >> case, > > > > > > >> > >> if it > > > > > > >> > >> >> > > didn't > > > > > > >> > >> >> > > > > > touch > > > > > > >> > >> >> > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > JVM > > > > > > >> > >> >> > > > > > > > > > > > > default max direct memory > limit, > > > we > > > > > > cannot > > > > > > >> > get > > > > > > >> > >> a > > > > > > >> > >> >> > direct > > > > > > >> > >> >> > > > > > memory > > > > > > >> > >> >> > > > > > > > OOM > > > > > > >> > >> >> > > > > > > > > > and > > > > > > >> > >> >> > > > > > > > > > > it > > > > > > >> > >> >> > > > > > > > > > > > > will become super hard to > > > understand > > > > > > which > > > > > > >> > >> part of > > > > > > >> > >> >> > the > > > > > > >> > >> >> > > > > > > > > configuration > > > > > > >> > >> >> > > > > > > > > > > need > > > > > > >> > >> >> > > > > > > > > > > > > to be updated. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > For option 1.1, it has the > > similar > > > > > > >> problem as > > > > > > >> > >> >> 1.2, if > > > > > > >> > >> >> > > the > > > > > > >> > >> >> > > > > > > > exceeded > > > > > > >> > >> >> > > > > > > > > > > direct > > > > > > >> > >> >> > > > > > > > > > > > > memory does not reach the max > > > direct > > > > > > >> memory > > > > > > >> > >> limit > > > > > > >> > >> >> > > > specified > > > > > > >> > >> >> > > > > > by > > > > > > >> > >> >> > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > > > dedicated parameter. I think > it > > is > > > > > > >> slightly > > > > > > >> > >> better > > > > > > >> > >> >> > than > > > > > > >> > >> >> > > > > 1.2, > > > > > > >> > >> >> > > > > > > only > > > > > > >> > >> >> > > > > > > > > > > because > > > > > > >> > >> >> > > > > > > > > > > > > we can tune the parameter. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > Thank you~ > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > Xintong Song > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > On Mon, Aug 19, 2019 at 2:53 > PM > > > > > Stephan > > > > > > >> Ewen > > > > > > >> > < > > > > > > >> > >> >> > > > > > [hidden email] > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > About the > > > > "-XX:MaxDirectMemorySize" > > > > > > >> > >> discussion, > > > > > > >> > >> >> > maybe > > > > > > >> > >> >> > > > let > > > > > > >> > >> >> > > > > > me > > > > > > >> > >> >> > > > > > > > > > > summarize > > > > > > >> > >> >> > > > > > > > > > > > > it a > > > > > > >> > >> >> > > > > > > > > > > > > > bit differently: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > We have the following two > > > options: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > (1) We let MemorySegments be > > > > > > >> de-allocated > > > > > > >> > by > > > > > > >> > >> the > > > > > > >> > >> >> > GC. > > > > > > >> > >> >> > > > That > > > > > > >> > >> >> > > > > > > makes > > > > > > >> > >> >> > > > > > > > > it > > > > > > >> > >> >> > > > > > > > > > > > > segfault > > > > > > >> > >> >> > > > > > > > > > > > > > safe. 
But then we need a way > > to > > > > > > trigger > > > > > > >> GC > > > > > > >> > in > > > > > > >> > >> >> case > > > > > > >> > >> >> > > > > > > > de-allocation > > > > > > >> > >> >> > > > > > > > > > and > > > > > > >> > >> >> > > > > > > > > > > > > > re-allocation of a bunch of > > > > segments > > > > > > >> > happens > > > > > > >> > >> >> > quickly, > > > > > > >> > >> >> > > > > which > > > > > > >> > >> >> > > > > > > is > > > > > > >> > >> >> > > > > > > > > > often > > > > > > >> > >> >> > > > > > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > > > > case during batch scheduling > > or > > > > task > > > > > > >> > restart. > > > > > > >> > >> >> > > > > > > > > > > > > > - The > > > "-XX:MaxDirectMemorySize" > > > > > > >> (option > > > > > > >> > >> 1.1) > > > > > > >> > >> >> is > > > > > > >> > >> >> > one > > > > > > >> > >> >> > > > way > > > > > > >> > >> >> > > > > > to > > > > > > >> > >> >> > > > > > > do > > > > > > >> > >> >> > > > > > > > > > this > > > > > > >> > >> >> > > > > > > > > > > > > > - Another way could be to > > > have a > > > > > > >> > dedicated > > > > > > >> > >> >> > > > bookkeeping > > > > > > >> > >> >> > > > > in > > > > > > >> > >> >> > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > > > > MemoryManager (option 1.2), > so > > > > that > > > > > > this > > > > > > >> > is a > > > > > > >> > >> >> > number > > > > > > >> > >> >> > > > > > > > independent > > > > > > >> > >> >> > > > > > > > > of > > > > > > >> > >> >> > > > > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" > > > > parameter. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > (2) We manually allocate and > > > > > > de-allocate > > > > > > >> > the > > > > > > >> > >> >> memory > > > > > > >> > >> >> > > for > > > > > > >> > >> >> > > > > the > > > > > > >> > >> >> > > > > > > > > > > > > MemorySegments > > > > > > >> > >> >> > > > > > > > > > > > > > (option 2). That way we need > > not > > > > > worry > > > > > > >> > about > > > > > > >> > >> >> > > triggering > > > > > > >> > >> >> > > > > GC > > > > > > >> > >> >> > > > > > by > > > > > > >> > >> >> > > > > > > > > some > > > > > > >> > >> >> > > > > > > > > > > > > > threshold or bookkeeping, > but > > it > > > > is > > > > > > >> harder > > > > > > >> > to > > > > > > >> > >> >> > prevent > > > > > > >> > >> >> > > > > > > > segfaults. > > > > > > >> > >> >> > > > > > > > > We > > > > > > >> > >> >> > > > > > > > > > > > need > > > > > > >> > >> >> > > > > > > > > > > > > to > > > > > > >> > >> >> > > > > > > > > > > > > > be very careful about when > we > > > > > release > > > > > > >> the > > > > > > >> > >> memory > > > > > > >> > >> >> > > > segments > > > > > > >> > >> >> > > > > > > (only > > > > > > >> > >> >> > > > > > > > > in > > > > > > >> > >> >> > > > > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > > > > cleanup phase of the main > > > thread). 
> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > If we go with option 1.1, we > > > > > probably > > > > > > >> need > > > > > > >> > to > > > > > > >> > >> >> set > > > > > > >> > >> >> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" to > > > > > > >> > >> >> > > "off_heap_managed_memory + > > > > > > >> > >> >> > > > > > > > > > > direct_memory" > > > > > > >> > >> >> > > > > > > > > > > > > and > > > > > > >> > >> >> > > > > > > > > > > > > > have "direct_memory" as a > > > separate > > > > > > >> reserved > > > > > > >> > >> >> memory > > > > > > >> > >> >> > > > pool. > > > > > > >> > >> >> > > > > > > > Because > > > > > > >> > >> >> > > > > > > > > if > > > > > > >> > >> >> > > > > > > > > > > we > > > > > > >> > >> >> > > > > > > > > > > > > just > > > > > > >> > >> >> > > > > > > > > > > > > > set > "-XX:MaxDirectMemorySize" > > to > > > > > > >> > >> >> > > > > "off_heap_managed_memory + > > > > > > >> > >> >> > > > > > > > > > > > > jvm_overhead", > > > > > > >> > >> >> > > > > > > > > > > > > > then there will be times > when > > > that > > > > > > >> entire > > > > > > >> > >> >> memory is > > > > > > >> > >> >> > > > > > allocated > > > > > > >> > >> >> > > > > > > > by > > > > > > >> > >> >> > > > > > > > > > > direct > > > > > > >> > >> >> > > > > > > > > > > > > > buffers and we have nothing > > left > > > > for > > > > > > the > > > > > > >> > JVM > > > > > > >> > >> >> > > overhead. > > > > > > >> > >> >> > > > So > > > > > > >> > >> >> > > > > > we > > > > > > >> > >> >> > > > > > > > > either > > > > > > >> > >> >> > > > > > > > > > > > need > > > > > > >> > >> >> > > > > > > > > > > > > a > > > > > > >> > >> >> > > > > > > > > > > > > > way to compensate for that > > > (again > > > > > some > > > > > > >> > safety > > > > > > >> > >> >> > margin > > > > > > >> > >> >> > > > > cutoff > > > > > > >> > >> >> > > > > > > > > value) > > > > > > >> > >> >> > > > > > > > > > or > > > > > > >> > >> >> > > > > > > > > > > > we > > > > > > >> > >> >> > > > > > > > > > > > > > will exceed container > memory. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > If we go with option 1.2, we > > > need > > > > to > > > > > > be > > > > > > >> > aware > > > > > > >> > >> >> that > > > > > > >> > >> >> > it > > > > > > >> > >> >> > > > > takes > > > > > > >> > >> >> > > > > > > > > > elaborate > > > > > > >> > >> >> > > > > > > > > > > > > logic > > > > > > >> > >> >> > > > > > > > > > > > > > to push recycling of direct > > > > buffers > > > > > > >> without > > > > > > >> > >> >> always > > > > > > >> > >> >> > > > > > > triggering a > > > > > > >> > >> >> > > > > > > > > > full > > > > > > >> > >> >> > > > > > > > > > > > GC. 
> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > My first guess is that the > > > options > > > > > > will > > > > > > >> be > > > > > > >> > >> >> easiest > > > > > > >> > >> >> > to > > > > > > >> > >> >> > > > do > > > > > > >> > >> >> > > > > in > > > > > > >> > >> >> > > > > > > the > > > > > > >> > >> >> > > > > > > > > > > > following > > > > > > >> > >> >> > > > > > > > > > > > > > order: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > - Option 1.1 with a > > dedicated > > > > > > >> > direct_memory > > > > > > >> > >> >> > > > parameter, > > > > > > >> > >> >> > > > > as > > > > > > >> > >> >> > > > > > > > > > discussed > > > > > > >> > >> >> > > > > > > > > > > > > > above. We would need to > find a > > > way > > > > > to > > > > > > >> set > > > > > > >> > the > > > > > > >> > >> >> > > > > direct_memory > > > > > > >> > >> >> > > > > > > > > > parameter > > > > > > >> > >> >> > > > > > > > > > > > by > > > > > > >> > >> >> > > > > > > > > > > > > > default. We could start with > > 64 > > > MB > > > > > and > > > > > > >> see > > > > > > >> > >> how > > > > > > >> > >> >> it > > > > > > >> > >> >> > > goes > > > > > > >> > >> >> > > > in > > > > > > >> > >> >> > > > > > > > > practice. > > > > > > >> > >> >> > > > > > > > > > > One > > > > > > >> > >> >> > > > > > > > > > > > > > danger I see is that setting > > > this > > > > > loo > > > > > > >> low > > > > > > >> > can > > > > > > >> > >> >> > cause a > > > > > > >> > >> >> > > > > bunch > > > > > > >> > >> >> > > > > > > of > > > > > > >> > >> >> > > > > > > > > > > > additional > > > > > > >> > >> >> > > > > > > > > > > > > > GCs compared to before (we > > need > > > to > > > > > > watch > > > > > > >> > this > > > > > > >> > >> >> > > > carefully). > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > - Option 2. It is actually > > > quite > > > > > > >> simple > > > > > > >> > to > > > > > > >> > >> >> > > implement, > > > > > > >> > >> >> > > > > we > > > > > > >> > >> >> > > > > > > > could > > > > > > >> > >> >> > > > > > > > > > try > > > > > > >> > >> >> > > > > > > > > > > > how > > > > > > >> > >> >> > > > > > > > > > > > > > segfault safe we are at the > > > > moment. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > - Option 1.2: We would not > > > touch > > > > > the > > > > > > >> > >> >> > > > > > > > "-XX:MaxDirectMemorySize" > > > > > > >> > >> >> > > > > > > > > > > > > parameter > > > > > > >> > >> >> > > > > > > > > > > > > > at all and assume that all > the > > > > > direct > > > > > > >> > memory > > > > > > >> > >> >> > > > allocations > > > > > > >> > >> >> > > > > > that > > > > > > >> > >> >> > > > > > > > the > > > > > > >> > >> >> > > > > > > > > > JVM > > > > > > >> > >> >> > > > > > > > > > > > and > > > > > > >> > >> >> > > > > > > > > > > > > > Netty do are infrequent > enough > > > to > > > > be > > > > > > >> > cleaned > > > > > > >> > >> up > > > > > > >> > >> >> > fast > > > > > > >> > >> >> > > > > enough > > > > > > >> > >> >> > > > > > > > > through > > > > > > >> > >> >> > > > > > > > > > > > > regular > > > > > > >> > >> >> > > > > > > > > > > > > > GC. I am not sure if that > is a > > > > valid > > > > > > >> > >> assumption, > > > > > > >> > >> >> > > > though. 
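[Editor's note] For the "dedicated bookkeeping in the MemoryManager" variant (option 1.2) described above, the idea is similar to what java.nio.Bits does for direct buffers: track reserved bytes against a budget that is independent of -XX:MaxDirectMemorySize and, when a new reservation does not fit, try to provoke cleanup of unreachable buffers before giving up. A rough sketch of that idea (illustration only, not Flink code; a production version would avoid a blunt System.gc() call, which is the "elaborate logic" caveat above):

    import java.util.concurrent.atomic.AtomicLong;

    public class DirectMemoryBudget {

        private final long capacityBytes;
        private final AtomicLong reserved = new AtomicLong();

        public DirectMemoryBudget(long capacityBytes) {
            this.capacityBytes = capacityBytes;
        }

        public void reserve(long bytes) {
            if (tryReserve(bytes)) {
                return;
            }
            // Budget exhausted: give the GC a chance to collect unreachable buffers whose
            // cleaners call unreserve(), then retry once before failing.
            System.gc();
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (!tryReserve(bytes)) {
                throw new OutOfMemoryError("managed direct memory budget exhausted");
            }
        }

        public void unreserve(long bytes) {
            reserved.addAndGet(-bytes);
        }

        private boolean tryReserve(long bytes) {
            long current;
            do {
                current = reserved.get();
                if (current + bytes > capacityBytes) {
                    return false;
                }
            } while (!reserved.compareAndSet(current, current + bytes));
            return true;
        }
    }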
> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > Best, > > > > > > >> > >> >> > > > > > > > > > > > > > Stephan > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 > > PM > > > > > > Xintong > > > > > > >> > Song > > > > > > >> > >> < > > > > > > >> > >> >> > > > > > > > > > [hidden email]> > > > > > > >> > >> >> > > > > > > > > > > > > > wrote: > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > Thanks for sharing your > > > opinion > > > > > > Till. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > I'm also in favor of > > > alternative > > > > > 2. > > > > > > I > > > > > > >> was > > > > > > >> > >> >> > wondering > > > > > > >> > >> >> > > > > > whether > > > > > > >> > >> >> > > > > > > > we > > > > > > >> > >> >> > > > > > > > > > can > > > > > > >> > >> >> > > > > > > > > > > > > avoid > > > > > > >> > >> >> > > > > > > > > > > > > > > using Unsafe.allocate() > for > > > > > off-heap > > > > > > >> > >> managed > > > > > > >> > >> >> > memory > > > > > > >> > >> >> > > > and > > > > > > >> > >> >> > > > > > > > network > > > > > > >> > >> >> > > > > > > > > > > > memory > > > > > > >> > >> >> > > > > > > > > > > > > > with > > > > > > >> > >> >> > > > > > > > > > > > > > > alternative 3. But after > > > giving > > > > > it a > > > > > > >> > second > > > > > > >> > >> >> > > thought, > > > > > > >> > >> >> > > > I > > > > > > >> > >> >> > > > > > > think > > > > > > >> > >> >> > > > > > > > > even > > > > > > >> > >> >> > > > > > > > > > > for > > > > > > >> > >> >> > > > > > > > > > > > > > > alternative 3 using direct > > > > memory > > > > > > for > > > > > > >> > >> off-heap > > > > > > >> > >> >> > > > managed > > > > > > >> > >> >> > > > > > > memory > > > > > > >> > >> >> > > > > > > > > > could > > > > > > >> > >> >> > > > > > > > > > > > > cause > > > > > > >> > >> >> > > > > > > > > > > > > > > problems. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > Hi Yang, > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > Regarding your concern, I > > > think > > > > > what > > > > > > >> > >> proposed > > > > > > >> > >> >> in > > > > > > >> > >> >> > > this > > > > > > >> > >> >> > > > > > FLIP > > > > > > >> > >> >> > > > > > > it > > > > > > >> > >> >> > > > > > > > > to > > > > > > >> > >> >> > > > > > > > > > > have > > > > > > >> > >> >> > > > > > > > > > > > > > both > > > > > > >> > >> >> > > > > > > > > > > > > > > off-heap managed memory > and > > > > > network > > > > > > >> > memory > > > > > > >> > >> >> > > allocated > > > > > > >> > >> >> > > > > > > through > > > > > > >> > >> >> > > > > > > > > > > > > > > Unsafe.allocate(), which > > means > > > > > they > > > > > > >> are > > > > > > >> > >> >> > practically > > > > > > >> > >> >> > > > > > native > > > > > > >> > >> >> > > > > > > > > memory > > > > > > >> > >> >> > > > > > > > > > > and > > > > > > >> > >> >> > > > > > > > > > > > > not > > > > > > >> > >> >> > > > > > > > > > > > > > > limited by JVM max direct > > > > memory. 
The only parts of memory limited by JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song
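(For readers less familiar with the two allocation paths being discussed, a small, non-Flink sketch of the difference: ByteBuffer.allocateDirect is counted against -XX:MaxDirectMemorySize and is only released when the buffer object is garbage collected, while Unsafe.allocateMemory is plain native memory outside that limit and is freed explicitly.)

import java.lang.reflect.Field;
import java.nio.ByteBuffer;

public class AllocationPaths {

    public static void main(String[] args) throws Exception {
        // Counted against -XX:MaxDirectMemorySize; released only when the
        // buffer object is garbage collected.
        ByteBuffer directBuffer = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        directBuffer.put(0, (byte) 1);

        // Plain native memory: not counted against -XX:MaxDirectMemorySize,
        // freed explicitly by the caller (no GC involved).
        Field field = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
        field.setAccessible(true);
        sun.misc.Unsafe unsafe = (sun.misc.Unsafe) field.get(null);

        long address = unsafe.allocateMemory(64 * 1024 * 1024);
        try {
            unsafe.putLong(address, 42L);
        } finally {
            unsafe.freeMemory(address);
        }
    }
}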
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: The user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then

- Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead).
Let's say that the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configurations.
Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
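(A rough sketch of the JVM settings the two alternatives would produce for the 1GB example above. The numbers are taken from the example; the class and method names are made up for illustration.)

public class AlternativesExample {

    public static void main(String[] args) {
        long totalProcessMemoryMb = 1024;          // 1GB total process memory
        long taskOffHeapPlusJvmOverheadMb = 200;   // budgeted direct memory
        long otherMemoryMb = totalProcessMemoryMb - taskOffHeapPlusJvmOverheadMb; // 800MB

        // Alternative 2: cap direct memory exactly at the budgeted 200MB.
        String alternative2 = "-XX:MaxDirectMemorySize=" + taskOffHeapPlusJvmOverheadMb + "m";

        // Alternative 3: set an effectively unlimited cap, e.g. 1TB.
        String alternative3 = "-XX:MaxDirectMemorySize=1024g";

        System.out.println(alternative2);  // -XX:MaxDirectMemorySize=200m
        System.out.println(alternative3);  // -XX:MaxDirectMemorySize=1024g
        System.out.println("Left for heap / metaspace / managed / network: " + otherMemoryMb + "m");
    }
}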
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under utilization Xintong.

- Alternative 2: set XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little, then I would expect that alternative 3 results in memory under utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till

> Native and Direct Memory

My point is setting a very large max direct memory size when we do not differentiate direct and native memory.
If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory with a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and failing in the Flink master.

Best,

Yang

On Tue, Aug 13, 2019 at 10:07 PM Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP.
This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use it.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of the risk of overusing memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM.
For alternative 3, users do not get direct OOM, so they may not configure the two options aggressively high. But the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good.
Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether this could be done as a follow up? Without knowing all details, I would be concerned that we would widen the scope of this FLIP too much because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators).
The addition of the memory reservation call to the MemoryManager should not be affected by this and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.

Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested why Yang Wang thinks leaving it open would be best. My concern about this would be that we would be in a similar situation as we are now with the RocksDBStateBackend.
If the different memory pools are not clearly separated and can spill over to a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a similar situation to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM.
@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3 where we set the direct memory to a higher value?

Cheers,
Till
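(On the memory reservation call Till mentions above: a purely hypothetical sketch of what such a reserve/release interface on the MemoryManager could look like. The names and signatures here are made up for illustration and are not the actual Flink API.)

public interface ReservableMemoryManager {

    /**
     * Reserves the given number of bytes of managed memory for an owner
     * (e.g. a RocksDB state backend), failing if the remaining budget is
     * too small.
     */
    void reserveMemory(Object owner, long bytes) throws MemoryReservationException;

    /** Releases a previous reservation, making the budget available again. */
    void releaseMemory(Object owner, long bytes);

    class MemoryReservationException extends Exception {
        public MemoryReservationException(String message) {
            super(message);
        }
    }
}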
On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory.
That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
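(A small, non-Flink illustration of that GC interplay. With a low -XX:MaxDirectMemorySize, hitting the limit forces the JDK to trigger a GC, and possibly fail with an OutOfMemoryError if that is not enough; with a very high limit that safety valve effectively never kicks in and the native footprint simply grows.)

import java.nio.ByteBuffer;

public class DirectMemoryGcIllustration {

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            // Each buffer becomes unreachable right after the iteration, but the
            // 16 MB of native memory behind it is only freed once a GC actually
            // runs. The loop produces almost no heap garbage, so without the
            // direct memory limit acting as a trigger, the JVM has little reason
            // to collect and the process footprint keeps growing.
            ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
            buffer.put(0, (byte) 1);
        }
    }
}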
Maybe you can share your reasons for preferring setting a very large value, if there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configurations that the user explicitly specified, I think we should throw an error. I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client side checking, because for standalone clusters, TaskManagers on different machines may have different configurations and the client does not see that.

What do you think?

Thank you~

Xintong Song
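(A hypothetical sketch of the kind of client-side sanity check being discussed: fail fast if the fine-grained pools cannot fit into the configured total process memory. The class name, method name and pool names are illustrative only, not the actual Flink configuration model.)

public class MemoryConfigCheck {

    static void checkTotalProcessMemory(
            long totalProcessMemoryMb,
            long frameworkHeapMb,
            long taskHeapMb,
            long taskOffHeapMb,
            long networkMb,
            long managedMb,
            long metaspaceMb,
            long jvmOverheadMb) {

        long sum = frameworkHeapMb + taskHeapMb + taskOffHeapMb
                + networkMb + managedMb + metaspaceMb + jvmOverheadMb;

        if (sum > totalProcessMemoryMb) {
            throw new IllegalArgumentException(
                    "Sum of configured memory pools (" + sum + " MB) exceeds the "
                            + "configured total process memory (" + totalProcessMemoryMb + " MB).");
        }
    }

    public static void main(String[] args) {
        // Pools summing to exactly 1024 MB fit into a 1 GB process; a larger sum would fail.
        checkTotalProcessMemory(1024, 64, 384, 64, 128, 256, 64, 64);
    }
}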
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory

We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation

If the sum of the fine-grained memory (network memory, managed memory, etc.)
is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

On Wed, Aug 7, 2019 at 10:14 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.
> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > This > FLIP > > > > > > addresses > > > > > > >> > >> several > > > > > > >> > >> >> > > > > > shortcomings > > > > > > >> > >> >> > > > > > > of > > > > > > >> > >> >> > > > > > > > > > > current > > > > > > >> > >> >> > > > > > > > > > > > > > > (Flink > > > > > > >> > >> >> > > > > > > > > > > > > > > > > 1.9) > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor > > > > > > memory > > > > > > >> > >> >> > > configuration. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > > Different > > > > > > >> > >> configuration > > > > > > >> > >> >> > for > > > > > > >> > >> >> > > > > > > Streaming > > > > > > >> > >> >> > > > > > > > > and > > > > > > >> > >> >> > > > > > > > > > > > Batch. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > Complex > > > > and > > > > > > >> > >> difficult > > > > > > >> > >> >> > > > > > configuration > > > > > > >> > >> >> > > > > > > of > > > > > > >> > >> >> > > > > > > > > > > RocksDB > > > > > > >> > >> >> > > > > > > > > > > > > in > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > Streaming. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > > > > Complicated, > > > > > > >> > >> uncertain > > > > > > >> > >> >> and > > > > > > >> > >> >> > > > hard > > > > > > >> > >> >> > > > > to > > > > > > >> > >> >> > > > > > > > > > > understand. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key > > changes > > > to > > > > > > solve > > > > > > >> > the > > > > > > >> > >> >> > problems > > > > > > >> > >> >> > > > can > > > > > > >> > >> >> > > > > > be > > > > > > >> > >> >> > > > > > > > > > > summarized > > > > > > >> > >> >> > > > > > > > > > > > > as > > > > > > >> > >> >> > > > > > > > > > > > > > > > > follows. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > Extend > > > > > memory > > > > > > >> > >> manager > > > > > > >> > >> >> to > > > > > > >> > >> >> > > also > > > > > > >> > >> >> > > > > > > account > > > > > > >> > >> >> > > > > > > > > for > > > > > > >> > >> >> > > > > > > > > > > > memory > > > > > > >> > >> >> > > > > > > > > > > > > > > usage > > > > > > >> > >> >> > > > > > > > > > > > > > > > > by > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > state > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > backends. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > Modify > > > > how > > > > > > >> > >> TaskExecutor > > > > > > >> > >> >> > > memory > > > > > > >> > >> >> > > > > is > > > > > > >> > >> >> > > > > > > > > > > partitioned > > > > > > >> > >> >> > > > > > > > > > > > > > > > accounted > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > individual > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > memory > > > > > > >> reservations > > > > > > >> > >> and > > > > > > >> > >> >> > pools. 
> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > - > > > Simplify > > > > > > memory > > > > > > >> > >> >> > > configuration > > > > > > >> > >> >> > > > > > > options > > > > > > >> > >> >> > > > > > > > > and > > > > > > >> > >> >> > > > > > > > > > > > > > > calculations > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > logics. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please > > find > > > > more > > > > > > >> > details > > > > > > >> > >> in > > > > > > >> > >> >> the > > > > > > >> > >> >> > > > FLIP > > > > > > >> > >> >> > > > > > wiki > > > > > > >> > >> >> > > > > > > > > > > document > > > > > > >> > >> >> > > > > > > > > > > > > [1]. > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please > > note > > > > > that > > > > > > >> the > > > > > > >> > >> early > > > > > > >> > >> >> > > design > > > > > > >> > >> >> > > > > doc > > > > > > >> > >> >> > > > > > > [2] > > > > > > >> > >> >> > > > > > > > is > > > > > > >> > >> >> > > > > > > > > > out > > > > > > >> > >> >> > > > > > > > > > > > of > > > > > > >> > >> >> > > > > > > > > > > > > > > sync, > > > > > > >> > >> >> > > > > > > > > > > > > > > > > and > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > it > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > is > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > appreciated > > > to > > > > > > have > > > > > > >> the > > > > > > >> > >> >> > > discussion > > > > > > >> > >> >> > > > in > > > > > > >> > >> >> > > > > > > this > > > > > > >> > >> >> > > > > > > > > > > mailing > > > > > > >> > >> >> > > > > > > > > > > > > list > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > thread.) > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking > > > > forward > > > > > to > > > > > > >> your > > > > > > >> > >> >> > > feedbacks. 
> > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank > you~ > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong > > Song > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > > >> > >> >> > > > > > > > > > > > >> > >> >> > > > > > > > > > > >> > >> >> > > > > > > > > > >> > >> >> > > > > > > > > >> > >> >> > > > > > > > >> > >> >> > > > > > > >> > >> > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > > > > > > >> > >> >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > 
On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <[hidden email]> wrote:

Thanks for sharing your opinion, Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which are exactly what alternative 2 suggests to set the JVM max direct memory to.

Thank you~

Xintong Song
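A minimal sketch of the distinction described above, assuming that what the thread calls Unsafe.allocate() corresponds to sun.misc.Unsafe#allocateMemory: a direct ByteBuffer is accounted against -XX:MaxDirectMemorySize, while memory obtained from Unsafe is plain native memory that the JVM does not track against that limit.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;

public class DirectVsNative {

    public static void main(String[] args) throws Exception {
        // Counted against -XX:MaxDirectMemorySize; freed only when the buffer is GC'ed.
        ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);

        // Not counted against -XX:MaxDirectMemorySize; must be freed explicitly.
        Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
        long address = unsafe.allocateMemory(64 * 1024 * 1024);
        try {
            unsafe.putLong(address, 42L); // use the native memory
        } finally {
            unsafe.freeMemory(address);   // manual lifecycle, no GC involvement
        }
        System.out.println(direct.capacity() + " bytes of direct memory allocated");
    }
}
```

This is why, under alternative 2, the -XX:MaxDirectMemorySize flag would only need to cover task off-heap memory and JVM overhead.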
On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <[hidden email]> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <[hidden email]> wrote:

Let me explain this with a concrete example, Till. Let's say we have the following scenario.
Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead potentially exceeds 200MB, then:

- Alternative 2 suffers from frequent OOM. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this reduces the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.

Thank you~

Xintong Song
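For readers following the numbers, the back-of-the-envelope arithmetic of the example can be written down as below; the way the flag is derived here is illustrative only and is not the FLIP's actual calculation logic.

```java
public class MaxDirectMemoryExample {

    public static void main(String[] args) {
        long mb = 1L << 20;

        long totalProcessMemory = 1000 * mb;         // ~1 GB, as in the example
        long taskOffHeapPlusJvmOverhead = 200 * mb;  // "JVM Direct Memory" in the example
        long otherMemory = totalProcessMemory - taskOffHeapPlusJvmOverhead; // 800 MB

        // Alternative 2: cap direct memory exactly at the budgeted amount.
        long alt2MaxDirect = taskOffHeapPlusJvmOverhead;

        // Alternative 3: effectively uncapped (a very large value, e.g. 1 TB).
        long alt3MaxDirect = 1L << 40;

        System.out.printf("-XX:MaxDirectMemorySize=%d (alternative 2)%n", alt2MaxDirect);
        System.out.printf("-XX:MaxDirectMemorySize=%d (alternative 3)%n", alt3MaxDirect);

        // If actual direct usage is, say, 210 MB: alternative 2 hits a direct OOM
        // unless the user raises the 200 MB budget, which shrinks the other pools;
        // alternative 3 keeps running as long as the container limit is not hit.
        long raisedDirect = 250 * mb;
        System.out.println("Other pools under alternative 2 after raising direct to 250 MB: "
            + (totalProcessMemory - raisedDirect) / mb + " MB (was " + otherMemory / mb + " MB)");
    }
}
```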
On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <[hidden email]> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 wrt memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <[hidden email]> wrote:

Hi Xintong, Till,

> Native and Direct Memory

My point is setting a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting direct memory to a fixed value.

> Memory Calculation

I agree with Xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

On Tue, Aug 13, 2019 at 10:07 PM Xintong Song <[hidden email]> wrote:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP.
This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimum involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking overuse of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOM, so they may not configure the two options aggressively high, but the consequence is the risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song

On Tue, Aug 13, 2019 at 9:39 AM Till Rohrmann <[hidden email]> wrote:

Thanks for proposing this FLIP Xintong.

All in all I think it already looks quite good.
Concerning the first open question about allocating memory segments, I was wondering whether this is strictly necessary to do in the context of this FLIP or whether it could be done as a follow-up. Without knowing all the details, I would be concerned that we would widen the scope of this FLIP too much, because we would have to touch all the existing call sites of the MemoryManager where we allocate memory segments (this should mainly be batch operators). The addition of the memory reservation call to the MemoryManager should not be affected by this, and I would hope that this is the only point of interaction a streaming job would have with the MemoryManager.
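The reservation call referred to above can be pictured as a hypothetical interface like the one below; the names (ReservingMemoryManager, reserveMemory, releaseMemory, MemoryReservationException) are illustrative and do not reflect the actual MemoryManager API.

```java
// Hypothetical sketch of the "memory reservation" interaction mentioned above.
public interface ReservingMemoryManager {

    /**
     * Reserves {@code bytes} of managed memory for the given owner (e.g. a RocksDB
     * state backend) without handing out MemorySegments. Fails if the reservation
     * would exceed the managed memory budget of the TaskExecutor.
     */
    void reserveMemory(Object owner, long bytes) throws MemoryReservationException;

    /** Releases a previous reservation so the budget becomes available again. */
    void releaseMemory(Object owner, long bytes);

    class MemoryReservationException extends Exception {
        public MemoryReservationException(String message) {
            super(message);
        }
    }
}
```

Under such a scheme, a streaming consumer only asks the manager to account for a number of bytes and never receives MemorySegments, so the existing segment-allocation call sites would stay untouched.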
Concerning the second open question about setting or not setting a max direct memory limit, I would also be interested in why Yang Wang thinks leaving it open would be best. My concern would be that we would end up in a similar situation as we are now in with the RocksDBStateBackend: if the different memory pools are not clearly separated and can spill over into a different pool, then it is quite hard to understand what exactly causes a process to get killed for using too much memory. This could then easily lead to a situation similar to what we have with the cutoff-ratio. So why not set a sane default value for max direct memory and give the user an option to increase it if he runs into an OOM?

@Xintong, how would alternative 2 lead to lower memory utilization than alternative 3, where we set the direct memory to a higher value?

Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <[hidden email]> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*

I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOM, and we don't even need to allocate managed / network memory with Unsafe.allocate().
However, there are also some downsides of doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to ever be reached. That means we kind of rely on heap memory activity to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.
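The second point can be illustrated with a small stand-alone experiment (a sketch, not Flink code): direct ByteBuffer memory is only released once GC collects the owning buffer objects, and the JVM only forces such a collection when an allocation would otherwise exceed -XX:MaxDirectMemorySize.

```java
// Run with e.g.:  java -Xmx64m -XX:MaxDirectMemorySize=64m DirectMemoryGcDemo
// With the 64 MB cap, an allocation that would exceed the limit makes the JVM
// trigger a GC that is expected to reclaim the earlier, unreferenced buffers first,
// so the loop normally completes. With a huge -XX:MaxDirectMemorySize and little
// heap activity, that safety valve never fires and native memory from unreferenced
// buffers can accumulate instead.
import java.nio.ByteBuffer;

public class DirectMemoryGcDemo {

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            // 16 MB direct buffer, unreferenced after each loop iteration.
            ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
            buffer.put(0, (byte) 1);
            System.out.println("allocated buffer " + i);
        }
        System.out.println("done without OutOfMemoryError");
    }
}
```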
That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring setting a very large value, in case there is anything else I have overlooked.
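[Editor's note: to make the GC point above concrete, here is a minimal, self-contained Java demo; it is an illustration, not Flink code. With a small -XX:MaxDirectMemorySize the JVM pushes back on direct allocations (it may trigger a GC and ultimately throws "OutOfMemoryError: Direct buffer memory"), whereas with a very large limit that pressure never builds up, so stale direct buffers are only reclaimed when heap activity happens to trigger a GC.]

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Run with: java -XX:MaxDirectMemorySize=64m DirectMemoryLimitDemo
public class DirectMemoryLimitDemo {
    public static void main(String[] args) {
        List<ByteBuffer> buffers = new ArrayList<>();
        try {
            while (true) {
                // Each allocation is reserved against MaxDirectMemorySize;
                // once the limit is exceeded the JVM attempts a GC and then throws.
                buffers.add(ByteBuffer.allocateDirect(8 * 1024 * 1024));
            }
        } catch (OutOfMemoryError e) {
            System.out.println("Direct memory limit reached after "
                    + buffers.size() + " buffers: " + e.getMessage());
        }
    }
}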
*Memory Calculation*
If there is any conflict between multiple configuration options that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely on the client-side checking alone, because for standalone clusters TaskManagers on different machines may have different configurations, and the client does not see that.

What do you think?

Thank you~

Xintong Song
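[Editor's note: the following is a rough sketch of the kind of consistency check discussed above. All names are hypothetical and do not correspond to Flink's actual options or classes: if the user explicitly configures both the total process memory and the fine-grained components, and the numbers do not add up, fail fast with an error rather than silently adjusting one of them.]

import java.util.OptionalLong;

public class MemoryConfigConsistencyCheck {
    // Hypothetical helper; Flink's real configuration classes differ.
    static void check(OptionalLong explicitTotalProcessBytes,
                      long heapBytes, long managedBytes,
                      long networkBytes, long jvmOverheadBytes) {
        long sum = heapBytes + managedBytes + networkBytes + jvmOverheadBytes;
        if (explicitTotalProcessBytes.isPresent()
                && explicitTotalProcessBytes.getAsLong() != sum) {
            throw new IllegalArgumentException(
                    "Sum of explicitly configured memory components (" + sum
                            + " bytes) conflicts with the configured total process memory ("
                            + explicitTotalProcessBytes.getAsLong() + " bytes).");
        }
    }

    public static void main(String[] args) {
        // Example: components sum to 3 GB but total is configured as 4 GB, so this throws.
        check(OptionalLong.of(4L << 30), 1L << 30, 1L << 30, 512L << 20, 512L << 20);
    }
}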
On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <[hidden email]> wrote:

Hi Xintong,

Thanks for your detailed proposal. After all the memory configuration options are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory
We do not differentiate user direct memory and native memory; they are all included in task off-heap memory, right? So I don't think we could set -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation
If the sum of the fine-grained memory pools (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?
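[Editor's note: a small illustration of the distinction behind the first question, not Flink code. Direct buffers are tracked against -XX:MaxDirectMemorySize, while native memory obtained through sun.misc.Unsafe is not, so a combined "task off-heap" budget cannot be capped by that JVM flag alone. The reflective access to Unsafe compiles with an internal-API warning on newer JDKs.]

import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import sun.misc.Unsafe;

public class OffHeapKindsDemo {
    public static void main(String[] args) throws Exception {
        // Counted against MaxDirectMemorySize.
        ByteBuffer direct = ByteBuffer.allocateDirect(16 * 1024 * 1024);

        // Not counted against MaxDirectMemorySize (untracked native memory).
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);
        long address = unsafe.allocateMemory(16 * 1024 * 1024);

        System.out.println("direct capacity: " + direct.capacity()
                + ", native address: " + address);
        unsafe.freeMemory(address); // native memory must be freed explicitly
    }
}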
On Wed, Aug 7, 2019 at 10:14 PM Xintong Song <[hidden email]> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve the TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.
This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration:

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.

Key changes to solve these problems can be summarized as follows:

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned into individual memory reservations and pools.
- Simplify the memory configuration options and calculation logic.
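[Editor's note: a rough sketch of the partitioning idea behind these key changes. The pool names and fractions below are made up for illustration and are not the FLIP-49 defaults: the individual pools are derived from a single total process memory value instead of configuring every piece independently.]

public class MemoryPartitioningSketch {
    public static void main(String[] args) {
        long totalProcessBytes = 4L * 1024 * 1024 * 1024; // e.g. a 4 GB container

        long jvmOverhead = (long) (totalProcessBytes * 0.10);   // metaspace, threads, ...
        long networkMemory = (long) (totalProcessBytes * 0.10); // network buffers
        long managedMemory = (long) (totalProcessBytes * 0.30); // state backends, batch operators
        long frameworkAndTaskHeap =
                totalProcessBytes - jvmOverhead - networkMemory - managedMemory;

        System.out.printf("heap=%d, managed=%d, network=%d, overhead=%d%n",
                frameworkAndTaskHeap, managedMemory, networkMemory, jvmOverhead);
    }
}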
Please find more details in the FLIP wiki document [1]. (Please note that the early design doc [2] is out of sync; it would be appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.

Thank you~

Xintong Song
[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing