[jira] [Created] (FLINK-19056) Investigate multipart upload performance regression

Shang Yuanchun (Jira)
Chesnay Schepler created FLINK-19056:
----------------------------------------

             Summary: Investigate multipart upload performance regression
                 Key: FLINK-19056
                 URL: https://issues.apache.org/jira/browse/FLINK-19056
             Project: Flink
          Issue Type: Task
          Components: Runtime / REST
    Affects Versions: 1.12.0
            Reporter: Chesnay Schepler
             Fix For: 1.12.0


When using Netty 4.1.50, the multipart upload of files is more than 100 times slower in the {{FileUploadHandlerTest}}.

This test has traditionally been somewhat heavy, since it repeatedly tests the upload of 60 MB files.

On my machine this test currently finishes in 2-3 seconds, but with the upgraded Netty version it runs for several _minutes_ instead. I have not verified yet whether this is purely an issue of the test, but I would consider it unlikely.

This would make Flink effectively unusable when uploading larger jars or JobGraphs.

 

My theory is that this is due to [this|https://github.com/netty/netty/pull/10226] change in Netty.

Before this change, the {{HttpPostMultipartRequestDecoder}} was always creating unpooled heap buffers for _something_; after the change the buffer type is dependent on the input buffer. The input buffer is a direct one, so my conclusion is that with the upgrade we ended up allocating more direct buffers than we did previously.
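To make the suspected mechanism concrete, here is a minimal sketch in plain {{java.nio}} terms. It is an analogy only, not Netty's decoder code: the class and method names are hypothetical, and it merely contrasts "always allocate on the heap" with "allocate the same kind of buffer as the input".

```java
import java.nio.ByteBuffer;

public class BufferKindSketch {

    // "Before" behaviour (analogy): the scratch buffer is always on-heap.
    // The input is deliberately ignored when choosing the buffer kind.
    static ByteBuffer allocateAlwaysHeap(ByteBuffer input, int size) {
        return ByteBuffer.allocate(size);
    }

    // "After" behaviour (analogy): the scratch buffer follows the input's kind,
    // so a direct input leads to additional direct allocations.
    static ByteBuffer allocateLikeInput(ByteBuffer input, int size) {
        return input.isDirect()
                ? ByteBuffer.allocateDirect(size)
                : ByteBuffer.allocate(size);
    }

    public static void main(String[] args) {
        ByteBuffer directInput = ByteBuffer.allocateDirect(8 * 1024);
        ByteBuffer heapInput = ByteBuffer.allocate(8 * 1024);

        System.out.println(allocateAlwaysHeap(directInput, 1024).isDirect());
        System.out.println(allocateLikeInput(directInput, 1024).isDirect());
        System.out.println(allocateLikeInput(heapInput, 1024).isDirect());
    }
}
```

With a direct input, only the second strategy produces further direct buffers, which matches the conclusion drawn above.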

 

One solution I found was to explicitly create an {{UnpooledByteBufAllocator}} for the {{RestServerEndpoint}} that prefers heap buffers, which results in the input buffer being a heap buffer, and thus we never allocate direct ones.
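As a rough configuration fragment, such an allocator could be wired into a Netty server bootstrap as follows. This is a sketch under assumptions: it needs Netty on the classpath, it uses the unshaded {{io.netty}} package names (Flink uses a shaded Netty, so the real imports would differ), the class and method names are hypothetical, and where exactly {{RestServerEndpoint}} would apply the option is not shown in this ticket.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.UnpooledByteBufAllocator;
import io.netty.channel.ChannelOption;

public final class HeapAllocatorConfigSketch {

    // Hypothetical helper: sets a heap-preferring, unpooled allocator
    // on all child (accepted) channels of the given bootstrap.
    static ServerBootstrap preferHeapBuffers(ServerBootstrap bootstrap) {
        // preferDirect = false: buffer() hands out heap buffers by default.
        ByteBufAllocator heapPreferring = new UnpooledByteBufAllocator(false);
        return bootstrap.childOption(ChannelOption.ALLOCATOR, heapPreferring);
    }
}
```

Whether the read path then really delivers heap input buffers to the decoder is the observation reported here, not something this fragment verifies on its own.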

However, this should also imply that we are creating more heap buffers than we did previously; I don't know how much of a problem that is. It would seem a reasonable thing to do, since we should at least be able to skip a bunch of memory copies?

 

On a somewhat related note, we could think about increasing the chunkSize from 8 KB to 64 KB to reduce the GC pressure a bit, along with some arenas for the REST API.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)