Hi everybody,
Does Flink do any kind of compression of the data that is sent across the network?

Best,
Viktor
Hi,
currently, we are not compressing the network data. However, it should be easy to add support for that. Ufuk can probably elaborate on the details, but he told me some time ago that we could add code to compress outgoing network buffers using Snappy or other fast compression algorithms. I/O-intensive applications in particular would benefit from such a change.

Robert
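To make the idea concrete: compressing an outgoing network buffer is a small, self-contained step. The sketch below uses the JDK's built-in `Deflater` purely as a stand-in for Snappy/LZ4 (which are external libraries); the class and method names are illustrative, not Flink APIs.

```java
import java.util.zip.Deflater;

public class BufferCompressor {
    // Hypothetical helper: compress an outgoing network buffer before it is
    // handed to the transport layer. Returns only the compressed bytes.
    public static byte[] compress(byte[] buffer, int length) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(buffer, 0, length);
        deflater.finish();
        // Incompressible input can expand slightly, so reserve some headroom.
        byte[] out = new byte[length + 64];
        int written = deflater.deflate(out);
        deflater.end();
        byte[] result = new byte[written];
        System.arraycopy(out, 0, result, 0, written);
        return result;
    }

    public static void main(String[] args) {
        byte[] repetitive = new byte[4096]; // zero-filled, highly compressible
        byte[] compressed = compress(repetitive, repetitive.length);
        System.out.println(compressed.length < repetitive.length);
    }
}
```

A real integration would swap `Deflater` for a fast codec and hook this in where buffers leave the task manager; the receiving side needs the matching decompression step.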
I agree and think that compressing shuffle data is generally a good idea (given codecs like Snappy et al.).

I've opened a corresponding issue (https://issues.apache.org/jira/browse/FLINK-1275). The network layer is going through a set of big changes at the moment. I think it makes sense to wait for them to be finished/merged before working on this.

@Viktor: If you want to work on this, feel free to comment on the JIRA issue and we can discuss at which points you would have to extend the system.

– Ufuk
Will the compression codec be inserted in the Netty loops, or before that?

In any case, would it make sense to prototype this on the current code and forward-port it to the new network stack later? I assume the code would mostly be similar, especially all the JNI vs. Java considerations and tricks.
What reduction in network traffic can we expect from using compression?

-s
Always depends on the data; we need to measure that. From what I have heard in Hadoop, anywhere between no reduction and a factor of 3-4 may happen, especially if the records contain repetitive strings...
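The data-dependence is easy to demonstrate. The small experiment below compresses repetitive versus random bytes with the JDK's `Deflater` (again only a stand-in for Snappy/LZ4; the numbers for those codecs will differ, but the shape of the result is the same).

```java
import java.util.Random;
import java.util.zip.Deflater;

public class RatioDemo {
    // Compress `input` and return the compressed size in bytes.
    static int compressedSize(byte[] input) {
        Deflater d = new Deflater(Deflater.BEST_SPEED);
        d.setInput(input);
        d.finish();
        byte[] out = new byte[input.length + 64]; // headroom for incompressible input
        int n = d.deflate(out);
        d.end();
        return n;
    }

    public static void main(String[] args) {
        int size = 64 * 1024;
        // Records with repetitive strings, as in many real data sets.
        StringBuilder sb = new StringBuilder();
        while (sb.length() < size) sb.append("HamburgBerlinHamburgBerlin");
        byte[] repetitive = sb.toString().getBytes();
        // Already-compressed or encrypted data looks like this.
        byte[] random = new byte[size];
        new Random(42).nextBytes(random);

        // Repetitive data shrinks by a large factor...
        System.out.println((double) repetitive.length / compressedSize(repetitive));
        // ...while random data hardly shrinks at all.
        System.out.println((double) random.length / compressedSize(random));
    }
}
```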
On Mon, Nov 24, 2014 at 10:20 AM, Stephan Ewen <[hidden email]> wrote:
> Will the compression codec be inserted in the Netty loops, or before that?

In the current master, I would say that it makes sense to do it in the Netty loops during shuffling. The compression would then be totally transparent to the writers/readers. With the new network stack we will have to think this through a little bit, as we have more options and scenarios due to the fact that results will possibly be consumed multiple times.

> In any case, would it make sense to prototype this on the current code and forward-port this to the new network stack later?

Yes. It makes sense to have this in any case.
Hi,
A codec like Snappy would work on an entire network buffer as one big blob, right? I was more thinking along the lines of compressing individual tuple fields by treating them as columns, e.g., using frame-of-reference encoding and bit packing. Compression on tuple fields should yield much better results than compressing the entire blob. Given that Flink controls the serialization process, this should be transparent to other layers in the code. Not sure it is worth the effort, though.

Cheers,
Viktor
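For readers unfamiliar with frame-of-reference encoding: it stores a column's minimum once and keeps only the small offsets from it, which then need far fewer bits each. A simplified sketch, leaving out the final bit-packing step (names are illustrative, not from any Flink serializer):

```java
import java.util.Arrays;

public class FrameOfReference {
    // Encode a column of ints: slot 0 holds the reference (the minimum),
    // the rest hold non-negative deltas from it. A real implementation
    // would bit-pack the deltas; values like 100000..100003 then fit in
    // 2 bits each instead of 32.
    static int[] encode(int[] column) {
        int min = Arrays.stream(column).min().orElse(0);
        int[] encoded = new int[column.length + 1];
        encoded[0] = min;
        for (int i = 0; i < column.length; i++) {
            encoded[i + 1] = column[i] - min;
        }
        return encoded;
    }

    // Decode: add the reference back onto every delta.
    static int[] decode(int[] encoded) {
        int min = encoded[0];
        int[] column = new int[encoded.length - 1];
        for (int i = 0; i < column.length; i++) {
            column[i] = encoded[i + 1] + min;
        }
        return column;
    }

    public static void main(String[] args) {
        int[] column = {100000, 100003, 100001, 100002};
        int[] encoded = encode(column);
        System.out.println(Arrays.toString(encoded)); // [100000, 0, 3, 1, 2]
        System.out.println(Arrays.equals(decode(encoded), column)); // true
    }
}
```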
I would start with a simple compression of network buffers as a blob.
At some point, Flink's internal data layout may become columnar, which should also help the blob-style compression, because more similar strings will be within one window...
Hi,
if you really want to add compression on the data path, I would encourage you to choose something as lightweight as possible. 10 GBit Ethernet is becoming pretty much commodity these days in the server space, and it is not easy to saturate such a link even without compression. Snappy is not a bad choice, but the fastest algorithm I've seen so far is LZ4: https://code.google.com/p/lz4/

Best regards,
Daniel
Hi Stephan,
Compressing network buffers as a blob is probably the fastest/easiest way to achieve some measure of results.

But I wonder if it would be possible to implement some operations on the serialized, compressed form. For example, groupings and joins should be easy to implement. If the field is not accessed in the UDF, then it won't have to be deserialized.

A complication would be how the compression scheme is passed on to the next nodes in the computation chain.

Cheers,
Viktor
Yes, working on serialized data happens in parts right now, and it would be great to extend that. While it would be possible to work on a compact serialized representation, I can't think of a way to work on a Snappy/LZ4-compressed version.
Hi Stephan,
can you give me an example (or a few) of where Flink works on serialized data?

Cheers,
Viktor
Hey!
As an example, you can have a look at the normalized-key sorter, which sorts and compares completely on serialized data.

Stephan
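The idea behind normalized keys is to write a fixed-length, byte-wise comparable prefix of each key, so sorting can run as a memcmp-style comparison on serialized bytes without deserializing records. A simplified sketch of that comparison (the names are illustrative, not Flink's actual classes, and a real sorter falls back to a full comparison when two prefixes are equal):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NormalizedKeySort {
    static final int KEY_LEN = 8;

    // Build a fixed-length, byte-wise comparable "normalized key" for a
    // string: its first KEY_LEN bytes, zero-padded. For ASCII strings,
    // comparing these byte arrays orders records like comparing the
    // strings themselves (up to the prefix length).
    static byte[] normalizedKey(String s) {
        byte[] key = new byte[KEY_LEN];
        byte[] raw = s.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(raw, 0, key, 0, Math.min(raw.length, KEY_LEN));
        return key;
    }

    // Unsigned byte-wise comparison (memcmp semantics).
    static int compareKeys(byte[] a, byte[] b) {
        for (int i = 0; i < KEY_LEN; i++) {
            int cmp = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (cmp != 0) return cmp;
        }
        return 0;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "apple", "fig"};
        // Sort using only the serialized key bytes -- no deserialization.
        Arrays.sort(words, (x, y) -> compareKeys(normalizedKey(x), normalizedKey(y)));
        System.out.println(Arrays.toString(words)); // [apple, fig, pear]
    }
}
```

This is also why byte-level techniques compose well with a *compact* serialized layout but not with a general-purpose compressed blob, as noted above: deflate/Snappy output is not byte-wise comparable.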