Hi
The flink-sequence-file module has a class named SerializableHadoopConfiguration[1], which is nothing but a wrapper class for the Hadoop Configuration. I believe this class can be moved to a common module, since it is not tightly coupled to the sequence-file module and could be used by many other modules, for example flink-compress. Thoughts?

- Sivaprasanna
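For readers unfamiliar with the pattern under discussion, here is a minimal, self-contained sketch of a serializable wrapper around a non-serializable configuration object. It is not Flink's actual class: `PlainConfig` is a hypothetical stand-in for Hadoop's `Configuration` (which is not `java.io.Serializable` but can write and read itself through `write(DataOutput)` and `readFields(DataInput)`), and `SerializableConfigWrapper` and `SerializableWrapperSketch` are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.TreeMap;

final class SerializableWrapperSketch {

    /** Hypothetical stand-in for Hadoop's Configuration: not Serializable, but self-writing. */
    static final class PlainConfig {
        private final Map<String, String> entries = new TreeMap<>();

        void set(String key, String value) { entries.put(key, value); }

        String get(String key) { return entries.get(key); }

        void write(DataOutput out) throws IOException {
            out.writeInt(entries.size());
            for (Map.Entry<String, String> e : entries.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        }

        void readFields(DataInput in) throws IOException {
            entries.clear();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                entries.put(key, in.readUTF());
            }
        }
    }

    /** The wrapper: the wrapped object is transient and is written and read by hand. */
    static final class SerializableConfigWrapper implements Serializable {
        private static final long serialVersionUID = 1L;

        private transient PlainConfig config;

        SerializableConfigWrapper(PlainConfig config) { this.config = config; }

        PlainConfig get() { return config; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            config.write(out); // ObjectOutputStream is also a DataOutput
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            config = new PlainConfig();
            config.readFields(in); // ObjectInputStream is also a DataInput
        }
    }

    /** Serializes the wrapper to bytes and back, returning the restored config. */
    static PlainConfig roundTrip(PlainConfig original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new SerializableConfigWrapper(original));
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return ((SerializableConfigWrapper) in.readObject()).get();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        PlainConfig conf = new PlainConfig();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        System.out.println(roundTrip(conf).get("fs.defaultFS"));
    }
}
```

Nothing in the wrapper depends on sequence files, which is the point made above: a class of this shape only needs the configuration type itself on the classpath.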
Hi Sivaprasanna,
we actually want to remove Hadoop from all core modules, so we could not place it in some very common place like flink-core.

But I think the module flink-hadoop-fs could be a fitting place.

On Tue, Mar 3, 2020 at 11:25 AM Sivaprasanna <[hidden email]> wrote:
> [...]
Hi Arvid,
Thanks for the quick reply. Yes, it makes sense to keep Hadoop dependencies out of Flink's core modules, but I also wonder whether it would be overkill to add flink-hadoop-fs as a dependency just to use one utility class from that module.

- Sivaprasanna

On Tue, Mar 3, 2020 at 4:17 PM Arvid Heise <[hidden email]> wrote:
> [...]
BTW, can we leverage flink-shaded-hadoop-2? The reason I ask: if any Flink module is going to use Hadoop in any way, it will most probably include flink-shaded-hadoop-2 as a dependency.

However, flink-shaded modules don't have any source files. Is that a strict convention that the community follows?

- Sivaprasanna

On Tue, Mar 3, 2020 at 10:48 PM Sivaprasanna <[hidden email]> wrote:
> [...]
Hi Sivaprasanna,
we don't upload the source jars for the flink-shaded modules. However, you can build and install them yourself by cloning the flink-shaded repository [1] and then calling `mvn package -Dshade-sources`.

[1] https://github.com/apache/flink-shaded

Cheers,
Till

On Tue, Mar 3, 2020 at 6:29 PM Sivaprasanna <[hidden email]> wrote:
> [...]
Do we have more cases of "common Hadoop Utils"?
If yes, does it make sense to create a "flink-hadoop-utils" module with exactly such classes? It would have an optional dependency on "flink-shaded-hadoop".

On Wed, Mar 4, 2020 at 9:12 AM Till Rohrmann <[hidden email]> wrote:
> [...]
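One practical consequence of an optional dependency is that Hadoop classes may be absent at runtime unless the user adds them, so code in such a utility module would likely need to check before touching them. The following is only a sketch of that kind of guard: `OptionalDependencyProbe` and its method are hypothetical names, and only the Hadoop class name is real.

```java
final class OptionalDependencyProbe {

    // Fully qualified name of Hadoop's configuration class.
    static final String HADOOP_CONFIGURATION = "org.apache.hadoop.conf.Configuration";

    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, OptionalDependencyProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Hadoop on classpath: " + isClassAvailable(HADOOP_CONFIGURATION));
    }
}
```

The result depends entirely on the classpath the program is started with, which is exactly what makes the dependency optional.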
Hi Stephan,
I guess it is a valid point to have something like 'flink-hadoop-utils'. Maybe a [DISCUSS] thread can be started to understand what the community thinks?

On Thu, Mar 5, 2020 at 4:22 PM Stephan Ewen <[hidden email]> wrote:
> [...]
Could we merge the two modules into one?
Sequence files are just another way of compressing files.

On 2020/03/05 13:02:46, Sivaprasanna <[hidden email]> wrote:
> [...]
That also makes sense, but I believe it would be a breaking/major change. If we are okay with merging them, we could name the result something like "flink-hadoop-compress", since SequenceFile is also a Hadoop format and the existing "flink-compress" module currently deals with Hadoop-based compression.

On Fri, Mar 6, 2020 at 1:33 AM João Boto <[hidden email]> wrote:
> [...]
Hi Sivaprasanna,
do you want to collect the set of Hadoop utility classes which could be moved to a flink-hadoop-utils module and start a discuss thread about it? I think this could be a good first step towards cleaning up the module structure a bit.

Cheers,
Till

On Fri, Mar 6, 2020 at 7:27 AM Sivaprasanna <[hidden email]> wrote:
> [...]
Hi Till,
Sure. I'll take a look and start a discuss thread soon.

Thanks,
Sivaprasanna

On Mon, Mar 16, 2020 at 4:01 PM Till Rohrmann <[hidden email]> wrote:
> [...]
Till, Stephan, & others,
I created a discuss thread a few days ago; the link is below. I would appreciate it if you could take a look.

https://lists.apache.org/thread.html/rf885987160bede5911a7f61923307a6d5ae07f850da0a90555728e5f%40%3Cdev.flink.apache.org%3E

Please let me know if you want me to improve or edit the content.

Thanks,
Sivaprasanna

On Tue, Mar 17, 2020 at 8:22 PM Sivaprasanna <[hidden email]> wrote:
> [...]