Re: Provide Hadoop pre-build Hadoop 2.4 and Hadoop 2.6 binaries

Posted by Ufuk Celebi-2 on
URL: http://deprecated-apache-flink-mailing-list-archive.368.s1.nabble.com/Provide-Hadoop-pre-build-Hadoop-2-4-and-Hadoop-2-6-binaries-tp6657p6660.html

I think this is a very good idea and very urgent (because of the issues you outlined and for the user experience of *not* having to compile your own version). Big +1.

On 24 Jun 2015, at 11:45, Robert Metzger <[hidden email]> wrote:

> Hi,
>
> I am aware of at least two Flink users who were facing various issues
> with HDFS when using Flink.
>
> *Issues observed:*
> - HDFS client trying to connect to the standby Namenode
>   "org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby"
> - java.io.IOException: Bad response ERROR for block BP-1335380477-172.22.5.37-1424696786673:blk_1107843111_34301064 from datanode 172.22.5.81:50010
>        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:732)
>
> - Caused by: org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): 0
>        at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:478)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:6039)
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:6002)
>
>
> I've added the exceptions to the email so that users facing these issues
> can find a solution for them.
> I suspect that all of these issues are caused by the Hadoop 2.2.0 client we
> are packaging into the binary releases.
>
> Upgrading the HDFS client to the same version as the HDFS installation
> (for example, 2.4.1) resolved all issues.
>
> Therefore, I propose providing Hadoop 2.4.0 and Hadoop 2.6.0 binaries on
> the Flink download page.
> For the 0.9.0 release, I would do another VOTE on providing these two
> binaries.
>
> I've also filed a JIRA to provide a Flink build which doesn't include
> Hadoop at all (relying on the version provided by the user through the
> classpath): https://issues.apache.org/jira/browse/FLINK-2268
>
>
> Let me know what you think!
>
> Robert