Hey everyone,
I'm still trying to finish the Hadoop Compatibility PR (https://github.com/stratosphere/stratosphere/pull/777); however, I always get a ClassNotFoundException for my HCatalog InputFormat on the cluster.

While searching for potential ClassLoader bugs, I found the following lines in UserCodeClassWrapper:

    @Override
    public T getUserCodeObject(Class<? super T> superClass, ClassLoader cl) {
        return InstantiationUtil.instantiate(userCodeClass, superClass);
    }

Why is the given "ClassLoader cl" argument never used? Looks to me like a bug... What do you think?

Thanks,
Timo
|
Yep, this looks like a bug, I agree.
|
In the meantime, I no longer think it is a bug, because the getStubWrapper() method in TaskConfig uses the class loader. userCodeClass already contains the loaded class, so there is no need for the class loader:

    @Override
    public T getUserCodeObject(Class<? super T> superClass, ClassLoader cl) {
        return InstantiationUtil.instantiate(userCodeClass, superClass);
    }

Anyway, I still have the problem that my InputFormat does not find the user class HCatInputFormat. How can I access the user code class loader from within an input format? Or do I have to serialize my class within the InputFormat in my custom writeObject() method?

Here is my exception; maybe someone could give me a hint:

eu.stratosphere.client.program.ProgramInvocationException: The program execution failed: eu.stratosphere.nephele.executiongraph.GraphConversionException: java.lang.RuntimeException: Unable to instantiate the hadoop input format
    at eu.stratosphere.hadoopcompatibility.mapreduce.HadoopInputFormat.readObject(HadoopInputFormat.java:315)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1001)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1892)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1914)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1797)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1349)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
    at eu.stratosphere.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:239)
    at eu.stratosphere.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:227)
    at eu.stratosphere.pact.runtime.task.util.TaskConfig.getStubWrapper(TaskConfig.java:262)
    at eu.stratosphere.pact.runtime.task.RegularPactTask.instantiateUserCode(RegularPactTask.java:1487)
    at eu.stratosphere.pact.runtime.task.DataSourceTask.initInputFormat(DataSourceTask.java:313)
    at eu.stratosphere.pact.runtime.task.DataSourceTask.registerInputOutput(DataSourceTask.java:100)
    at eu.stratosphere.nephele.execution.RuntimeEnvironment.<init>(RuntimeEnvironment.java:192)
    at eu.stratosphere.nephele.executiongraph.ExecutionGroupVertex.<init>(ExecutionGroupVertex.java:229)
    at eu.stratosphere.nephele.executiongraph.ExecutionGraph.createVertex(ExecutionGraph.java:493)
    at eu.stratosphere.nephele.executiongraph.ExecutionGraph.constructExecutionGraph(ExecutionGraph.java:275)
    at eu.stratosphere.nephele.executiongraph.ExecutionGraph.<init>(ExecutionGraph.java:174)
    at eu.stratosphere.nephele.jobmanager.JobManager.submitJob(JobManager.java:499)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at eu.stratosphere.nephele.ipc.RPC$Server.call(RPC.java:417)
    at eu.stratosphere.nephele.ipc.Server$Handler.run(Server.java:951)
Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.mapreduce.HCatInputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at eu.stratosphere.hadoopcompatibility.mapreduce.HadoopInputFormat.readObject(HadoopInputFormat.java:313)
    ... 31 more

On 17.06.2014 16:42, Stephan Ewen wrote:
> Yep, this looks like a bug, I agree.
|
You can always do "Thread.currentThread().getContextClassLoader()"
|
I already tried that, but it resulted in the same exception. Where should the ContextClassLoader be set when I deserialize an InputFormat on another node in the cluster?

On 18.06.2014 14:24, Stephan Ewen wrote:
> You can always do "Thread.currentThread().getContextClassLoader()"
|
It should be set in the DataSourceTask, I would assume. Can you try to trace your way from "registerInputOutput()" and "invoke()" to where the input format is called?

On Wed, Jun 18, 2014 at 2:39 PM, Timo Walther <[hidden email]> wrote:
> I already tried that, but it resulted in the same exception. Where should
> the ContextClassLoader be set when I deserialize an InputFormat on another
> node in the cluster?
>
> On 18.06.2014 14:24, Stephan Ewen wrote:
>> You can always do "Thread.currentThread().getContextClassLoader()"
|
You have to set the ClassLoader in the readObject() method when calling Class.forName(..)

On Wed, Jun 18, 2014 at 2:39 PM, Timo Walther <[hidden email]> wrote:
> I already tried that, but it resulted in the same exception. Where should
> the ContextClassLoader be set when I deserialize an InputFormat on another
> node in the cluster?
>
> On 18.06.2014 14:24, Stephan Ewen wrote:
>> You can always do "Thread.currentThread().getContextClassLoader()"
|
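To make the Class.forName(..) suggestion concrete, here is a minimal, self-contained sketch. It is not the actual HadoopInputFormat code: NameBackedWrapper and NameBackedWrapperDemo are hypothetical names, and the sketch assumes the wrapper serializes only the class name of the wrapped object and that the wrapped class has a no-argument constructor. The point is the three-argument Class.forName(name, initialize, loader), which resolves the class through an explicit loader instead of the loader of the calling class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

/** Hypothetical wrapper that serializes only the class name of the wrapped object. */
class NameBackedWrapper implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String wrappedClassName;
    private transient Object wrapped;

    NameBackedWrapper(Object wrapped) {
        this.wrapped = wrapped;
        this.wrappedClassName = wrapped.getClass().getName();
    }

    Object getWrapped() {
        return wrapped;
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores wrappedClassName

        // Prefer the thread's context class loader; fall back to the loader of this class.
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = NameBackedWrapper.class.getClassLoader();
        }
        try {
            // The three-argument form looks the class up in the given loader.
            wrapped = Class.forName(wrappedClassName, true, cl)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Unable to instantiate " + wrappedClassName, e);
        }
    }
}

class NameBackedWrapperDemo {
    /** Serializes and deserializes the object with plain Java serialization. */
    static Object roundTrip(Serializable obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        NameBackedWrapper copy =
                (NameBackedWrapper) roundTrip(new NameBackedWrapper(new ArrayList<String>()));
        System.out.println(copy.getWrapped().getClass().getName());
    }
}
```

This only helps if the context class loader has actually been set to the user code class loader by whoever triggers the deserialization; otherwise the lookup ends up in the application class loader again, as in the stack trace.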
I think I know where the issue is:

In "InstantiationUtil", can you change the method

    public static Object deserializeObject(byte[] bytes, ClassLoader cl) throws IOException, ClassNotFoundException {
        ObjectInputStream oois = null;
        try {
            oois = new ClassLoaderObjectInputStream(new ByteArrayInputStream(bytes), cl);
            return oois.readObject();
        } finally {
            if (oois != null) {
                oois.close();
            }
        }
    }

to something like

    public static Object deserializeObject(byte[] bytes, ClassLoader cl) throws IOException, ClassNotFoundException {
        ObjectInputStream oois = null;
        final ClassLoader old = Thread.currentThread().getContextClassLoader();
        try {
            // Install cl as the context class loader so that custom
            // readObject() methods can pick it up during deserialization.
            Thread.currentThread().setContextClassLoader(cl);
            oois = new ClassLoaderObjectInputStream(new ByteArrayInputStream(bytes), cl);
            return oois.readObject();
        } finally {
            // Always restore the previous context class loader.
            Thread.currentThread().setContextClassLoader(old);
            if (oois != null) {
                oois.close();
            }
        }
    }
|
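The idea behind this change can be tried out in isolation. For a custom readObject() to see `cl` through Thread.currentThread().getContextClassLoader(), the loader has to be installed as the context class loader before reading and the previous one restored afterwards. The sketch below is a self-contained approximation, not the Stratosphere code: LoaderAwareObjectInputStream stands in for ClassLoaderObjectInputStream, and ContextLoaderDeserializer and LoaderProbe are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.net.URL;
import java.net.URLClassLoader;

/** Stand-in for ClassLoaderObjectInputStream: resolves classes via a given loader. */
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader classLoader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader classLoader) throws IOException {
        super(in);
        this.classLoader = classLoader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            return Class.forName(desc.getName(), false, classLoader);
        } catch (ClassNotFoundException e) {
            return super.resolveClass(desc); // covers primitive types such as "int"
        }
    }
}

/** Hypothetical probe that records which context class loader readObject() sees. */
class LoaderProbe implements Serializable {
    private static final long serialVersionUID = 1L;
    transient ClassLoader seenDuringRead;

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        seenDuringRead = Thread.currentThread().getContextClassLoader();
    }
}

class ContextLoaderDeserializer {
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object deserializeObject(byte[] bytes, ClassLoader cl)
            throws IOException, ClassNotFoundException {
        final Thread thread = Thread.currentThread();
        final ClassLoader old = thread.getContextClassLoader();
        thread.setContextClassLoader(cl); // visible to custom readObject() methods
        try (ObjectInputStream in =
                new LoaderAwareObjectInputStream(new ByteArrayInputStream(bytes), cl)) {
            return in.readObject();
        } finally {
            thread.setContextClassLoader(old); // restore even if reading fails
        }
    }

    public static void main(String[] args) throws Exception {
        // An empty URLClassLoader plays the role of the user code class loader here.
        ClassLoader userLoader =
                new URLClassLoader(new URL[0], LoaderProbe.class.getClassLoader());
        LoaderProbe copy = (LoaderProbe) deserializeObject(serialize(new LoaderProbe()), userLoader);
        System.out.println("readObject saw the user loader: " + (copy.seenDuringRead == userLoader));
    }
}
```

With this in place, a readObject() that resolves classes through the context class loader finds the same classes as the stream itself, which is what the HadoopInputFormat case seems to need.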
Also, if it works, make a patch for it ;-)
|