The problem seems to be that the reflection analysis cannot determine the
type of the TypeSerializerInputFormat.

One possible solution is to add the ResultTypeQueryable interface and force
clients to explicitly set the TypeInformation.

This might break code which relies on automatic type inference, but at the
moment I cannot find any other usages of the TypeSerializerInputFormat
except from the unit test.

---------- Forwarded message ----------
From: Alexander Alexandrov <[hidden email]>
Date: 2015-01-29 12:04 GMT+01:00
Subject: TypeSerializerInputFormat cannot determine its type automatically
To: [hidden email]

I am trying to use the TypeSerializer IO formats to write temp data to
disk. A gist with a minimal example can be found here:

https://gist.github.com/aalexandrov/90bf21f66bf604676f37

However, with the current setting I get the following error with the
TypeSerializerInputFormat:

Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The type returned by the input format could not be automatically determined. Please specify the TypeInformation of the produced type explicitly.
    at org.apache.flink.api.java.ExecutionEnvironment.readFile(ExecutionEnvironment.java:341)
    at SerializedFormatExample$.main(SerializedFormatExample.scala:48)
    at SerializedFormatExample.main(SerializedFormatExample.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)

I think that the typeInformation instance at line 43 should be somehow
passed to the TypeSerializerInputFormat, but I cannot find a way to do it.

Any suggestions?

Thanks,
A.
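
For readers wondering why reflection fails here, the following is a minimal
Java sketch of the likely cause, type erasure. It is not Flink code:
GenericFormat, StringFormat and producedType are hypothetical stand-ins
invented for this illustration, and the real type extraction in Flink is
considerably more involved. The point is only that an object created directly
from a generic class carries no record of its type argument, while a subclass
that binds the argument does.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class ErasureDemo {

    // Hypothetical stand-in for a generic input format; not Flink's class.
    static class GenericFormat<T> {
    }

    // A subclass that binds T keeps that binding in its class metadata.
    static class StringFormat extends GenericFormat<String> {
    }

    // Returns the type argument bound by the direct superclass declaration,
    // or null when the object was created from the generic class itself.
    static Type producedType(Object format) {
        Type superType = format.getClass().getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[0];
        }
        return null;
    }

    public static void main(String[] args) {
        // The <String> below is erased at runtime, so nothing can be recovered.
        System.out.println(producedType(new GenericFormat<String>()));
        // Here the binding is part of the StringFormat class declaration.
        System.out.println(producedType(new StringFormat()));
    }
}
```

A format that is instantiated generically, as in the first call, therefore
needs its type supplied from outside.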
As a quick fix I implemented ResultTypeQueryable for the
TypeSerializerInputFormat.

A PR for the 0.8 branch can be found here:

https://github.com/apache/flink/pull/349

Please check and let me know if there is a way to fix the problem without
breaking the 0.8 line API.

2015-01-29 14:54 GMT+01:00 Alexander Alexandrov <[hidden email]>:

> The problem seems to be that the reflection analysis cannot determine the
> type of the TypeSerializerInputFormat.
>
> One possible solution is to add the ResultTypeQueryable interface and
> force clients to explicitly set the TypeInformation.
>
> This might break code which relies on automatic type inference, but at the
> moment I cannot find any other usages of the TypeSerializerInputFormat
> except from the unit test.

Hey Alexander,

I have looked into your issue. You can simply use
env.createInput(InputFormat,TypeInformation) instead of env.readFile().
That way you can pass the TypeInformation manually without implementing
ResultTypeQueryable.

Regards,
Timo

On 29.01.2015 14:54, Alexander Alexandrov wrote:

> The problem seems to be that the reflection analysis cannot determine the
> type of the TypeSerializerInputFormat.
>
> One possible solution is to add the ResultTypeQueryable interface and force
> clients to explicitly set the TypeInformation.
>
> This might break code which relies on automatic type inference, but at the
> moment I cannot find any other usages of the TypeSerializerInputFormat
> except from the unit test.
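
The two options discussed so far (a format that can report its own type, and
an entry point that takes the type as an argument) can be compared in a small
self-contained Java sketch. This is a model only, under loud assumptions:
TypeQueryable, PlainFormat, QueryableFormat and the static readFile and
createInput methods below are hypothetical stand-ins that merely borrow the
names from this thread, and a plain Class object stands in for
TypeInformation. It does not reproduce Flink's actual signatures or behavior.

```java
public class ExplicitTypeDemo {

    /** Hypothetical counterpart of an interface like ResultTypeQueryable. */
    interface TypeQueryable<T> {
        Class<T> getProducedType();
    }

    /** Hypothetical generic format that carries no type information. */
    static class PlainFormat<T> {
    }

    /** Variant of the format that is told its type at construction time. */
    static class QueryableFormat<T> extends PlainFormat<T> implements TypeQueryable<T> {
        private final Class<T> producedType;

        QueryableFormat(Class<T> producedType) {
            this.producedType = producedType;
        }

        @Override
        public Class<T> getProducedType() {
            return producedType;
        }
    }

    /** Entry point without a type argument: works only if the format can be asked. */
    @SuppressWarnings("unchecked")
    static <T> Class<T> readFile(PlainFormat<T> format) {
        if (format instanceof TypeQueryable) {
            return ((TypeQueryable<T>) format).getProducedType();
        }
        throw new IllegalStateException(
                "The type returned by the input format could not be automatically determined.");
    }

    /** Entry point with an explicit type argument: any format works. */
    static <T> Class<T> createInput(PlainFormat<T> format, Class<T> producedType) {
        return producedType;
    }

    public static void main(String[] args) {
        // Option 1: the caller passes the type alongside a plain format.
        System.out.println(createInput(new PlainFormat<String>(), String.class));
        // Option 2: the format itself reports the type it was constructed with.
        System.out.println(readFile(new QueryableFormat<String>(String.class)));
    }
}
```

In this model the explicit-argument variant needs no change to the format
class, which mirrors the appeal of the createInput suggestion; the queryable
variant instead changes how the format is constructed.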

Alright, thanks for the hint.

I suggest closing PR 349 and refining the exception with a hint on HOW
exactly to pass the TypeInformation instance, e.g.

    The type returned by the input format could not be automatically
    determined. Please pass the TypeInformation of the produced type
    explicitly via 'env.createInput(...)'.

I knew what I had to do, but I couldn't find the right point of entry to do
it because the IO system is so generic.

2015-01-29 16:07 GMT+01:00 Timo Walther <[hidden email]>:

> Hey Alexander,
>
> I have looked into your issue. You can simply use
> env.createInput(InputFormat,TypeInformation) instead of env.readFile()
> then you can pass TypeInformation manually without implementing
> ResultTypeQueryable.
>
> Regards,
> Timo

You don't have to close the PR. The change makes sense anyway.

You are right, the exception message could be improved at this point.

On 29.01.2015 16:21, Alexander Alexandrov wrote:

> Alright, thanks for the hint.
>
> I suggest closing PR 349 and refining the exception with a hint on HOW
> exactly to pass the TypeInformation instance, e.g.
>
> The type returned by the input format could not be automatically
> determined. Please pass the TypeInformation of the produced type explicitly
> via 'env.createInput(...)'.
>
> I knew what I had to do, but I couldn't find the right point of entry to do
> it because the IO system is so generic.