Hi,
I have a streaming integration test with two operators. A source that emits records and watermarks, and a sink that collects the records. The topology runs in embedded mode and the results are collected in memory. Now, in addition to the records, I also want to verify that watermarks have been emitted. What's the recommended way of doing that? Thanks |
Hi Thomas,
some test cases in JoinHarnessTest <https://github.com/apache/flink/blob/release-1.4/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala> show how to verify the emitted watermarks. Hope this helps. Best, Xingcan > On 21 Feb 2018, at 2:09 PM, Thomas Weise <[hidden email]> wrote: > > Hi, > > I have a streaming integration test with two operators. A source that emits > records and watermarks, and a sink that collects the records. The topology > runs in embedded mode and the results are collected in memory. Now, in > addition to the records, I also want to verify that watermarks have been > emitted. What's the recommended way of doing that? > > Thanks |
Hi Xingcan,
thanks, this is a good way of testing an individual operator. I had written my own mock code to intercept source context and collect the results, this is a much better approach for operator testing. I wonder how I can verify with an embedded Flink cluster though. Even though my single operator test passes, the results are not emitted as expected within a topology (not observed in the attached sink). What's the test approach there? Thanks, Thomas On Wed, Feb 21, 2018 at 12:43 AM, Xingcan Cui <[hidden email]> wrote: > Hi Thomas, > > some test cases in JoinHarnessTest <https://github.com/apache/ > flink/blob/release-1.4/flink-libraries/flink-table/src/ > test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala> > show how to verify the emitted watermarks. > > Hope this helps. > > Best, > Xingcan > > > On 21 Feb 2018, at 2:09 PM, Thomas Weise <[hidden email]> wrote: > > > > Hi, > > > > I have a streaming integration test with two operators. A source that > emits > > records and watermarks, and a sink that collects the records. The > topology > > runs in embedded mode and the results are collected in memory. Now, in > > addition to the records, I also want to verify that watermarks have been > > emitted. What's the recommended way of doing that? > > > > Thanks > > |
Hi Thomas,
generally speaking, if you want to test a whole job, just run the pipeline in your test case with a collection-based source and a result collecting sink. If your single operator tests passes while the integration test fails, maybe you should first check the timestamp / watermark assigners or the partitioning mechanisms used. Best, Xingcan > On 28 Feb 2018, at 5:46 AM, Thomas Weise <[hidden email]> wrote: > > Hi Xingcan, > > thanks, this is a good way of testing an individual operator. I had written > my own mock code to intercept source context and collect the results, this > is a much better approach for operator testing. > > I wonder how I can verify with an embedded Flink cluster though. Even > though my single operator test passes, the results are not emitted as > expected within a topology (not observed in the attached sink). What's the > test approach there? > > Thanks, > Thomas > > > On Wed, Feb 21, 2018 at 12:43 AM, Xingcan Cui <[hidden email]> wrote: > >> Hi Thomas, >> >> some test cases in JoinHarnessTest <https://github.com/apache/ >> flink/blob/release-1.4/flink-libraries/flink-table/src/ >> test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala> >> show how to verify the emitted watermarks. >> >> Hope this helps. >> >> Best, >> Xingcan >> >>> On 21 Feb 2018, at 2:09 PM, Thomas Weise <[hidden email]> wrote: >>> >>> Hi, >>> >>> I have a streaming integration test with two operators. A source that >> emits >>> records and watermarks, and a sink that collects the records. The >> topology >>> runs in embedded mode and the results are collected in memory. Now, in >>> addition to the records, I also want to verify that watermarks have been >>> emitted. What's the recommended way of doing that? >>> >>> Thanks >> >> |
In reply to this post by Thomas Weise
Hi Thomas,
generally speaking, if you want to test a whole job, just run the pipeline in your test case with a collection-based source and a result collecting sink. If your single operator tests passes while the integration test fails, maybe you should first check the timestamp / watermark assigners or the partitioning mechanisms used. Best, Xingcan > On 28 Feb 2018, at 5:46 AM, Thomas Weise <[hidden email]> wrote: > > Hi Xingcan, > > thanks, this is a good way of testing an individual operator. I had written > my own mock code to intercept source context and collect the results, this > is a much better approach for operator testing. > > I wonder how I can verify with an embedded Flink cluster though. Even > though my single operator test passes, the results are not emitted as > expected within a topology (not observed in the attached sink). What's the > test approach there? > > Thanks, > Thomas > > > On Wed, Feb 21, 2018 at 12:43 AM, Xingcan Cui <[hidden email]> wrote: > >> Hi Thomas, >> >> some test cases in JoinHarnessTest <https://github.com/apache/ >> flink/blob/release-1.4/flink-libraries/flink-table/src/ >> test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala> >> show how to verify the emitted watermarks. >> >> Hope this helps. >> >> Best, >> Xingcan >> >>> On 21 Feb 2018, at 2:09 PM, Thomas Weise <[hidden email]> wrote: >>> >>> Hi, >>> >>> I have a streaming integration test with two operators. A source that >> emits >>> records and watermarks, and a sink that collects the records. The >> topology >>> runs in embedded mode and the results are collected in memory. Now, in >>> addition to the records, I also want to verify that watermarks have been >>> emitted. What's the recommended way of doing that? >>> >>> Thanks >> >> |
Hi,
I had sorted out how to run the topology in embedded mode. What wasn't clear to me is how I can verify the watermark, but as per following thread that can be done by inserting a process function: https://lists.apache.org/thread.html/9a47cb2284032527c1b63d35beb9bd2d4bdc36197849b84f5f69b768@%3Cdev.flink.apache.org%3E Thanks, Thomas On Wed, Feb 28, 2018 at 4:35 AM, Xingcan Cui <[hidden email]> wrote: > Hi Thomas, > > generally speaking, if you want to test a whole job, just run the pipeline > in your test case with a collection-based source and a result collecting > sink. If your single operator tests passes while the integration test > fails, maybe you should first check the timestamp / watermark assigners or > the partitioning mechanisms used. > > Best, > Xingcan > > > On 28 Feb 2018, at 5:46 AM, Thomas Weise <[hidden email]> wrote: > > > > Hi Xingcan, > > > > thanks, this is a good way of testing an individual operator. I had > written > > my own mock code to intercept source context and collect the results, > this > > is a much better approach for operator testing. > > > > I wonder how I can verify with an embedded Flink cluster though. Even > > though my single operator test passes, the results are not emitted as > > expected within a topology (not observed in the attached sink). What's > the > > test approach there? > > > > Thanks, > > Thomas > > > > > > On Wed, Feb 21, 2018 at 12:43 AM, Xingcan Cui <[hidden email]> > wrote: > > > >> Hi Thomas, > >> > >> some test cases in JoinHarnessTest <https://github.com/apache/ > >> flink/blob/release-1.4/flink-libraries/flink-table/src/ > >> test/scala/org/apache/flink/table/runtime/harness/ > JoinHarnessTest.scala> > >> show how to verify the emitted watermarks. > >> > >> Hope this helps. > >> > >> Best, > >> Xingcan > >> > >>> On 21 Feb 2018, at 2:09 PM, Thomas Weise <[hidden email]> wrote: > >>> > >>> Hi, > >>> > >>> I have a streaming integration test with two operators. A source that > >> emits > >>> records and watermarks, and a sink that collects the records. The > >> topology > >>> runs in embedded mode and the results are collected in memory. Now, in > >>> addition to the records, I also want to verify that watermarks have > been > >>> emitted. What's the recommended way of doing that? > >>> > >>> Thanks > >> > >> > > |
Free forum by Nabble | Edit this page |