[DISCUSS] Iterative streaming example

Szabó Péter
Does anyone know of a good, simple and realistic streaming iteration
example? The current example tests a random generator, but it should be
replaced by something deterministic in order to be testable.

Peter

Re: [DISCUSS] Iterative streaming example

Stephan Ewen
I think that the Samoa people have quite a few nice examples along the
lines of model training with feedback.

@Paris: What would be the simplest example?

On Mon, Feb 23, 2015 at 11:27 AM, Szabó Péter <[hidden email]>
wrote:

> Does anyone know of a good, simple and realistic streaming iteration
> example? The current example tests a random generator, but it should be
> replaced by something deterministic in order to be testable.
>
> Peter
>

Re: [DISCUSS] Iterative streaming example

Paris Carbone
Hello Peter,

Streaming machine learning algorithms use iterations quite widely. One simple example is implementing distributed stream learners. In many cases you need a central model aggregator, distributed estimators to offload the central node, and of course feedback loops to periodically merge everything back into the main aggregator. One such example is the Vertical Hoeffding Tree (VHT) classifier [1], a distributed variant of VFDT that is implemented in Samoa.
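
A minimal sketch of the aggregator / estimator / feedback-loop pattern described above, in plain Python rather than Flink or Samoa code. It assumes the "model" is nothing more than a running mean, and the names Aggregator, Estimator, run and sync_every are hypothetical. Because the input is a fixed sequence instead of a random generator, the result is deterministic and can be checked in a test.

```python
from dataclasses import dataclass


@dataclass
class Aggregator:
    """Central model (assumed here to be a running mean)."""
    total: float = 0.0
    count: int = 0

    def merge(self, delta_total, delta_count):
        # Fold one estimator's partial statistics into the central model.
        self.total += delta_total
        self.count += delta_count
        return self.model()

    def model(self):
        return self.total / self.count if self.count else 0.0


class Estimator:
    """Local worker that buffers partial statistics between feedback rounds."""

    def __init__(self):
        self.delta_total = 0.0
        self.delta_count = 0
        self.model = 0.0  # last central model received over the feedback edge

    def observe(self, x):
        self.delta_total += x
        self.delta_count += 1

    def flush(self, aggregator):
        # Send buffered statistics to the aggregator and receive the merged model.
        self.model = aggregator.merge(self.delta_total, self.delta_count)
        self.delta_total, self.delta_count = 0.0, 0


def run(stream, n_estimators=2, sync_every=4):
    aggregator = Aggregator()
    estimators = [Estimator() for _ in range(n_estimators)]
    for i, x in enumerate(stream, start=1):
        estimators[i % n_estimators].observe(x)  # round-robin partitioning
        if i % sync_every == 0:                  # periodic feedback round
            for e in estimators:
                e.flush(aggregator)
    for e in estimators:                         # final merge of leftovers
        e.flush(aggregator)
    return aggregator.model()


global_mean = run(range(1, 11))
```

With a fixed input such as range(1, 11), the merged model can be compared against the exact mean of the input, which is the kind of assertion a random source does not allow.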

Iterative streams are also useful for optimisation techniques, as in batch processing (e.g., trying different parameters to estimate a variable, getting back the accuracy from an evaluator, and repeating until a condition is met).
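
The evaluate-and-repeat loop can be sketched in the same way, again in plain Python and not as Flink code. In this sketch a deque stands in for the stream plus its feedback edge, the variable being estimated is assumed to be a square root, and the refinement rule is a Newton step chosen only for illustration; evaluate, iterate, tolerance and max_rounds are hypothetical names.

```python
from collections import deque


def evaluate(estimate, target):
    """Evaluator: how far estimate squared is from the target."""
    return abs(estimate * estimate - target)


def iterate(targets, tolerance=1e-9, max_rounds=100):
    """Refine estimates of sqrt(target) for positive targets."""
    # Each record is (target, current estimate, rounds so far).
    queue = deque((t, float(t), 0) for t in targets)
    results = {}
    while queue:
        target, estimate, rounds = queue.popleft()
        if evaluate(estimate, target) <= tolerance or rounds >= max_rounds:
            # Condition met: the record leaves the loop as output.
            results[target] = (estimate, rounds)
        else:
            # Condition not met: feed a refined estimate back into the loop.
            refined = 0.5 * (estimate + target / estimate)
            queue.append((target, refined, rounds + 1))
    return results


results = iterate([1, 4, 9])
```

Every record either exits once the evaluator is satisfied or re-enters the queue, so the number of rounds per record is fixed for a given input and can be asserted in a test.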

I hope this helps to get a general idea of where iterations can be used.

[1] https://github.com/yahoo/samoa/wiki/Vertical-Hoeffding-Tree-Classifier


On 23 Feb 2015, at 12:13, Stephan Ewen <[hidden email]> wrote:

> I think that the Samoa people have quite a few nice examples along the
> lines of model training with feedback.
>
> @Paris: What would be the simplest example?


Re: [DISCUSS] Iterative streaming example

Szabó Péter
Nice. Thank you guys!

@Paris
Are there any Flink implementations of this model? The GitHub doc is quite
general.

Peter

2015-02-23 14:05 GMT+01:00 Paris Carbone <[hidden email]>:

> Hello Peter,
>
> Streaming machine learning algorithms use iterations quite widely. One
> simple example is implementing distributed stream learners. In many cases
> you need a central model aggregator, distributed estimators to offload the
> central node, and of course feedback loops to periodically merge
> everything back into the main aggregator. One such example is the Vertical
> Hoeffding Tree (VHT) classifier [1], a distributed variant of VFDT that is
> implemented in Samoa.
>
> Iterative streams are also useful for optimisation techniques, as in batch
> processing (e.g., trying different parameters to estimate a variable,
> getting back the accuracy from an evaluator, and repeating until a
> condition is met).
>
> I hope this helps to get a general idea of where iterations can be used.
>
> [1] https://github.com/yahoo/samoa/wiki/Vertical-Hoeffding-Tree-Classifier

Re: [DISCUSS] Iterative streaming example

Paris Carbone
We haven’t yet implemented any of these machine learning models directly on the Flink API, but we have run them through the existing Samoa tasks, using Flink Streaming as a backend. Apart from that, we have a student looking into machine learning pipelines on Flink Streaming with a focus on iterative jobs, so we will have many more use cases coming soon. Are you also considering looking into something similar? Perhaps I can help more if you have a specific use case in mind.

Paris


On 23 Feb 2015, at 14:29, Szabó Péter <[hidden email]> wrote:

> Nice. Thank you guys!
>
> @Paris
> Are there any Flink implementations of this model? The GitHub doc is quite
> general.
>
> Peter


Re: [DISCUSS] Iterative streaming example

Szabó Péter
Cool! At the moment I don't have any good use cases, but I will read some
literature about it in the near future. The first priority for me is to
make a good streaming iteration example, and Márton liked the
machine-learning idea. Also, there is a group at SZTAKI that develops
recommendation systems, and we'd like to cooperate with them to implement
some of their algorithms in Flink Streaming.

Peter

2015-02-26 23:30 GMT+01:00 Paris Carbone <[hidden email]>:

> We haven’t yet implemented any of these machine learning models directly
> on the Flink API, but we have run them through the existing Samoa tasks,
> using Flink Streaming as a backend. Apart from that, we have a student
> looking into machine learning pipelines on Flink Streaming with a focus on
> iterative jobs, so we will have many more use cases coming soon. Are you
> also considering looking into something similar? Perhaps I can help more if
> you have a specific use case in mind.
>
> Paris