Hey all,
I had a post a while ago about needing neural networks. We specifically need a very special type that is good for time series/sensor data, called the LSTM. We had a talk about the pros and cons of using deeplearning4j for this use case and eventually decided it made more sense to implement it natively in Flink for our use case.

So, this is somewhat relevant to what Theodore just said, but different enough that I wanted a separate thread:

"Focusing on what Flink does well and implementing algorithms built around its inherent advantages..."

One thing that jumps to mind is online learning. The batch nature of all of the other 'big boys' means that they are, by definition, always going to be offline models.

Also, even though LSTMs are somewhat of a corner case in the NN world, the streaming nature of Flink (a sequence of data) makes them fairly relevant to the people who would be using Flink in the first place (IMHO).

Finally, there should be some positive externalities that come from this, such as a backpropagation algorithm, which should then be reusable for things like HMMs.

So at any rate, the research spike started for me earlier this week. I hope to start cutting some Scala code over the weekend or at the beginning of next week. I'm also asking to check out FLINK-2259, because I need some functionality like that before I get started, and I could use the git practice.

I don't know if there is any interest in adding this, or if you want to make a JIRA for LSTM neural nets (or if I should write one, with appropriate papers cited, as seems to be the fashion), or maybe wait and see what I end up with?

Also, I'll probably be blowing you up with questions.

Best,

tg

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things." -Virgil*
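For anyone following along who hasn't met the LSTM before, the cell itself is only a handful of gated updates. Here is a minimal scalar sketch of one forward step; all the weight values and names are hypothetical toys, written in plain Java rather than anything Flink-specific, and this is not the planned implementation, just the arithmetic under discussion:

```java
// Toy scalar LSTM cell: one forward step with shared hypothetical weights
// (wx = wh = 0.5, b = 0 for every gate). Not the planned Flink code.
public class LstmSketch {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    // Pre-activation for one gate: input weight, recurrent weight, bias.
    static double pre(double wx, double wh, double b, double x, double h) {
        return wx * x + wh * h + b;
    }

    // One step: consume input x plus the previous hidden state h and
    // cell state c; return {newHidden, newCell}.
    static double[] step(double x, double h, double c) {
        double i = sigmoid(pre(0.5, 0.5, 0.0, x, h));  // input gate: admit new info
        double f = sigmoid(pre(0.5, 0.5, 0.0, x, h));  // forget gate: keep old cell state
        double o = sigmoid(pre(0.5, 0.5, 0.0, x, h));  // output gate: expose cell state
        double g = Math.tanh(pre(0.5, 0.5, 0.0, x, h)); // candidate cell update
        double cNew = f * c + i * g;
        double hNew = o * Math.tanh(cNew);
        return new double[] { hNew, cNew };
    }

    public static void main(String[] args) {
        double h = 0.0, c = 0.0;
        // Feed a short "sensor" sequence one reading at a time,
        // carrying (h, c) forward, as a stream naturally would.
        for (double x : new double[] { 1.0, 0.5, -0.25 }) {
            double[] hc = step(x, h, c);
            h = hc[0];
            c = hc[1];
        }
        System.out.println(h + " " + c);
    }
}
```

Because the state `(h, c)` is threaded through one input at a time, the step maps naturally onto a stream of sensor readings, which is the intuition behind the Flink fit above.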
On Fri, Feb 12, 2016 at 8:45 AM, Trevor Grant <[hidden email]> wrote:

> [snip]

It would be good if we also supported bidirectional LSTMs:

http://www.cs.toronto.edu/~graves/asru_2013.pdf
http://www.cs.toronto.edu/~graves/phd.pdf
Agreed. Our reasoning for contributing straight to Flink was that we plan on doing a lot of weird monkeying around with these things, and we were going to have to get our hands dirty with some code eventually anyway. The LSTM isn't *that* difficult to implement, and it seems easier to write our own than to understand someone else's insanity.

The plan is to get a 'basic' version going, then start tweaking the special cases. We have a use case for bidirectional, but it's not our primary motivation. I have no problem exposing new flavors as we make them.

tg

On Fri, Feb 12, 2016 at 7:51 AM, Suneel Marthi <[hidden email]> wrote:

> [snip]
Asking as someone who has never done NNs on Flink: would you implement it using JCuda? And would you implement it with model parallelization? Is there any theoretical limit to implementing "model and data parallelism" in Flink? If you don't use GPUs and you don't parallelize models and data at the same time, what is your motivation for doing such a thing on Flink instead of in a local environment, which would probably be more performant to a certain degree?

2016-02-12 14:58 GMT+01:00 Trevor Grant <[hidden email]>:

> [snip]
JCuda: No. I'm not willing to rely on servers having NVIDIA cards (someone more familiar with server hardware may correct me, in which case I'll say, "No, because *my* servers don't have NVIDIA cards; someone else can add that").

Parallelization: Yes. Admittedly, very clever use of Python could probably be used to solve this problem, depending on how we cut it up (I anticipate cursing myself several times in the weeks to come for not going that route). The motivation for Flink over Python is the hope for a more general and reusable solution; neural networks in general are solvable as long as you have some decent linear algebra backing you up. (However, I'm also toying with the idea of additionally putting in an evolutionary-algorithm approach as an alternative to backpropagation through time.)

The thought guiding this, to borrow a term from American auto racing, is "there is no replacement for displacement": a reasonably functional 7-liter engine will be more powerful than a performance-tuned 1.6-liter engine. In this case, an OK implementation in Flink spread over lots and lots of processors should be more powerful than a local 'sport-tuned' implementation with clever algorithms, GPUs, etc.

(The argument against evolutionary algorithms for training neural networks normally revolves around efficiency; however, running several generations on each node, reporting the best parameter sets to be 'bred', and then re-broadcasting the parameter sets is a natural fit for distributed systems. This is more of an academic exercise, but it's interesting conceptually; I know there are some grad students reading this who are itching for thesis projects. Olcay Akman and I did something similar for an implementation in R; see my github repo IRENE for a very ugly implementation.)

The motivation for Flink over an alternative big-data platform (see SPARK-2352) is: A) online learning and sequences intuitively seem to be a better fit for Flink's streaming architecture; B) I don't know much about the Spark ML code base, so there would be an additional learning curve; and C) I'd have to spend the rest of my life looking over my shoulder to make sure Slim wasn't going to jump out and get me (we live in the same city; the fear is real).

tg

On Fri, Feb 12, 2016 at 8:04 AM, Simone Robutti <[hidden email]> wrote:

> [snip]
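The island-style scheme described above (several local generations per node, champions 'bred', parameters re-broadcast) can be sketched in a single process. Everything here is a hypothetical stand-in: the sum-of-squares fitness plays the role of a network loss, the islands stand in for cluster nodes, and plain Java loops stand in for Flink operators:

```java
// Island-model evolutionary algorithm sketch (hypothetical, single-process
// stand-in for the distributed scheme): each "node" runs several local
// generations, and the per-node champions are bred pairwise.
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

public class IslandEa {
    static final Random RNG = new Random(42);

    // Toy fitness to minimize: sum of squares (stand-in for network loss).
    static double fitness(double[] p) {
        return Arrays.stream(p).map(v -> v * v).sum();
    }

    // Gaussian perturbation of a parameter set.
    static double[] mutate(double[] p) {
        double[] q = p.clone();
        for (int i = 0; i < q.length; i++) q[i] += RNG.nextGaussian() * 0.1;
        return q;
    }

    // Run `gens` elitist generations on one island; return its champion.
    static double[] localGenerations(double[][] pop, int gens) {
        double[][] ps = pop;
        for (int g = 0; g < gens; g++) {
            double[] best = Arrays.stream(ps)
                .min(Comparator.comparingDouble(IslandEa::fitness)).get();
            double[][] next = new double[ps.length][];
            next[0] = best;                              // elitism: champion survives
            for (int i = 1; i < ps.length; i++) next[i] = mutate(best);
            ps = next;
        }
        return Arrays.stream(ps)
            .min(Comparator.comparingDouble(IslandEa::fitness)).get();
    }

    // "Breed" two champions by averaging parameters (one simple crossover).
    static double[] breed(double[] a, double[] b) {
        double[] c = new double[a.length];
        for (int i = 0; i < a.length; i++) c[i] = (a[i] + b[i]) / 2.0;
        return c;
    }

    public static void main(String[] args) {
        int islands = 4, popSize = 8, dims = 3;
        double[] champion = null;
        for (int n = 0; n < islands; n++) {              // one round across all "nodes"
            double[][] pop = new double[popSize][dims];
            for (double[] p : pop)
                for (int d = 0; d < dims; d++) p[d] = RNG.nextGaussian();
            double[] local = localGenerations(pop, 10);
            champion = (champion == null) ? local : breed(champion, local);
        }
        System.out.println(fitness(champion));           // bred champion's loss
    }
}
```

In a Flink setting, the natural mapping would be islands as parallel partitions running `localGenerations` independently, with the breeding and re-broadcast of champions done once per iteration; only the local step is compute-heavy, which is what makes the communication pattern attractive.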