(DEPRECATED) Apache Flink Mailing List archive.

Increasing MSE with additional iterations

Trevor Grant

Increasing MSE with additional iterations

Is it just me, or does the MSE tend to increase with more iterations of
linear regression?

Using Flink 1.0.2 (or 1.1).

%flink
// The two optimization imports are not used in this snippet.
import org.apache.flink.ml.optimization.SimpleGradientDescent
import org.apache.flink.ml.optimization.LearningRateMethod
import org.apache.flink.ml.regression.MultipleLinearRegression
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector

// Read the four CSV columns as strings.
val survival = env.readCsvFile[(String, String, String, String)](
  "file:///home/trevor/gits/datasets/haberman/haberman.data")

// Parse every field as a Double: the 4th column is the label,
// the first three columns are the features.
val survivalLV = survival.map { tuple =>
  val numList = tuple.productIterator.toList.map(_.asInstanceOf[String].toDouble)
  LabeledVector(numList(3), DenseVector(numList.take(3).toArray))
}

// Fit with 5 iterations, all other settings left at their defaults.
val mlr5 = MultipleLinearRegression()
  .setIterations(5)
mlr5.fit(survivalLV)

// Note: squaredResidualSum returns the sum of squared residuals, not the mean.
val mse1 = mlr5.squaredResidualSum(survivalLV).collect()

// Same model with 10 iterations. A separate name is used so the snippet
// also compiles outside the notebook, where a val cannot be redefined.
val mlr10 = MultipleLinearRegression()
  .setIterations(10)
mlr10.fit(survivalLV)

val mse2 = mlr10.squaredResidualSum(survivalLV).collect()
println((mse1, mse2))

Results in:

(Buffer(4.047910100612734E28),Buffer(2.6223205846507677E52))

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things." -Virgil*

Till Rohrmann

Re: Increasing MSE with additional iterations

Hi Trevor,

The multiple linear regression implementation is quite sensitive to the
initial learning rate. If the value is too large, each update can overshoot
the minimum, so the weights jump from one side of it to the other and land
farther away every time. Could you try setting a smaller initial learning
rate? If the problem persists, we should file a JIRA issue for it.
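
To illustrate the overshooting described above, here is a minimal sketch in plain Scala with no Flink dependency. It is not the FlinkML implementation: the object name StepSizeDemo, the data values, and the two step sizes are made up for illustration, and the model is a single weight fitted by full-batch gradient descent with a fixed step size.

```scala
// Plain Scala, no Flink dependency: batch gradient descent for y ≈ w * x.
// All names and numbers here are illustrative assumptions.
object StepSizeDemo {
  // Unscaled inputs in the tens (made-up values).
  val xs: Vector[Double] = Vector(30.0, 45.0, 60.0, 75.0)
  val ys: Vector[Double] = xs.map(_ * 2.0) // the true weight is 2.0

  /** Sum of squared residuals for weight w. */
  def squaredResidualSum(w: Double): Double =
    xs.zip(ys).map { case (x, y) => val r = w * x - y; r * r }.sum

  /** Runs `iterations` full-batch gradient steps from w = 0 with a fixed step size. */
  def fit(stepSize: Double, iterations: Int): Double =
    (1 to iterations).foldLeft(0.0) { (w, _) =>
      // Gradient of the mean squared error with respect to w.
      val gradient = xs.zip(ys).map { case (x, y) => 2.0 * (w * x - y) * x }.sum / xs.size
      w - stepSize * gradient
    }

  def main(args: Array[String]): Unit = {
    for (stepSize <- Seq(0.1, 0.0001); iterations <- Seq(5, 10)) {
      val srs = squaredResidualSum(fit(stepSize, iterations))
      println(f"stepSize=$stepSize%.4f iterations=$iterations%2d srs=$srs%.4e")
    }
  }
}
```

With the larger step size, every update flips the sign of the error in w and multiplies its magnitude, so the squared residual sum after 10 iterations is far larger than after 5. With the smaller step size the error shrinks on every step. In FlinkML the corresponding knob is setStepsize on MultipleLinearRegression, for example MultipleLinearRegression().setIterations(10).setStepsize(...); which value suits the haberman data is not established in this thread.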

Cheers,
Till

On Tue, May 3, 2016 at 4:58 PM, Trevor Grant <[hidden email]>
wrote:

> Is it just me or does MSE tend to increase with more iterations of Linear
> Regression?