Increasing MSE with additional iterations


Trevor Grant
Is it just me or does MSE tend to increase with more iterations of Linear
Regression?

Using Flink 1.0.2 (or 1.1)

%flink
import org.apache.flink.ml.optimization.SimpleGradientDescent
import org.apache.flink.ml.optimization.LearningRateMethod
import org.apache.flink.ml.regression.MultipleLinearRegression
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector

val survival = env.readCsvFile[(String, String, String, String)](
  "file:///home/trevor/gits/datasets/haberman/haberman.data")
val survivalLV = survival
  .map { tuple =>
    val list = tuple.productIterator.toList
    val numList = list.map(_.asInstanceOf[String].toDouble)
    LabeledVector(numList(3), DenseVector(numList.take(3).toArray))
  }


// Fit with 5 iterations of gradient descent (default step size).
val mlr5 = MultipleLinearRegression()
  .setIterations(5)

mlr5.fit(survivalLV)

// Note: squaredResidualSum is the sum of squared residuals, not divided by the row count.
val mse1 = mlr5.squaredResidualSum(survivalLV).collect()

// Same model and data, but 10 iterations.
val mlr10 = MultipleLinearRegression()
  .setIterations(10)

mlr10.fit(survivalLV)

val mse2 = mlr10.squaredResidualSum(survivalLV).collect()
println(mse1, mse2)


Results in:

(Buffer(4.047910100612734E28),Buffer(2.6223205846507677E52))



Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*

Re: Increasing MSE with additional iterations

Till Rohrmann
Hi Trevor,

The multiple linear regression implementation is quite sensitive to the
initial learning rate. If the value is too large, each gradient step can
overshoot the minimum, so the weights jump back and forth across it with
ever-increasing magnitude and the error grows with every iteration. Could
you try setting a smaller initial learning rate? If the error still
persists, we should file a JIRA issue for it.
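
The overshooting effect described above can be reproduced without Flink. The following is only a toy sketch in plain Scala, not the FlinkML implementation: it runs fixed-step gradient descent on the one-dimensional loss f(w) = 0.5 * a * w^2. The object name `StepSizeDemo`, the helper `finalLoss`, and the curvature value `a = 1000.0` are all made up for illustration.

```scala
// Toy illustration only: plain Scala, no Flink.
// Minimizes f(w) = 0.5 * a * w^2 with fixed-step gradient descent; the gradient is a * w.
object StepSizeDemo {
  /** Runs `iterations` gradient steps starting from `w0` and returns the final loss. */
  def finalLoss(a: Double, stepSize: Double, iterations: Int, w0: Double = 1.0): Double = {
    var w = w0
    for (_ <- 1 to iterations) {
      w -= stepSize * a * w // equivalent to w <- w * (1 - stepSize * a)
    }
    0.5 * a * w * w
  }

  def main(args: Array[String]): Unit = {
    val a = 1000.0 // assumed curvature; stands in for large, unscaled feature values

    // stepSize * a = 100 > 2: every step overshoots the minimum and the loss grows.
    println(finalLoss(a, stepSize = 0.1, iterations = 5))
    println(finalLoss(a, stepSize = 0.1, iterations = 10))

    // stepSize * a = 0.1 < 2: the loss shrinks as iterations are added.
    println(finalLoss(a, stepSize = 1e-4, iterations = 5))
    println(finalLoss(a, stepSize = 1e-4, iterations = 10))
  }
}
```

With the larger step the sign of `w` flips on every update while its magnitude grows, which matches the pattern of a residual sum that explodes between 5 and 10 iterations. In FlinkML the corresponding knob should be the step size, for example `MultipleLinearRegression().setIterations(10).setStepsize(0.001)`; the value 0.001 is only a guess and would need tuning for the data.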

Cheers,
Till

On Tue, May 3, 2016 at 4:58 PM, Trevor Grant <[hidden email]>
wrote:

> Is it just me or does MSE tend to increase with more iterations of Linear
> Regression?