Wednesday, June 27, 2012

Tip: Easy linear regression from the Linux command line with gnuplot


Create a tab-separated file with your data (any whitespace works as a column separator for gnuplot), xy.tab:
#   X       Y
    1100     88000
    1300    104000
    1400    112000
    1800    144000
    1900    152000
    2400    192000

Then run gnuplot:
$ gnuplot

Then, at the gnuplot prompt, plot the data:

gnuplot> plot "xy.tab"

Next, define a linear model and fit its parameters m and c to the data:



gnuplot> f(x)=m*x+c
gnuplot> fit f(x) "xy.tab" via m, c

[...] many lines removed [...]

resultant parameter values

m               = 80
c               = 2.44655e-11
************************
After 8 iterations the fit converged.
final sum of squares of residuals : 6.35275e-22
rel. change during last iteration : 0

degrees of freedom    (FIT_NDF)                        : 4
rms of residuals      (FIT_STDFIT) = sqrt(WSSR/ndf)    : 1.26023e-11
variance of residuals (reduced chisquare) = WSSR/ndf   : 1.58819e-22

         Singular matrix in Invert_RtR

gnuplot> replot f(x)


You now have a plot of your data plus the fitted regression line, along with the fitted constants (in our example the slope, m, is 80 and the intercept, c, is practically 0).
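
If you want to double-check the numbers gnuplot reports, here is a minimal sketch in Python that computes the same straight-line fit with the textbook closed-form least-squares formulas. This is not how gnuplot does it internally (gnuplot's fit is an iterative nonlinear solver); the function name fit_line is made up for this example, and the data is simply the xy.tab sample from above typed in by hand.

```python
# Cross-check of the gnuplot fit using closed-form ordinary least squares.
# fit_line is a hypothetical helper name, not part of gnuplot or any library.

def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx              # slope
    c = mean_y - m * mean_x    # intercept
    return m, c

# Same values as in xy.tab
xs = [1100, 1300, 1400, 1800, 1900, 2400]
ys = [88000, 104000, 112000, 144000, 152000, 192000]

m, c = fit_line(xs, ys)
print(f"m = {m}, c = {c}")  # prints: m = 80.0, c = 0.0
```

The result agrees with the gnuplot output: a slope of 80 and an intercept of 0 (gnuplot shows 2.44655e-11 only because of floating-point rounding in its iterations).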

Have fun!!