spark-timeseries

Support for ARIMAX - adding multiple exogenous variables

Open tom-data opened this issue 9 years ago • 21 comments

Hi all, I am currently modeling time-series data of channel sales using auto-ARIMA. I need to add exogenous variables to the ARIMA model: inflation and the unemployment rate. As far as I can see, the current auto-ARIMA model does not support exogenous variables. In R, exogenous variables can be passed as newxreg to the forecast or predict function. Are we going to support ARIMAX?

Thanks!

tom-data avatar Nov 20 '15 23:11 tom-data

Hi @tom-data,

Yeah, this is something I'd like the library to support, although it's not being worked on currently. @josepablocam published an ARX patch a little while ago (#51) (see the AutoregressionX class) that started to lay the foundation for this.

sryza avatar Nov 23 '15 07:11 sryza

Thank you very much for the information. We probably will contribute to the library in this area.

Tom

tom-data avatar Nov 28 '15 03:11 tom-data

I would also like to see this feature in the library. Referring to http://robjhyndman.com/hyndsight/arimax/, there are two ways to do it:

  1. ARIMAX formula
  2. Regression with ARIMA errors

If no one is working on it, I can investigate the issue further, break it into smaller issues, and start working on them.

mbaddar2 avatar Dec 06 '15 16:12 mbaddar2

@mbaddar2 I don't believe anyone is working on that at the moment.

sryza avatar Dec 16 '15 03:12 sryza

@sryza, based on the following links:

http://robjhyndman.com/hyndsight/arimax/
https://www.otexts.org/fpp/9/1
http://robjhyndman.com/talks/RevolutionR/11-Dynamic-Regression.pdf

we have 3 ways to implement this:

  1. Regression with ARMA errors
  2. ARMAX model
  3. Transfer function model

Plus: handling non-stationary errors.

I think this issue is quite complex, so I suggest the following short-term implementation plan:

  1.1. Regression with AR errors
  1.2. Regression with ARMA errors
  1.3. Regression with ARIMA errors
  1.4. Regression with auto.arima

If that implementation succeeds, we can plan to implement methods 2 and 3. Any suggestions?
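
For reference, ways 1 and 2 above differ in where the ARMA dynamics sit. A sketch in the notation of the first linked post, assuming a single regressor $x_t$ and white noise $z_t$ (more regressors only add terms):

```latex
% ARMAX: the lagged terms are lags of the response y_t itself
y_t = \beta x_t + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p}
      - \theta_1 z_{t-1} - \cdots - \theta_q z_{t-q} + z_t

% Regression with ARMA errors: the ARMA structure is on the error term n_t
y_t = \beta x_t + n_t, \qquad
n_t = \phi_1 n_{t-1} + \cdots + \phi_p n_{t-p}
      - \theta_1 z_{t-1} - \cdots - \theta_q z_{t-q} + z_t
```

In the second form $\beta$ keeps its usual regression interpretation (the effect of $x_t$ on $y_t$), whereas in the ARMAX form it is conditional on the lagged values of $y_t$.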

mbaddar2 avatar Dec 21 '15 20:12 mbaddar2

That sounds like a good approach to me

sryza avatar Dec 21 '15 22:12 sryza

#97 #117 are related

mbaddar1 avatar Jan 24 '16 14:01 mbaddar1

Hi all, I am going to implement an ARIMAX model. @sryza, is anyone working on that at the moment? What is the current state of the ARIMAX issues? Thanks so much!

ekote avatar Jul 18 '16 23:07 ekote

Hi @ekote, I don't believe anyone is working on these at the moment, so any implementations would be greatly welcomed! The closest we have right now is RegressionARIMA: https://github.com/sryza/spark-timeseries/blob/master/src/main/scala/com/cloudera/sparkts/models/RegressionARIMA.scala

sryza avatar Jul 19 '16 00:07 sryza

Hello, I also have a question about ARIMA in Spark. I want to compute the residual vector of an ARIMA forecast vector and then use that residual vector as input to the Durbin-Watson test. Is that possible? I want to check whether autocorrelation exists in my ARIMA forecast. Please tell me if this is possible.
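
The Durbin-Watson statistic itself is simple to compute once residuals are available. The following is a minimal standalone sketch in plain Java, not the library's API: the class `DurbinWatson` and its methods are hypothetical names, and it assumes the residuals are observed values minus the model's in-sample fitted values for the same time points.

```java
// Hypothetical helper, independent of spark-timeseries.
final class DurbinWatson {

    /** Residuals as observed minus fitted, element by element. */
    static double[] residuals(double[] observed, double[] fitted) {
        if (observed.length != fitted.length) {
            throw new IllegalArgumentException("observed and fitted must have the same length");
        }
        double[] e = new double[observed.length];
        for (int t = 0; t < e.length; t++) {
            e[t] = observed[t] - fitted[t];
        }
        return e;
    }

    /**
     * Durbin-Watson statistic:
     * sum over t >= 2 of (e_t - e_{t-1})^2, divided by the sum of e_t^2.
     */
    static double statistic(double[] e) {
        if (e.length < 2) {
            throw new IllegalArgumentException("need at least two residuals");
        }
        double numerator = 0.0;
        double denominator = e[0] * e[0];
        for (int t = 1; t < e.length; t++) {
            double diff = e[t] - e[t - 1];
            numerator += diff * diff;
            denominator += e[t] * e[t];
        }
        if (denominator == 0.0) {
            throw new IllegalArgumentException("all residuals are zero");
        }
        return numerator / denominator;
    }
}
```

The statistic lies between 0 and 4. Values near 2 suggest no first-order autocorrelation in the residuals, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.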

devanshi7 avatar Dec 15 '16 10:12 devanshi7

@ekote I am going to implement ARIMAX in Java Spark. ARIMAX.scala is available, but it is not accessible under the com.cloudera.sparkts.models path when I import it in Eclipse. I have already used ARIMA and Holt-Winters from spark-ts. Please share any alternatives or suggestions. Thank you.

devanshi7 avatar Dec 20 '16 06:12 devanshi7

Hi @111992-07. Sorry for the delay. I have only used IntelliJ, and everything works there. ARIMAX is based on ARIMA, so it should work. Please check your imports and try to build the repo with Maven.

ekote avatar Jan 04 '17 14:01 ekote

Hi @ekote, thanks for the reply. I have used the ARIMAX.scala classes directly from my ARIMAX.java; I hope this approach solves the issue.

devanshi7 avatar Jan 16 '17 05:01 devanshi7

@sryza @ekote I have a CSV of input data that contains categorical variables. I want to pass these categorical variables as xreg (exogenous values), and I have followed this process:

  1. Build a dataset from the CSV
  2. Use StringIndexer and OneHotEncoder to turn the categorical variables into dummy variables
  3. Store the encoded values in the dataset
  4. Extract the values as a double[]
  5. Create a breeze.linalg.DenseMatrix xreg from the double[] values
  6. Pass this xreg to ARIMAX.fitModel

But I did not get a proper forecast. So my question is: is this the proper way to generate the xreg (exogenous values) variable or not? I would be very grateful for any suggestions on getting an ARIMAX forecast. Please suggest a useful approach.
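
One thing worth checking in steps 4 and 5 is the memory layout: Breeze's DenseMatrix stores its backing array in column-major order, so a double[] filled row by row can end up with the regressors scrambled if it is handed to a constructor that expects column-major data. The sketch below is a plain-Java illustration of the encoding and flattening, not the Spark ML pipeline from the list. `XregBuilder`, `oneHotDropFirst` and `toColumnMajor` are hypothetical names, and dropping the first level is an assumption made here to avoid perfectly collinear dummy columns.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, independent of Spark ML and Breeze.
final class XregBuilder {

    /**
     * One-hot encodes a categorical column. Levels are numbered in order of
     * first appearance; the first level is dropped and becomes the baseline
     * (an all-zero row).
     */
    static double[][] oneHotDropFirst(String[] values) {
        Map<String, Integer> levels = new LinkedHashMap<>();
        for (String v : values) {
            levels.putIfAbsent(v, levels.size());
        }
        int cols = Math.max(levels.size() - 1, 0);
        double[][] encoded = new double[values.length][cols];
        for (int i = 0; i < values.length; i++) {
            int level = levels.get(values[i]);
            if (level > 0) {
                encoded[i][level - 1] = 1.0;
            }
        }
        return encoded;
    }

    /** Flattens a row-indexed matrix into a column-major array. */
    static double[] toColumnMajor(double[][] rows) {
        int n = rows.length;
        int k = n == 0 ? 0 : rows[0].length;
        double[] data = new double[n * k];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < k; j++) {
                data[j * n + i] = rows[i][j];
            }
        }
        return data;
    }
}
```

With `n` observations and `k` dummy columns, column `j` occupies indices `j * n` through `j * n + n - 1` of the flattened array, which is the order a column-major matrix with `n` rows and `k` columns reads it back in.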

devanshi7 avatar Jan 24 '17 10:01 devanshi7

@111992-07, when you pass this xreg to ARIMAX.fitModel, what do you base your choice of p, d, q and the other parameters on?

ekote avatar Feb 02 '17 12:02 ekote

Hi @ekote, thanks for the reply.

This is the line that fits the ARIMAX model: ARIMAXModel model = ARIMAX.fitModel(0, 1, 1, tsvector, xreg, 1, false, false, userInitParams);

I tried all the different combinations of p, d, q (here 0, 1, 1), includeOriginalXreg (here false) and includeIntercept (here false); the lag stayed the same for all of them (here lag = 1).

The main code is:

	// Series to forecast
	Vector tsvector = Vectors.dense(values);
	System.out.println("Ts vector:" + tsvector.toString());

	// User-supplied initial parameters, wrapped in a Scala Option via DenseVector.unapply
	double[] val = {1.0,1.0,1.0,1.0};
	org.apache.spark.mllib.linalg.DenseVector time = new org.apache.spark.mllib.linalg.DenseVector(val);
	Option<double[]> userInitParams = org.apache.spark.mllib.linalg.DenseVector.unapply(time);

	ARIMAXModel model = ARIMAX.fitModel(1, 0, 1, tsvector, xreg, 1, false, false, userInitParams);

	// Print the fitted coefficients and the model orders
	double[] coefficients = model.coefficients();
	for (double d : coefficients) {
		System.out.println(d+",");
	}
	System.out.println("coefficients  "+coefficients.length);
	System.out.println("model.xregMaxLag() : "+model.xregMaxLag() );
	System.out.println("p : "+model.p());
	System.out.println("d : "+model.d());
	System.out.println("q : "+model.q());

	// Breeze vector of the series. xreg is the dense matrix of exogenous variables;
	// in my input, product category and state (both categorical) are used to generate it.
	DenseVector<Object> timeSeries = new DenseVector<Object>(tsvector.toArray());
	DenseVector<Object> forecast = model.forecast(timeSeries, xreg);

	System.out.println("Forecast:" + forecast);

Now I am stuck on the parameters, because the forecast that is generated depends on them.

Waiting for your guidance

Here is my java code, JavaARIMAX.txt

Input file Input_ARIMAX sample data 1k.xlsx

devanshi7 avatar Feb 06 '17 08:02 devanshi7

Has ARIMAX been added to the Spark libraries? Please, please!

elexira avatar Mar 15 '17 15:03 elexira

@Sadrpour, ARIMAX exists as part of the spark-timeseries library.

ekote avatar Mar 17 '17 14:03 ekote

@111992-07 as far as I can see, you call ARIMAX.fitModel(1, 0, 1, tsvector, xreg, 1, false, false, userInitParams); so the d value (the "I", Integrated, order) is equal to 0. ARIMAX assumes that p, d, q are >= 1.
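
For readers unfamiliar with the d parameter: it is the number of times the series is differenced before the ARMA part is fit, so d = 0 models the series as-is. A minimal standalone sketch of that step follows; the class and method names are illustrative and are not part of spark-timeseries.

```java
// Hypothetical helper, independent of spark-timeseries.
final class Differencing {

    /**
     * Applies first differences d times. Each pass replaces the series with
     * x[t] - x[t-1], so the result is d elements shorter than the input.
     */
    static double[] difference(double[] series, int d) {
        if (d < 0) {
            throw new IllegalArgumentException("d must be >= 0");
        }
        double[] current = series.clone();
        for (int order = 0; order < d; order++) {
            if (current.length < 2) {
                throw new IllegalArgumentException("series too short for the requested differencing");
            }
            double[] next = new double[current.length - 1];
            for (int t = 1; t < current.length; t++) {
                next[t - 1] = current[t] - current[t - 1];
            }
            current = next;
        }
        return current;
    }
}
```

For example, differencing the series 1, 3, 6, 10 once gives 2, 3, 4, and differencing it twice gives 1, 1.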

ekote avatar Mar 17 '17 14:03 ekote

@111992-07, I have just looked at your input file. Why have you used ARIMAX if you don't have any exogenous variables in your data?

ekote avatar Mar 17 '17 14:03 ekote

Hi all,

I have a question. I am using the ARIMAX.scala file to build a model. After the model is built, I use the forecast function to get forecasts. We pass the time series vector and xreg to the forecast function as arguments, and the number of forecasts we get back equals the number of values in the time series vector.

For example, if I have a time series vector with 1 year of sales data (365 data points) and I want 10 forecasted sales values, how can I achieve that with the forecast function in ARIMAX.scala?

Jalpa08 avatar Oct 11 '17 12:10 Jalpa08