TS Algos not working from Command Line

Open HarshitaMeena opened this issue 6 years ago • 19 comments

I ran the following command from the terminal:

        java -jar causal-cmd-0.3.2-jar-with-dependencies.jar --algorithm ts-imgs --data-type discrete --dataset data1.txt_1.txt,data1.txt_2.txt --delimiter tab --score BDeu

The execution stops and gives the following trace:

        Running version 0.3.2 but unable to contact latest version server. To disable checking use the skip-latest option.
        $$$$$ Entering returnSimilarPairs method with x,y = X2:1, X2
        Exception in thread "main" java.lang.IndexOutOfBoundsException: Index -1 out-of-bounds for length 0
            at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
            at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
            at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
            at java.base/java.util.Objects.checkIndex(Objects.java:372)
            at java.base/java.util.ArrayList.get(ArrayList.java:440)
            at edu.cmu.tetrad.data.Knowledge2.getTier(Knowledge2.java:351)
            at edu.cmu.tetrad.search.TsFges2.returnSimilarPairs(TsFges2.java:1977)
            at edu.cmu.tetrad.search.TsFges2.addSimilarEdges(TsFges2.java:2061)
            at edu.cmu.tetrad.search.TsFges2.insert(TsFges2.java:1464)
            at edu.cmu.tetrad.search.TsFges2.fes(TsFges2.java:883)
            at edu.cmu.tetrad.search.TsFges2.search(TsFges2.java:241)
            at edu.cmu.tetrad.search.TsGFci.search(TsGFci.java:160)
            at edu.cmu.tetrad.algcomparison.algorithm.oracle.pag.TsImages.search(TsImages.java:162)
            at edu.pitt.dbmi.causal.cmd.tetrad.TetradAlgorithmRunner.runAlgorithm(TetradAlgorithmRunner.java:100)
            at edu.pitt.dbmi.causal.cmd.tetrad.TetradRunner.runTetrad(TetradRunner.java:67)
            at edu.pitt.dbmi.causal.cmd.CausalCmdApplication.main(CausalCmdApplication.java:83)

I am getting the same alert box (Index -1 out-of-bounds for length 0) when running with the UI version.

data1.txt_2.txt

HarshitaMeena avatar Jul 04 '18 14:07 HarshitaMeena

data1.txt_1.txt Attaching all the files

HarshitaMeena avatar Jul 04 '18 14:07 HarshitaMeena

The TS algorithms require the appropriate Knowledge files to run -- that error is coming from the fact that it cannot find the knowledge file. I haven't used the command line functionality in a while... Joe, maybe you can advise on how to incorporate the knowledge file on the command line?

To see the knowledge file format, you can run things in the GUI and save the knowledge file from the knowledge box. Since you will probably be using the same knowledge structure over and over again, you can probably just reuse this file.

dmalinsk avatar Jul 17 '18 17:07 dmalinsk

On the command line, the knowledge file is passed using the flag

--knowledge

The format of the file is as follows. Write it in a text file and save it with the extension .prior

/knowledge

addtemporal
1 X1 X2 X3
2 X4 X5
3 X6 X7
4* X8

forbiddirect
x3 x4

requiredirect
x1 x2

For example --knowledge myknowledge.prior

The first line of the prior knowledge file must say /knowledge. A prior knowledge file consists of three sections:

  • addtemporal - tiers of variables, where earlier tiers precede later ones. Adding an asterisk (*) next to the tier id prohibits edges between variables within that tier
  • forbiddirect - forbidden edges, given as a list of pairs of variables
  • requiredirect - required edges, given as a list of pairs of variables
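
For time-lagged data, the tiers can be generated rather than typed by hand. The following is only a sketch: the class KnowledgeFileSketch and its methods are hypothetical helpers, not part of Tetrad or causal-cmd. It assumes that each tier goes on its own line, that lagged variables carry a ":lag" suffix (as in the X2:1 name in the stack trace above), and that older lags belong in earlier tiers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tetrad or causal-cmd.
public class KnowledgeFileSketch {

    /** Name of a variable at a given lag: "X1" for lag 0, "X1:1" for lag 1 (assumed convention). */
    public static String laggedName(String variable, int lag) {
        return lag == 0 ? variable : variable + ":" + lag;
    }

    /** Builds knowledge-file text with one tier per lag, oldest lag in tier 1. */
    public static String build(List<String> variables, int numLags) {
        StringBuilder sb = new StringBuilder();
        sb.append("/knowledge\n\n");
        sb.append("addtemporal\n");
        int tier = 1;
        for (int lag = numLags; lag >= 0; lag--) {
            List<String> names = new ArrayList<>();
            for (String v : variables) {
                names.add(laggedName(v, lag));
            }
            sb.append(tier++).append(' ').append(String.join(" ", names)).append('\n');
        }
        // Empty sections: no forbidden or required edges in this sketch.
        sb.append("\nforbiddirect\n\nrequiredirect\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two variables, one lag: tier 1 holds X1:1 X2:1, tier 2 holds X1 X2.
        System.out.print(build(List.of("X1", "X2"), 1));
    }
}
```

The output could be saved as, say, myknowledge.prior and passed with --knowledge. Whether a given algorithm accepts it still depends on the data containing columns with exactly those names.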

rubencmu avatar Jul 17 '18 17:07 rubencmu

I tried including the knowledge file. Do let me know if I am doing something wrong here. I ran the command:

        java -jar causal-cmd-0.3.2-jar-with-dependencies.jar --algorithm ts-imgs --data-type discrete --dataset data1.txt_1.txt,data1.txt_2.txt --delimiter tab --score BDeu --knowledge knowledge.prior

My knowledge file looks like this:

/knowledge

addtemporal
1 X1 X2
2 X1:1 X2:1

forbiddirect

requiredirect

I am still getting this error:

        Running version 0.3.2 but unable to contact latest version server. To disable checking use the skip-latest option.
        $$$$$ Entering returnSimilarPairs method with x,y = X2:1, X2
        Exception in thread "main" java.lang.IndexOutOfBoundsException: Index -1 out-of-bounds for length 0
            at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
            at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
            at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
            at java.base/java.util.Objects.checkIndex(Objects.java:372)
            at java.base/java.util.ArrayList.get(ArrayList.java:440)
            at edu.cmu.tetrad.data.Knowledge2.getTier(Knowledge2.java:351)
            at edu.cmu.tetrad.search.TsFges2.returnSimilarPairs(TsFges2.java:1977)
            at edu.cmu.tetrad.search.TsFges2.addSimilarEdges(TsFges2.java:2061)
            at edu.cmu.tetrad.search.TsFges2.insert(TsFges2.java:1464)
            at edu.cmu.tetrad.search.TsFges2.fes(TsFges2.java:883)
            at edu.cmu.tetrad.search.TsFges2.search(TsFges2.java:241)
            at edu.cmu.tetrad.search.TsGFci.search(TsGFci.java:160)
            at edu.cmu.tetrad.algcomparison.algorithm.oracle.pag.TsImages.search(TsImages.java:162)
            at edu.pitt.dbmi.causal.cmd.tetrad.TetradAlgorithmRunner.runAlgorithm(TetradAlgorithmRunner.java:100)
            at edu.pitt.dbmi.causal.cmd.tetrad.TetradRunner.runTetrad(TetradRunner.java:67)
            at edu.pitt.dbmi.causal.cmd.CausalCmdApplication.main(CausalCmdApplication.java:83)

HarshitaMeena avatar Jul 17 '18 20:07 HarshitaMeena

Sorry, the knowledge file I actually used is this:

/knowledge

addtemporal
1 X1:1 X2:1
2 X1 X2

forbiddirect

requiredirect

HarshitaMeena avatar Jul 18 '18 15:07 HarshitaMeena

Unfortunately, due to our misunderstanding of the interface requirements for the ts* algorithms, there is only one way to get these to work: through Tetrad. The workflow is to load the data (data box), apply the data manipulation called Convert Time Lag Data (data box), and then apply the search (search box).

Please see #876 for an example.

espinoj avatar Aug 07 '18 16:08 espinoj

Thanks a lot, Jeremy. I applied Convert Time Lag Data to my data and it worked. It seems to pick up the knowledge implicitly when I do that.

HarshitaMeena avatar Aug 07 '18 17:08 HarshitaMeena

Just a note: This just came up for me again in a different context, this time for somebody analyzing climate data. They had prepared a time-lagged dataset with knowledge and wanted to apply TsImages, but it wouldn't work. Not sure why. I'll look at it.

jdramsey avatar Aug 24 '18 17:08 jdramsey

It was fixed in this branch or the #901 pull request.

chirayukong avatar Aug 30 '18 19:08 chirayukong

@kvb2univpitt This is another issue that could be handled in one of two ways:

  1. Add a parameter to all algorithms that could be used for time series to generate time lag data first before proceeding (my issue), or
  2. Add a flag to the command line to generate time lag data before running the algorithm (your issue).

Personally I think (2) is much better, but which way do you think is best?

jdramsey avatar Mar 25 '22 22:03 jdramsey

I'm more and more thinking that there should be a flag in causal-cmd for this, that makes causal-cmd extract the time lag data for the dataset before passing the data to the algorithm, something like --timelag 5, for a time lag of 5. The implementation should be straightforward, I think.

jdramsey avatar Mar 26 '22 03:03 jdramsey

@jdramsey I haven't worked much with time series data, so I have to look back into it. Yes, I agree that causal-cmd should provide a way to generate time lag data. Is there a function in Tetrad-lib to do that? If so, I can have causal-cmd execute that function whenever the --timelag parameter is passed on the command line.

kvb2univpitt avatar Mar 28 '22 16:03 kvb2univpitt

@kvb2univpitt Yes, there is a library function in Tetrad to do it, and it's very easy. I think this will be a very easy fix. Let me find it just now...

Here is the script (in edu.cmu.tetradapp.model.datamanip.TimeSeriesWrapper):

        DataSet dataSet = (DataSet) dataModel;
        DataSet timeSeries = TimeSeriesUtils.createLagData(dataSet, params.getInt("numTimeLags", 1));
        if (dataSet.getName() != null) {
            timeSeries.setName(dataSet.getName());
        }
        knowledge = timeSeries.getKnowledge();

You need a number of time lags, so you'd need a flag for that, and you'd need a flag to convert the time series data into lagged data, but then you just need to grab the time lagged dataset and knowledge using the above script and pass those to the algorithm. Very easy I think.
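
To make "time lagged dataset" concrete, here is a plain-Java illustration with no Tetrad dependency. It is not the implementation of TimeSeriesUtils.createLagData. The class LagDataSketch is hypothetical, and the column order (current values first, then lag 1, and so on) and the dropping of the first numLags rows are assumptions made for the example. The "X1:1" naming follows the variable names that appear earlier in this thread.

```java
import java.util.Arrays;

// Hypothetical illustration only; does not reproduce Tetrad's behavior exactly.
public class LagDataSketch {

    /**
     * Rows are time points, columns are variables. Returns a matrix with
     * (rows - numLags) rows and columns * (numLags + 1) columns, where column
     * block k holds the values observed k time steps earlier.
     */
    public static double[][] lag(double[][] data, int numLags) {
        if (numLags < 0 || numLags >= data.length) {
            throw new IllegalArgumentException("numLags must be in [0, rows)");
        }
        int rows = data.length;
        int cols = data[0].length;
        double[][] out = new double[rows - numLags][cols * (numLags + 1)];
        for (int t = numLags; t < rows; t++) {
            for (int k = 0; k <= numLags; k++) {
                for (int j = 0; j < cols; j++) {
                    out[t - numLags][k * cols + j] = data[t - k][j];
                }
            }
        }
        return out;
    }

    /** Column names matching lag(): "X1", "X2", then "X1:1", "X2:1", and so on. */
    public static String[] names(String[] variables, int numLags) {
        String[] out = new String[variables.length * (numLags + 1)];
        for (int k = 0; k <= numLags; k++) {
            for (int j = 0; j < variables.length; j++) {
                out[k * variables.length + j] = k == 0 ? variables[j] : variables[j] + ":" + k;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 10}, {2, 20}, {3, 30}};
        System.out.println(Arrays.toString(names(new String[] {"X1", "X2"}, 1)));
        for (double[] row : lag(data, 1)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each output row pairs the current observation with the one before it, which is why a tiered knowledge structure (lagged variables before current ones) falls out naturally from the lagged column names.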

jdramsey avatar Mar 29 '22 04:03 jdramsey

Also it's well-tested and works fine.

jdramsey avatar Mar 29 '22 04:03 jdramsey

So I completely agree with your plan.

jdramsey avatar Mar 29 '22 04:03 jdramsey

Thanks, Joe, for the code snippet. It's been a while since I touched time series.

kvb2univpitt avatar Mar 30 '22 19:03 kvb2univpitt

@kvb2univpitt HOLD ON...before you do this let me try to do it on the algorithm side by just adding a parameter to the relevant algorithms. I think it will just show up as a parameter for these algorithms, right? I'll add the parameter:

--timelags 3

say, for 3 time lags. The default will be

--timelags 0

which will do the search in the usual way, without time lag knowledge.

Let me try this first. It's more elegant.

jdramsey avatar Apr 05 '22 04:04 jdramsey

I am going to sleep!

cg09 avatar Apr 05 '22 05:04 cg09

OK, I think I've added this parameter to all of the relevant algorithms. I believe that once this is pushed to development, the causal-cmd problem with time series should be solved. You'll just need to set the timeLag parameter to the number of time lags you want.

jdramsey avatar Apr 06 '22 07:04 jdramsey