tetrad
Make test/score that will work for algebraically defined nonlinear models.
Loading the attached csv throws the following exception:
Infer demiliter for file: 20_nodes_normal.csv
Exception in thread "AWT-EventQueue-0" java.lang.NoSuchMethodError: java.nio.ByteBuffer.clear()Ljava/nio/ByteBuffer;
at edu.pitt.dbmi.data.reader.util.TextFileUtils.inferDelimiter(TextFileUtils.java:135)
at edu.cmu.tetradapp.editor.LoadDataSettings.getInferredDelimiter(LoadDataSettings.java:882)
at edu.cmu.tetradapp.editor.LoadDataSettings.basicSettings(LoadDataSettings.java:503)
at edu.cmu.tetradapp.editor.LoadDataDialog.showDataLoaderDialog(LoadDataDialog.java:165)
at edu.cmu.tetradapp.editor.LoadDataAction.actionPerformed(LoadDataAction.java:91)
at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2022)
at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2348)
at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
at javax.swing.AbstractButton.doClick(AbstractButton.java:376)
at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:842)
at javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(BasicMenuItemUI.java:886)
at java.awt.Component.processMouseEvent(Component.java:6539)
at javax.swing.JComponent.processMouseEvent(JComponent.java:3324)
at java.awt.Component.processEvent(Component.java:6304)
at java.awt.Container.processEvent(Container.java:2239)
at java.awt.Component.dispatchEventImpl(Component.java:4889)
at java.awt.Container.dispatchEventImpl(Container.java:2297)
at java.awt.Component.dispatchEvent(Component.java:4711)
at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4904)
at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4535)
at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4476)
at java.awt.Container.dispatchEventImpl(Container.java:2283)
at java.awt.Window.dispatchEventImpl(Window.java:2746)
at java.awt.Component.dispatchEvent(Component.java:4711)
at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:760)
at java.awt.EventQueue.access$500(EventQueue.java:97)
at java.awt.EventQueue$3.run(EventQueue.java:709)
at java.awt.EventQueue$3.run(EventQueue.java:703)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:84)
at java.awt.EventQueue$4.run(EventQueue.java:733)
at java.awt.EventQueue$4.run(EventQueue.java:731)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:730)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
[20_nodes_normal.csv](https://github.com/cmu-phil/tetrad/files/12176485/20_nodes_normal.csv)
Actually your file didn't come through; you may need to zip it before attaching it (I've found)...
One second I found your link...
Ah. It's not a covariance matrix. You can load it as tabular data--see the picture I took.
Hold on, sorry, you didn't actually say it was a covariance matrix. But huh, it loads for me..... can you tell me more about how you're trying to load it?
@uvnikgupta What version of Java are you using?
Java version:
openjdk version "1.8.0_332"
OpenJDK Runtime Environment (Temurin)(build 1.8.0_332-b09)
OpenJDK 64-Bit Server VM (Temurin)(build 25.332-b09, mixed mode)
I am launching the jar using :
java -Xmx2G -jar tetrad-gui-7.4.0-launch.jar
Thanks for the update. Sorry, I was multitasking yesterday. This is a bug we know about (thanks @kvb2univpitt). The issue (if you want to know) is that Oracle changed the implementation of the ByteBuffer class so that it's incompatible between version 1.8 and versions > 1.8. It's this bug:
https://www.morling.dev/blog/bytebuffer-and-the-dreaded-nosuchmethoderror/
except in your case it's the clear() method that's the problem and not the position() method. You're using OpenJDK 1.8, I'm guessing on a Linux box? (Actually can you confirm that?) What I'll do (sorry just trying different things here) is the casting they suggest in the article to see if it will work in OpenJDK1.8 for me. (It needs to work both for 1.8 and for > 1.8 unfortunately, which is the issue.) Unfortunately I'm on a Mac at the moment and the only JDK 1.8 I can get anymore is Amazon's, and it's not a problem there. When I get back home today I'll try installing OpenJDK 1.8 on my Windows laptop (I think I can still do that, though I can no longer get it from M$) and test it there. But really what I need to do is test it on Linux, using OpenJDK 1.8, and I don't have a Linux box currently.
If I made you a version (or maybe two versions) to test, would you be willing to try them out on your machine? That would help a lot.
@jdramsey, Thanks a lot for explaining the issue. I am using Windows 10. Yes, I am OK to try the test versions.
maybe you could get the open jdk 8 from here : https://www.openlogic.com/openjdk-downloads?field_java_parent_version_target_id=416&field_operating_system_target_id=436&field_architecture_target_id=391&field_java_package_target_id=396
Awesome--Let me grab the Mac version now and test it, and then I can download the Windows one later and test it there. Fingers crossed! We (well @kvb2univpitt) were thinking of rewriting that section of code without using ByteBuffer, but hopefully this fixes it without that effort.
Actually they're not providing any Mac options--it's in their selector but you only get Windows options in the list. I'm at the office right now but can do this later when I get home; my Windows laptop is there.
I just tested it using Amazon's Corretto 1.8 on Mac and it works there, though I suspect Amazon may have gone in and fixed the issue internally.
Oh hold on, they did have it! It's just that their dropdown was broken; I had to select "all" and then the Mac options showed up. I tested it--it works! That gives me some confidence that it will work on Windows as well using a Windows 1.8 download from this site, but I can test it later.
The problem goes away if you use Java 11 and above.
@kvb2univpitt I am motivated to figure it out because we have users who are not in a position to grab a newer version of Java. I may have figured it out though--I'll let you know! I'm going to test it now on Windows.
I am one of those in that group :)
@jdramsey We definitely need to get rid of the ByteBuffer. By "we" I mean "me".
@uvnikgupta @kvb2univpitt Could you both try to break this version? I.e., launch it, try to load a dataset...
https://s01.oss.sonatype.org/content/repositories/snapshots/io/github/cmu-phil/tetrad-gui/7.4.0-SNAPSHOT/tetrad-gui-7.4.0-20230728.001143-5-launch.jar
If it works I will tell you what I did.
Sure. On it :)
Tried different datasets and it seems to work pretty fine now 👍 Thanks for the quick fix
Tried a few more and data loading + Search works flawlessly. The only issue now is that the resulting graph is nowhere close to the actual graph :( I guess that is the state of the existing discovery algorithms due to the nature of the problem.
I'm very curious what experience Kevin has. I compiled this under Corretto 1.8 and have no trouble running under 1.8 or 11 on my Mac, so if you have no trouble on Windows, I'll try under 11 under Windows.
Not sure what to say about the content. Maybe if you tell me the general nature of the problem and what you've tried I could comment?
I am loading the data and connecting it to the search box, then executing search using different algorithms, and finally comparing the result with the actual DAG. The data and the actual DAG are attached for your reference.
20_nodes_normal.csv
BTW, I encountered a Null pointer issue when I tried to use the "Regression"
Are these Gaussian variables? With what sample size?
I'm not able to attach my data generator .py file, so below are the formulae:
"A1": "0.0",
"A2": "0.0",
"A3": "0.0",
"A4": "0.0",
"A5": "0.0",
"A6": "0.0",
"A7": "0.0",
"A8": "0.0",
"B1": 'data_2["A1"]**2',
"B2": 'data_2["A1"]',
"C2": 'np.sqrt(np.abs(data_2["B1"]))',
"C3": 'data_2["B1"] * data_2["B2"]',
"D2": 'data_2["C2"]**2 + data_2["C3"] - data_2["A2"]**2',
"C4": 'data_2["B2"]**3',
"D3": 'np.sqrt(np.abs(data_2["C4"]))',
"B3": 'data_2["A4"]**2 + data_2["A5"]',
"C1": 'data_2["B3"]**2',
"D1": 'np.round(np.mod(1000*data_2["C1"], 10), 3)',
"E1": 'np.abs(data_2["A3"])**2/(data_2["D1"] + .001)',
"F1": '2*data_2["D2"] + data_2["D3"] - data_2["E1"]*data_2["A6"] + 8*data_2["A7"]/data_2["A8"]'
I add np.random.normal(loc=5, scale=1, size=self.size) to each of the variables above
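For anyone following along without the script, here is a minimal standard-library sketch of how a few of these equations could be sampled. It is not the actual DataGenerator code: `generate_rows` and its parameters are hypothetical names, and it assumes the N(5, 1) noise is added to each variable before its children are computed, which the thread does not confirm.

```python
import random


def generate_rows(size, seed=0):
    """Sample a handful of the variables above, one dict per row.

    Assumption (not confirmed in the thread): each variable is its formula
    plus N(5, 1) noise, and children are computed from the noisy value.
    """
    rng = random.Random(seed)

    def noise():
        # Mirrors np.random.normal(loc=5, scale=1) for a single draw.
        return rng.gauss(5, 1)

    rows = []
    for _ in range(size):
        row = {}
        row["A1"] = 0.0 + noise()                    # exogenous: constant 0.0 plus noise
        row["B1"] = row["A1"] ** 2 + noise()         # quadratic in A1
        row["B2"] = row["A1"] + noise()              # linear in A1
        row["C3"] = row["B1"] * row["B2"] + noise()  # product of two parents
        row["C4"] = row["B2"] ** 3 + noise()         # cubic in B2
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in generate_rows(3, seed=1):
        print({name: round(value, 3) for name, value in row.items()})
```

Only five of the twenty variables are shown; the others would follow the same pattern under the same assumption.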
They are not terribly Gaussian. By the way @uvnikgupta if you'd like to switch to email I'm happy. @cg09 if you load up the data that was sent in the version of Tetrad given above and use the Plot Matrix tool you can see the distributions of the variables.
yes, I can share my data generation python code then. Please DM me at
That's what I thought--nonlinear algebraic functions generated them...You know we were just thinking of how to incorporate this sort of nonlinear additivity into a fast score...
What sort of "non-linear algebraic" functions?
The formula simplifies to: 2*D2 + D3 - E1*A6 + 8*A7/A8
I hope you received the python scripts I shared. You can generate data of any size with that script. Just modify the size parameter in the instantiations of the DataGenerator class under if __name__ == "__main__", create a data folder and run "python data_generator.py". Of course you have to pip install pandas, numpy and scipy.
On Fri, Jul 28, 2023 at 12:31 AM Joseph Ramsey @.***> wrote:
b1 = a1^2
==> ln(b1) = 2 * ln(a1)
b2 = a1
Singularity, you'll need to remove one of the two columns or teach your algorithm to deal with it. But you can't use regression here in any form. (This is why the regression check is failing, above, BTW).
c2 = sqrt(abs(b1))
==> Hmmm... you need to check a symmetric function here of b1 to find the dependency.
c3 = b1 * b2
==>ln(c3) = ln(b1) + ln(b2)
d2 = c2^2 + c3 - a2^2
==> Logging won't help here for the entire function! But logging c2 and logging b2 separately would help if you knew to do that! Hmmm...
c4 = b2^3
==> ln(c4) = 3 * ln(b2)...no problem.
d3 = sqrt(|c4|)
==> Another symmetric function.
b3 = a4^2 + a5
==> Logging a4 separately would have helped.
c1 = b3^2
==> Logging solves this.
"D1": 'np.round(np.mod(1000*data_2["C1"], 10), 3)',
Not sure how to describe this one in words yet, I'll come back to it.
==> NO HELP HERE! You need to resort to a generalized score I think!!! Ugh, slow!!!
"E1": 'np.abs(data_2["A3"])**2/(data_2["D1"] + .001)',
e1 = abs(a3)^2 / (d1 + 0.001)
==> Heuristically I would still log this :-) ln(e1) = 2 * ln(abs(a3)) - ln(d1 + 0.001)
"F1": '2data_2["D2"] + data_2["D3"] - data_2["E1"]data_2["A6"] + 8data_2["A7"]/data_2["A8"]'
2 * d2 + d3 - .... what is that? e1 a6?? + 8 a7 / a8? I have to check what concatenating variables in Python does... string concatenation?????!
==> I still have no idea what this even means yet, lol!!! :-D
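To make two of the points above concrete, here is a small standard-library sketch on toy, noise-free inputs (not the thread's data, where the added N(5, 1) noise shifts everything positive). It shows that logging turns a power relation like c4 = b2^3 into a straight line with slope 3, and that a symmetric function like c2 = sqrt(|b1|) is nearly uncorrelated with b1 when b1 is centered at zero but strongly correlated with |b1|. The helper names `pearson` and `ols_slope` are illustrative only.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


rng = random.Random(0)

# Power relation: c4 = b2^3, so ln(c4) = 3 * ln(b2).
# Toy positive inputs so the logarithm is defined.
b2 = [rng.uniform(0.5, 5.0) for _ in range(1000)]
c4 = [b ** 3 for b in b2]
slope = ols_slope([math.log(b) for b in b2], [math.log(c) for c in c4])

# Symmetric function: c2 = sqrt(|b1|) with b1 centered at zero (toy assumption).
b1 = [rng.gauss(0, 1) for _ in range(5000)]
c2 = [math.sqrt(abs(b)) for b in b1]
r_raw = pearson(b1, c2)                    # linear dependence on b1 itself
r_abs = pearson([abs(b) for b in b1], c2)  # dependence on the symmetric transform

if __name__ == "__main__":
    print("slope of ln(c4) on ln(b2):", slope)
    print("corr(b1, c2):", r_raw)
    print("corr(|b1|, c2):", r_abs)
```

A linear test on the raw columns would miss the second dependency entirely in this centered toy case, which is the reason for checking a symmetric function of b1.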
Sorry I haven't gotten back to you--we're all at the UAI conference here in Pittsburgh. I thought about the 1.8 issue and think the thing to do is to publish a separate version compiled under 1.8. I'm going to try to get this done today.
Yes, I was starting to wonder :) The agenda for the UAI conference sounds really cool. I have never attended any of its conferences but I can imagine the energy in that environment. I hope I am able to attend some day.
Coming back to the topic, I already have your working version for 1.8 so I am not really waiting for an official release. I am now more interested in figuring out why the algorithms are not performing well and how to tweak the data or the algorithm parameters to reproduce most of the DAG, if not fully.
Regards Uvnik