Oscar.jl
Minimal associated primes
An attempt to make the specialized functionality in Singular for zero dimensional ideals available.
This seemed to be useful for @simonbrandhorst in some examples, but now I can't even get the tests to terminate. Let's see what the CI says.
@wdecker : The documentation reads as if only QQ was allowed as a coefficient ring. Do you remember whether this is the case? Because it seems to have run also over number fields.
@ederc : Could you have a look at the tests? I don't understand what's going wrong. But maybe minimal_primes now called singular_generators before, and that led to some false caching?
Thanks for wrapping this @HechtiDerLachs !
Well, the tests fail because you try to apply assPrimes from Singular on a zero dimensional ideal over a finite field, but assPrimes assumes that the ideal lives over QQ. AFAIK there is a modular std call behind the scenes in assPrimes.
The problem I meant occurred in an earlier test run. Some call to dim was complaining that singular_groebner_generators did not return a Groebner basis. But it doesn't seem to be reproducible anymore; sorry for calling you.
But what you say about minAss is good to know. I will restrict the cases where it's called.
@ederc : Now the error is back, see this test.
The isGB flag was not cached when a GB was computed during small_generating_set. I pushed a fix to your branch.
is there anything holding this PR up @HechtiDerLachs ?
The tests were still failing. I just had a look and it seems there is another bug in minimal_primes (due to me, unfortunately). I will have a look.
@ederc : I'm sorry, but it looks like I accidentally overwrote your fixes to this branch when doing a rebase. Do you still have them somewhere? If yes, could you push them here again? Thx!
I do not have this code anymore, we need to look again where the isGB flag needs to be set.
The GitHub UI shows "HechtiDerLachs force-pushed the minimal_associated_primes branch from 9aa5177 to 2cd696e". Clicking on that 9aa5177 commit should get you to the last commit before the force push, and the history is also available from there. With that commit hash known you can also try doing git log 9aa5177 locally, since git will keep lost commits for a few days.
You could also look through git reflog locally.
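The recovery route described above can be tried out safely in a throwaway repository. The following is only a sketch: the file name, commit messages, and identity are made up for illustration, and a `git reset --hard` stands in for the rebase and force push that dropped the commit. It shows that a commit that is no longer on any branch can still be inspected by hash, found through the reflog, and brought back with `git cherry-pick`, as long as git has not yet garbage-collected it.

```shell
#!/bin/sh
set -eu

# Throwaway repository, so the sketch never touches real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"               # hypothetical identity
git config user.email "example@example.org"
git config commit.gpgsign false

echo "base" > file.txt
git add file.txt
git commit -q -m "base"

echo "fix" >> file.txt
git commit -q -am "fix: cache the flag"
lost=$(git rev-parse HEAD)       # hash of the commit we are about to lose

# Stand-in for the rebase/force push: drop the commit from the branch.
git reset -q --hard HEAD~1

# The commit is gone from the branch, but the object is still in the repository.
git reflog -n 5                  # where HEAD pointed recently
git log --oneline -n 1 "$lost"   # inspect the lost commit by its hash

# Re-apply the lost change on top of the current branch.
git cherry-pick "$lost" >/dev/null
recovered=$(tail -n 1 file.txt)
echo "last line after recovery: $recovered"
```

In the real situation the hash would come from the force-push notice in the pull request (here 9aa5177) instead of from `git rev-parse`; if the commit only ever existed on the remote, it may first have to be fetched by hash.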
@HechtiDerLachs Caching isGB is back in.
Thanks a lot @ederc and @benlorenz ! I was hoping that something like this was possible, but didn't know how.
Unfortunately it seems that a lot of tests time out. Or something else goes wrong which I do not fully understand, yet.
Edit: I checked two of the failing tests locally around the point where they were cancelled and they run just fine on my machine.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 84.59%. Comparing base (4d1d833) to head (aa41ef1).
Additional details and impacted files
@@ Coverage Diff @@
## master #3705 +/- ##
=======================================
Coverage 84.59% 84.59%
=======================================
Files 631 631
Lines 84829 84842 +13
=======================================
+ Hits 71758 71769 +11
- Misses 13071 13073 +2
Files with missing lines | Coverage Δ | |
---|---|---|
src/Rings/mpoly-ideals.jl | 94.19% <100.00%> (+0.13%) | :arrow_up: |
I really can't make sense of the failing tests. Everything that I tried on my local machine about things where the CI gets stuck really goes through for me. And in some cases I can't even find the error messages.
Triage talked about it yesterday and we suspect that it's the same problem as in #3905 i.e. Singular tries to start subprocesses and gets stuck. So hopefully @hannes14 can take a look when he's back from vacation.
It's still under Hans' aegis:
- it is fixed (in Singular)
- it needs the release cascade
- (maybe)
@hannes14 updated Singular_jll and just told me that Singular.jl should automatically use that. So I've now restarted the tests, let's see if it helps.
I looked into the errors / warnings a bit, so far I have found the following:
│ Can't call method "instance_for_owner" on an undefined value at /Users/aaruni/Desktop/oscar-runners/runner-4/julia-depot/scratchspaces/d720cf60-89b5-51f5-aff5-213f193123e7/polymake_15999920327334887436_1.10_depstree_v2/share/polymake/perllib/Polymake/Core/BigObject.pm line 775 during global destruction.
This is triggered from an atexit handler from the perl code which tries to clean up polymake objects. Not totally sure why this error doesn't appear during normal process exit, maybe because some other finalizers need to run first but these are not triggered correctly for these processes.
polymake: WARNING: Automatic update of the interface definition file /home/runner/.julia/scratchspaces/d720cf60-89b5-51f5-aff5-213f193123e7/polymake_11299506659249835626_1.10_userdir/wrappers.0/apps/fan/cpperl/wrap-check_fan.cpperl refused:
2024-09-04T09:36:57.9086549Z Timestamp file /home/runner/.julia/scratchspaces/d720cf60-89b5-51f5-aff5-213f193123e7/polymake_11299506659249835626_1.10_userdir/wrappers.0/build/Opt/.apps.built is missing unxpectedly.
This is also during process exit, maybe because multiple processes are trying to modify the same files at the same time (and this is not really a place where we can add a pidlock). All these processes are trying to save the same new auto-generated code that was probably triggered in the original process.
From worker 4: libc++abi: terminating due to uncaught exception of type std::__1::bad_function_call: std::exception
From worker 4:
From worker 4: [68228] signal (6): Abort trap: 6
From worker 4: in expression starting at /Users/aaruni/Desktop/oscar-runners/runner-3/_work/Oscar.jl/Oscar.jl/test/AlgebraicGeometry/Schemes/resolutions.jl:1
From worker 4: __pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
From worker 4: Allocations: 509000079 (Pool: 508678466; Big: 321613); GC: 90
This is in a macos job and I don't really know what exactly triggers this, but this could also be due to code running at exit without some julia exit handlers running before this, maybe related to this: https://github.com/oscar-system/Polymake.jl/blob/master/src/Polymake.jl#L317-L322
Note: All these subprocesses dying during exit should not really cause the tests to fail (only the doctests due to the printing).
Many of the CI jobs also timed out. Did something get a lot slower here, or get stuck somewhere?
I ran the Oscar tests on this branch locally and after a while it also got stuck with many processes (22 processes including the main one), all of them seem to be sleeping / waiting. (The PC has 12 Cores / 24 Threads, AMD Ryzen 9 5900X)
The main process (28895) is deep inside some Singular code, called from minimal_primes:
Test Summary: | Pass Total Time
mpolyquo-localizations.jl | 37 37 2.1s
======================================================================================
Information request received. A stacktrace will print followed by a 1.0 second profile
======================================================================================
cmd: /net/site-local.linux64/julia/julia-1.10.5/bin/julia 28895 running 1 of 1
signal (10): User defined signal 1
__poll at /lib64/libc.so.6 (unknown line)
_Z12slStatusSsiLP6slistsiPi at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z10jjWAIT1ST1P6sleftvS0_ at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z15iiExprArith1TabP6sleftvS0_iPK8sValCmd1iPK13sConvertTypes at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z7yyparsev at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z10iiAllStartP8procinfoPKc13feBufferTypesi at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z8iiPStartP5idrecP6sleftv at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z11iiMake_procP5idrecP11sip_packageP6sleftv at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z6jjPROCP6sleftvS0_S0_ at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_ZL21iiExprArith2TabInternP6sleftvS0_iS0_iPK8sValCmd2iiPK13sConvertTypes.part.85 at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
...
_Z10iiAllStartP8procinfoPKc13feBufferTypesi at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z8iiPStartP5idrecP6sleftv at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z11iiMake_procP5idrecP11sip_packageP6sleftv at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z15ii_CallLibProcMPKcPPvPiP8ip_sringRi at /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so (unknown line)
_Z31call_singular_library_procedureNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEP8ip_sringN5jlcxx8ArrayRefIP11_jl_value_tLi1EEE at /home/datastore/lorenz/software/julia/depot/artifacts/9e729ade26a239c52713155b6c560dd4615b31a5/lib/libsingular_julia.so (unknown line)
_ZNSt17_Function_handlerIFP11_jl_value_tNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEP8ip_sringN5jlcxx8ArrayRefIS1_Li1EEEEPSD_E9_M_invokeERKSt9_Any_dataOS7_OS9_OSC_ at /home/datastore/lorenz/software/julia/depot/artifacts/9e729ade26a239c52713155b6c560dd4615b31a5/lib/libsingular_julia.so (unknown line)
_ZN5jlcxx6detail11CallFunctorIP11_jl_value_tJNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEP8ip_sringNS_8ArrayRefIS3_Li1EEEEE5applyEPKvNS_13WrappedCppPtrESH_P10jl_array_t at /home/datastore/lorenz/software/julia/depot/artifacts/9e729ade26a239c52713155b6c560dd4615b31a5/lib/libsingular_julia.so (unknown line)
call_singular_library_procedure at /home/datastore/lorenz/software/julia/depot/packages/CxxWrap/5IZvn/src/CxxWrap.jl:624
unknown function (ip: 0x148fccf9d9e0)
_jl_invoke at /cache/build/builder-amdci4-4/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-4/julialang/julia-release-1-dot-10/src/gf.c:3077
low_level_caller at /home/datastore/lorenz/software/julia/depot/packages/Singular/A4riV/src/caller.jl:399
assPrimes at /home/datastore/lorenz/software/julia/depot/packages/Singular/A4riV/src/Meta.jl:44
unknown function (ip: 0x148fccf9d2b5)
_jl_invoke at /cache/build/builder-amdci4-4/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-4/julialang/julia-release-1-dot-10/src/gf.c:3077
#minimal_primes#346 at /home/datastore/lorenz/software/julia/Oscar.jl/src/Rings/mpoly-ideals.jl:1072
minimal_primes at /home/datastore/lorenz/software/julia/Oscar.jl/src/Rings/mpoly-ideals.jl:1068 [inlined]
__compute_is_prime__ at /home/datastore/lorenz/software/julia/Oscar.jl/src/Rings/mpoly-ideals.jl:1663
#373 at /home/datastore/lorenz/software/julia/depot/packages/AbstractAlgebra/sxDV6/src/Attributes.jl:357 [inlined]
get! at ./dict.jl:479
get_attribute! at /home/datastore/lorenz/software/julia/depot/packages/AbstractAlgebra/sxDV6/src/Attributes.jl:230 [inlined]
is_prime at /home/datastore/lorenz/software/julia/Oscar.jl/src/Rings/mpoly-ideals.jl:1662
unknown function (ip: 0x148fca651095)
_jl_invoke at /cache/build/builder-amdci4-4/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-4/julialang/julia-release-1-dot-10/src/gf.c:3077
__compute_is_prime__ at /home/datastore/lorenz/software/julia/Oscar.jl/src/Rings/mpolyquo-localizations.jl:2023
#1001 at /home/datastore/lorenz/software/julia/depot/packages/AbstractAlgebra/sxDV6/src/Attributes.jl:357 [inlined]
get! at ./dict.jl:479
get_attribute! at /home/datastore/lorenz/software/julia/depot/packages/AbstractAlgebra/sxDV6/src/Attributes.jl:230 [inlined]
is_prime at /home/datastore/lorenz/software/julia/Oscar.jl/src/Rings/mpolyquo-localizations.jl:2022
unknown function (ip: 0x148fa7ea52f5)
_jl_invoke at /cache/build/builder-amdci4-4/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
One Process (1803) is in a wait call:
0x0000148febf0fc1f in wait4 () from /lib64/libc.so.6
#0 0x0000148febf0fc1f in wait4 () from /lib64/libc.so.6
#1 0x0000148f53324606 in ssiClose(sip_link*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#2 0x0000148f533232e4 in slClose(sip_link*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#3 0x0000148f532c9445 in iiExprArith1Tab(sleftv*, sleftv*, int, sValCmd1 const*, int, sConvertTypes const*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#4 0x0000148f532a0977 in yyparse() () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#5 0x0000148f532e414c in iiAllStart(procinfo*, char const*, feBufferTypes, int) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
...
One Process (1843) is in a futex_wait (called from some openmp exit handlers?):
Using host libthread_db library "/lib64/libthread_db.so.1".
futex_wait (val=120, addr=0x14b2b2044) at /workspace/srcdir/gcc-13.2.0/libgomp/config/linux/x86/futex.h:97
97 /workspace/srcdir/gcc-13.2.0/libgomp/config/linux/x86/futex.h: No such file or directory.
#0 futex_wait (val=120, addr=0x14b2b2044) at /workspace/srcdir/gcc-13.2.0/libgomp/config/linux/x86/futex.h:97
#1 do_wait (val=120, addr=<optimized out>) at /workspace/srcdir/gcc-13.2.0/libgomp/config/linux/wait.h:67
#2 gomp_team_barrier_wait_end (bar=0x14b2b2040, state=120) at /workspace/srcdir/gcc-13.2.0/libgomp/config/linux/bar.c:112
#3 0x0000148feaa8c00e in gomp_team_barrier_wait_final (bar=bar@entry=0x14b2b2040) at /workspace/srcdir/gcc-13.2.0/libgomp/config/linux/bar.c:136
#4 0x0000148feaa8a96d in gomp_team_end () at /workspace/srcdir/gcc-13.2.0/libgomp/team.c:956
#5 0x0000148febe7bbd9 in __run_exit_handlers () from /lib64/libc.so.6
#6 0x0000148febe7bd6a in exit () from /lib64/libc.so.6
#7 0x0000148f53313bc2 in m2_end () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#8 0x0000148f53328e10 in ssiRead1(sip_link*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#9 0x0000148f5332c404 in ssiOpen(sip_link*, short, sleftv*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#10 0x0000148f53323174 in slOpen(sip_link*, short, sleftv*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#11 0x0000148f532c9445 in iiExprArith1Tab(sleftv*, sleftv*, int, sValCmd1 const*, int, sConvertTypes const*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
And finally there are further 19 processes which are waiting in a read like this:
0x0000148fec03a754 in read () from /lib64/libpthread.so.0
#0 0x0000148fec03a754 in read () from /lib64/libpthread.so.0
#1 0x0000148f53ed2f89 in s_getc(s_buff_s*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libpolys.so
#2 0x0000148f53ed31e4 in s_readint(s_buff_s*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libpolys.so
#3 0x0000148f53328dae in ssiRead1(sip_link*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#4 0x0000148f53323421 in slRead(sip_link*, sleftv*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#5 0x0000148f532af29e in jjREAD(sleftv*, sleftv*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
#6 0x0000148f532c9445 in iiExprArith1Tab(sleftv*, sleftv*, int, sValCmd1 const*, int, sConvertTypes const*) () from /home/datastore/lorenz/software/julia/depot/artifacts/178cf93ab258b57deb473bfa44f78c5994f8628c/lib/libSingular.so
...
I can keep those processes alive for a while if there is anything I should check, they don't need much memory and all are idle anyway.
The atexit-related crashes are fixed in @hannes14's most recent update from last week. But there are still errors / crashes, just different ones again.
There are still exit handlers running
│ Can't call method "flags" on an undefined value at /Users/aaruni/Desktop/oscar-runners/runner-2/julia-depot/scratchspaces/d720cf60-89b5-51f5-aff5-213f193123e7/polymake_6360604600323823479_1.10_depstree_v2/share/polymake/perllib/Polymake/Core/BigObject.pm line 1247 during global destruction.
which I think is problematic. But at least I don't see any stuck jobs that ran into a timeout.
Many of the failing jobs don't have any logs for the Run tests step. Some show an exit code of 143 (SIGTERM), which might indicate the workers running out of memory? (due to multiple processes?)
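For reference, 143 follows the usual shell convention of reporting 128 plus the signal number for a process killed by a signal, and SIGTERM is signal 15. The sketch below reproduces that status with a harmless `sleep` child. It says nothing about who sent the signal on the CI runners, which is still a guess at this point.

```shell
#!/bin/sh
# Start a long-running child, terminate it with SIGTERM, and read back its status.
sleep 30 &
child=$!
kill -TERM "$child"

status=0
wait "$child" || status=$?       # wait reports how the child ended

echo "exit status: $status"
echo "signal number: $((status - 128))"
```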
@hannes14 so I guess that ball is in your court again ;-).
Out of curiosity, how does Singular determine how many processes to fork? Does it take the number of cores and/or RAM into account?
Singular uses sysconf(_SC_NPROCESSORS_ONLN) or sysconf(_SC_NPROCESSORS_CONF), respectively; RAM is not considered.
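To see what those two sysconf values are on a given machine or CI runner, they can be queried from the shell with `getconf`. This is a minimal sketch that assumes a `getconf` accepting these variable names (as on glibc Linux and macOS); it only prints the numbers and does not show how Singular turns them into a process count.

```shell
#!/bin/sh
# Processors currently online vs. processors configured on the system.
online=$(getconf _NPROCESSORS_ONLN)
configured=$(getconf _NPROCESSORS_CONF)

echo "online: $online, configured: $configured"
```

Neither value reflects available memory, so a runner with many cores but little RAM would still report the full core count.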
Tests pass now, but test/AlgebraicGeometry/Schemes/Resolution_structure.jl in Julia 1.10 went from ~64 seconds to ~370 seconds