PCG
PCG copied to clipboard
Integrate C DO 3D code from POY code base
- [x] Double check for segfaults
- [x] valgrind
- [x] Export an alias to the existing FFI binding
- [ ] Test FFI
- [x] Decide how/if this code will be used in one of our traversals
- [ ] Perform black box testing against POY 4/5
- [ ] Document
- [ ] Push memoized TCM into 3DO code.
- [ ] Switch 3DO from
intthat accepts up to 64 character states to unlimited by counting leading zeroes on input bit vectors. - [ ] Add traversal code for 3DO
- [x] Pseudocode
- [ ] Actual code
- [x] What to do with gap character
- [ ] Decide where to disambiguate. If on C side use builtin:
__builtin_clz, which only works on unsigned - [x] Send
characters_tas input as well as output toalgn_3d_powell - [x] Get actual sequences back from 3DO alignment
- [x] Update Python test harness
- [ ] Make sure my casting, from above, doesn't cause any problems.
- [ ] Consider moving all types to common library, as they're used all over.
- [x] Gapped and ungapped outputs are too short. Why?
- [x] Put asserts after all allocations.
Smoke test 3D DO using QuickCheck. If it doesn't SEGFAULT, we are good to go for now.
These are the steps I took to move code forward from Powell's original code here. This list also exists in ffi/external_direct_optimization/Changelog.md
#Changelog
Note: In all following, changes were made in both ukkCommon.c and ukkCommon_for_python.c, the latter of which is needed to do debugging using Python scripts. ukk_common.h and ukk.checkp.c are shared across both.
17/10/25
- Cloned
- Removed several unneeded compilation units.
- Removed
inlinedirective - Used modern
limitslibrary vs.values - Edited
makefile- Added new make command:
test_powell - Removed extra cruft
- Changed warnings
- Updated to c11
- Added new make command:
- Got it to compile
- Tested with several dynamic characters that had failed my code. It ran.
17/10/27
- More code cleanup for legibility
- Added Python wrapper code for testing
- Originally,
Astr,Bstr&Cstrwere globals, but I was unable to accept input strings due to allocation problems. So it forced me to makeAstr,Bstr&Cstrlocal variables, with allocation happening inpowell_3D_align. I put the three input characters into a struct, which I then passed around. - Removed original limit on input charactersโ lengths.
- Originally,
17/10/27โ31
- Valgrinded C code. Fixed all segfaults and bad memory reads. They were cause by my not sending the "global" struct through to
traceback. Not sure why the compiler wasnโt bitching about that, but it wasnโt. - Fixed segfaults when using Python ffi.
step()exists in bothGLIBCand Powell's code. The FFI needsGLIBCto work. - Removed
#FIXED_NUM_PLANES, which was always set to same value.
17/11/1
- Started to move to interface.
- Deleted
dynamic_character_ttype from 3DO code, as it's no longer needed.
17/11/3
- Updated all 3DO types in
test_interfaceto usecharinstead ofdynamic_char. - Changed name of
ukk.checkp.ctoukkCheckPoint.c. - All code compiles including
test_interface.
17/11/6
- Removed
MAX_STR, but had to add it back in, as two array allocations (cost_vectorandstate_vector) rely on it, as well as the globalMAX_COST. -
test_interfacekept failing, either segfaulting or assertion errors. Step I took:- Built a separate file,
test_c_only.cto runukkCommonwithout the C interface. - Realized the problem had to do with
align_io, which haselem_t, whereas Powellโs code haschar. - Okay, so I can't convert from
inttochar. That conversion in C only allows for ASCII characters, so 33โ126, inclusive. Can I just convert toint? - Gap character was hard coded as
'-'. That's now been changed to(1 << alphabetSize) - 1. - Added
gap_char_gglobal.
- Built a separate file,
17/11/7โ8
- Finished converting char strings to int arrays.
17/11/9
- First value in any alphabet is \(1 = 2^{0}\), i.e. first bit set.
- Set gap char, which is still global, to
1 << alphabetsize - 1. - To get lengths of output characters need to return their values, which means passing a pointer to a data structure. Moving outputs to
characters_t*since it has all the fields I need. I could usealignIO_t, but this gives me the option to move to a smaller type, likeunsigned shortlater, to save space. At the same time, it would be cheaper computationally to usealignIO_t. That should be an easy change in future. - Changed input type of
algn_get_cost_medians_3dto acceptcharacters_t*.
17/11/13
- Pushed through type change on input and output arrays from
inttoelem_t(unsigned int). This was to make translation fromalign_iotocharacters_teasier. - In conjunction with that, was forced to use
memcpyrather than just changing pointers on that translation. - Cast several
void *values inukkCommon.htochar *to get rid of annoying warnings.- In
allocFinal:flag,topandblock. - In
getPtrinreturnstatement. - In
getPtr:memset
- In
- Renamed
Astr, etc.,Aidx, etc. andAlen, etc. toseq1,lenSeq1andidxSeq1, respectively. - Added new fn to reverse
elem_tarrays. - Cleared warnings in both
gccandclang
17/11/14
- Removed dead code
- Updated
algn_3d_powellto acceptcharacters_tas input - Finally resolved annoying valgrind errors caused by bad indices in prints in
ukkCheckPoint.c->traceBack(). - Started getting actual sequences back from Powell. Actually, they were already coming back, but the lengths were set to 0, so they weren't getting copied correctly, and neither were gapped or ungapped getting computed.
17/11/16โ17
- Issue with short gapped and ungapped chars was because of copying to/from align_io. 2D necessitates opening gap removal. 3D doesn't.