p4-fusion
`p4-fusion` 1.13 core dumps with certain configs
p4-fusion 1.13 occasionally core dumps, mostly with SIGSEGV.
The core dumps do not happen on every run, so the script below runs the same command five times; typically several of the five attempts crash:
P4USER=<perforce user>
P4PASSWD=<user ticket>
P4PORT=<perforce host>
DEPOT=<depot name>
export P4USER P4PASSWD P4PORT
for x in 1 2 3 4 5
do
tmpdir=${TMP}/test_p4-fusion-${RANDOM}
rm -rf "${tmpdir}"
mkdir -p "${tmpdir}"
echo "attempt ${x}"
p4-fusion-binary \
--path ${DEPOT}... \
--client "" \
--user ${P4USER} \
--src "${tmpdir}/.git" \
--networkThreads 16 \
--printBatch 10 \
--port ${P4PORT} \
--lookAhead 16 \
--maxChanges 32 \
--retries 100 \
--refresh 100000 \
--includeBinaries false \
--fsyncEnable true \
--noColor true 2>&1 | { grep "core dumped" || true; }
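# Capture p4-fusion's own exit status (bash). Without `set -o pipefail`, plain $? is the status of the grep group, which is always 0.
rc=${PIPESTATUS[0]}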
echo "return code: ${rc}"
rm -rf "${tmpdir}"
done
Note that <perforce user> must be replaced with a valid user for the Perforce instance, <user ticket> with the correct ticket for that user, <perforce host> with the actual Perforce host and port number, and <depot name> with the depot name or path, including the leading // and the trailing /.
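Because the script appends `...` directly to `DEPOT`, a value without the leading // or the trailing / produces a malformed `--path`. A minimal sketch of a sanity check that could run before the loop; the function name `check_depot` is hypothetical and not part of p4-fusion:

```shell
# check_depot is a hypothetical helper, not part of p4-fusion.
# It succeeds only if the value starts with // and ends with /,
# so that "${DEPOT}..." expands to a path such as //depot/... .
check_depot() {
    case "$1" in
        //*/)
            return 0
            ;;
        *)
            echo "DEPOT must start with // and end with /: '$1'" >&2
            return 1
            ;;
    esac
}
```

It could be called as `check_depot "${DEPOT}" || exit 1` right after the variables are set.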
Test plan
Connect to gitserver via a terminal and run the above command. If using p4-fusion 1.13, it will fail some or all of the time with core dumps.
Example output:
attempt 1
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted
return code: 134
attempt 2
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
return code: 139
attempt 3
return code: 0
attempt 4
return code: 0
attempt 5
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
return code: 139
If using p4-fusion 1.12, every attempt reports return code: 0, like this:
attempt 1
return code: 0
attempt 2
return code: 0
attempt 3
return code: 0
attempt 4
return code: 0
attempt 5
return code: 0
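The non-zero return codes in the 1.13 output follow the usual shell convention of 128 plus the signal number: 134 is signal 6 (SIGABRT) and 139 is signal 11 (SIGSEGV). A minimal sketch for decoding them when collecting results from many runs; `describe_rc` is a hypothetical helper name, and the signal names come from the shell's own `kill -l`:

```shell
# describe_rc is a hypothetical helper, not part of p4-fusion.
# It turns a shell exit status into a short human-readable description.
describe_rc() {
    local rc="$1"
    if [ "${rc}" -eq 0 ]; then
        echo "success"
    elif [ "${rc}" -gt 128 ]; then
        # Statuses above 128 conventionally mean "killed by signal rc - 128".
        echo "signal $((rc - 128)) ($(kill -l "$((rc - 128))"))"
    else
        echo "exit code ${rc}"
    fi
}
```

For example, `describe_rc "${rc}"` could replace the plain `echo "return code: ${rc}"` in the loop above.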
Not that it helps here, but changing lookAhead to 1 seems to stop the core dumps, at the cost of cloning speed.
> changing lookAhead to 1 seems to stop the core dumps
That the crashes stop with a lookAhead of only 1 suggests a problem with memory accesses when work is scheduled and memory is shared across multiple tasks running at the same time. We might have to look deeper with a stack trace, or attach a debugger, to see exactly what is failing here.
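One way to get such a stack trace, sketched under several assumptions: gdb is installed, the binary runs natively rather than under qemu emulation as in the report above, and it was ideally built with debug symbols. The wrapper name `backtrace_on_crash` is hypothetical:

```shell
# backtrace_on_crash is a hypothetical wrapper, not part of p4-fusion.
# It runs the given command under gdb in batch mode; if the program
# stops on a signal, gdb prints a backtrace for every thread.
backtrace_on_crash() {
    gdb --batch \
        -ex "run" \
        -ex "thread apply all bt" \
        --args "$@"
}
```

It would be called with the same command line as in the reproduction script, for example `backtrace_on_crash p4-fusion-binary --path "${DEPOT}..." --lookAhead 16` followed by the remaining flags.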
Once I find time for this issue, I would start by comparing execution across threads with lookAhead set to 1 versus 2, and look for any anomalies. (Note to self)
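As a first step toward that comparison, the crash rate at each lookAhead value can be measured before looking at individual threads. A rough sketch, reusing the flags from the reproduction script; `count_crashes` and `clone_once` are hypothetical helper names, and `p4-fusion-binary`, `DEPOT`, `P4USER` and `P4PORT` are assumed to be set up as in that script:

```shell
# Both functions are hypothetical helpers, not part of p4-fusion.

# count_crashes RUNS CMD [ARGS...]
# Runs CMD RUNS times and prints how many runs were killed by a signal
# (exit status above 128, e.g. 134 for SIGABRT or 139 for SIGSEGV).
count_crashes() {
    local runs="$1" crashes=0 i=0 rc=0
    shift
    while [ "${i}" -lt "${runs}" ]; do
        "$@" >/dev/null 2>&1 && rc=0 || rc=$?
        if [ "${rc}" -gt 128 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "${crashes}"
}

# clone_once LOOKAHEAD
# One fresh clone into a throwaway directory, using the flags from the
# reproduction script; only --lookAhead varies.
clone_once() {
    local tmpdir rc=0
    tmpdir=$(mktemp -d) || return 1
    p4-fusion-binary \
        --path "${DEPOT}..." \
        --client "" \
        --user "${P4USER}" \
        --src "${tmpdir}/.git" \
        --networkThreads 16 \
        --printBatch 10 \
        --port "${P4PORT}" \
        --lookAhead "$1" \
        --maxChanges 32 \
        --retries 100 \
        --refresh 100000 \
        --includeBinaries false \
        --fsyncEnable true \
        --noColor true && rc=0 || rc=$?
    rm -rf "${tmpdir}"
    return "${rc}"
}

# Example (needs a reachable Perforce server, so it is left commented out):
#   for la in 1 2; do
#       echo "lookAhead=${la}: $(count_crashes 5 clone_once "${la}") of 5 runs crashed"
#   done
```

Counting only statuses above 128 separates signal deaths from ordinary failures such as a bad ticket, which exit with a small non-zero code.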
I can corroborate this. Using the following config I get a crash with lookAhead=2000 and no crash with lookAhead=1. The crash does NOT appear to affect the repo creation. If I run it again with the same arguments, it picks up where it left off and builds the git repo correctly.
~/p4-fusion-1.13/build/p4-fusion/p4-fusion \
--path //ACU_Common-Components/... \
--user xxxxxxxxxxxx \
--port xxxxxxxxxxxxxxxxxxx \
--client xxxxxxxxxxxxx \
--lookAhead 2000 \
--includeBinaries true \
--src ~/git/proplib_w_branches/.git \
--branch Propagator_Library_mainline:main \
--branch Propagator_Library_Release_1.0:release_1.0 \
--branch Propagator_Library_Release_1.1:release_1.1 \
--branch Propagator_Library_Release_1.2:release_1.2 \
--branch Propagator_Library_Release_1.3:release_1.3 \
--branch Propagator_Library_Release_1.4:release_1.4 \
--branch Propagator_Library_Release_1.5:release_1.5 \
--branch Propagator_Library_Release_1.5.1:release_1.5.1 \
--branch Propagator_Library_Release_1.6:release_1.6 \
--branch Propagator_Library_Release_1.7:release_1.7 \
--branch Propagator_Library_Release_1.8:release_1.8 \
--branch Propagator_Library_Release_1.9:release_1.9 \
--branch Propagator_Library_Release_1.10:release_1.10 \
--branch Propagator_Library_Release_2.0:release_2.0 \
--branch Propagator_Library_Release_2.1:release_2.1 \
--branch Propagator_Library_Release_2.2:release_2.2 \
--branch Propagator_Library_Release_2.3:release_2.3
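Since a rerun with the same arguments picks up where the crashed run left off, one possible stopgap is to rerun the command until it exits cleanly. A minimal sketch, assuming that resume behavior holds for your repository as well; `retry_until_success` is a hypothetical helper, and it works around the crash rather than fixing it:

```shell
# retry_until_success is a hypothetical helper, not part of p4-fusion.
# Usage: retry_until_success MAX CMD [ARGS...]
# Reruns CMD with identical arguments until it exits with status 0,
# giving up after MAX attempts and returning the last exit status.
retry_until_success() {
    local max="$1" attempt=1 rc=1
    shift
    while [ "${attempt}" -le "${max}" ]; do
        "$@" && rc=0 || rc=$?
        if [ "${rc}" -eq 0 ]; then
            echo "succeeded on attempt ${attempt}" >&2
            return 0
        fi
        echo "attempt ${attempt} exited with status ${rc}, rerunning" >&2
        attempt=$((attempt + 1))
    done
    return "${rc}"
}
```

It would be invoked as `retry_until_success 10 ~/p4-fusion-1.13/build/p4-fusion/p4-fusion` followed by the same arguments as above.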