BAPCtools
BAPCtools copied to clipboard
rethink `.gitignore`
Current situation:
Setting gitignore_generated: true
in generators.yaml
writes data/.gitignore
when generating, which contains all generated cases, i.e.:
.gitignore
secret/1-min.*
secret/2-max.*
bad/empty.*
This works nicely, and allows having sample files directly in data/sample/*
.
New situation:
With the new generator spec, having 'inline' source files directly in data/
is discouraged. It would much simpler and much more consistent to simply ignore all data
directories in their entirety, and generate warnings for any files in data/
that are not generated.
A) When to ignore files?
- Always
- Only when
gitignore_generated: true
is set
My preference: A1, always, since that's what we always use in practice anyway, and then we can drop the gitignore_generated
flag and simplify code since there are fewer different usages.
B) Some approaches for ignoring files:
- Let user put
problems/*/data/
in toplevel.gitignore
manually. - Generate
problem/.gitignore
containingdata/
- Generate
problem/data/.gitignore
containing*
; using the file we currently also use.
My preference: B3, to have automation, but not mess around with a possible problem/.gitignore
file that could already exist. We can claim ownership of problem/data/.gitignore
without issues IMO.
C) Some approaches for warning about non-generated files:
- Always
- When
gitignore_generate: true
is set - When
problem/data/.gitignore
contains*
, or some specific BAPCtools-generated string/comment.
My preference: C1, again to simplify the implementation and possible use cases.
B1. (I have no strong opinions. I’m a decoupler and put version control issues into a different conceptual box from “what a problem generation should worry about”. BAPCtools explicitly has the opposite mindset, it wants to do everything, from version control to slack integration to solution presentation. My opinion here shouldn’t count for anything.)
A1. and remove the flag B3. we want to claim ownership of the whole data dir anyway? C1. maybe even always "delete" them (move them outside of data)
(although not completely independent I think A, Band C each need their own answer ;)
My preferences are the same as Ragnar and mzuenni: A1, B3, C1 :slightly_smiling_face:
EDIT: but perhaps it is indeed convenient to not ignore the samples (see below) :thinking:
One things that we may want to decide on separately is whether to gitignore samples. It's kinda convenient to have them committed when building PDFs, so they don't have to be re-generated each time bt pdf
is run somewhere. See also #309
Ok so my thinking changed a bit here. I now prefer B1, that users put */data/secret
(and data/invalid_inputs
) in a contest-level .gitignore
. That is more explicit than automatically generating data/.gitignore
files that ignore themselves and hence never show up in git status
in the first place and are 'completely hidden' to the user.
Note that I explicitly think that samples should not be ignored. Changes to them are important to be reviewed, and sample/1.ans
files are regularly modified to e.g. have a sensible number of digits for floating point output.
This makes A void:
- we can drop the
gititnore_generated: true
fromgenerators.yaml
- we can drop the code that generates these
.gitignore
files :) - the user explicitly acknowledges that
data/secret
only exists locally, making it clear that nothing there is permanent.
I would still say C1 with this:
- we reserve the right to trash or completely delete anything in
data/
at any point. (Note: all ofdata/
, not justdata/secret
. You havegit
to revert important samples.). - Then we could also make
generate
do-f
'by default', instead of the current situation where I constantly dogenerate -f
optionally followed bygenerate -f --clean
.
So basically this moves us into the direction where we completely ignore what is in data/
before calling bt generate
and effectively remove everything and regenerate it to be up-to-date with whatever generators.yaml
dictates (just possibly implemented more efficiently, only updating things that actually need changing and such.)
My current view is:
- When we generate stuff we always want to gitignore those file i.e. A1. and remove the flag
- B2. But if
data/*
or!data/secret/*
is not in there we add it (you can comment if if you don't want to ignore it) - C1. we want to own that directory. Always warn and remove (move to /generators/removed/) files that are not generated
I've set up the repository for the Freshmen Programming Contests with:
-
A1 (i.e. I removed the
generate_gitignore
flag, which defaults toFalse
anyway) -
B: I've added
data/invalid_*
anddata/secret
to.gitignore
manually now (B1), but it would be my preference to automate adding these rules to some.gitignore
, either:- during
bt new_contest
(see https://github.com/RagnarGrootKoerkamp/BAPCtools/blob/master/skel/contest/.gitignore) (basically B1, but automated); or - by generating a
.gitignore
file indata/
which excludes everything except the samples (see https://github.com/RagnarGrootKoerkamp/BAPCtools/issues/309#issuecomment-1787268603) (B3).
I disagree with B2 (adding a
.gitignore
to every problem), because typically, you'd like an entire contest to use the same configuration, right? - during
Regarding C (warning about non-generated files), I agree with you guys that data/
should be completely owned by BAPCtools. However, I do not like the idea of moving removed files to /generators/removed/
, if that means that old/renumbered test cases will end up in there every time you add new test cases halfway in generators.yaml
...
regarding B: I think it has to be a problem specific setting since i dont want that behaviour if i dont use generators?
Interesting... I do think that a problem should either consistently use generators (giving BAPCtools full control over the data/
folder) or consistently not use generators (so BAPCtools should not touch anything in the data/
folder, and warn when .ans
files do not contain a correct answer, but not overwrite them). Taking that into account, we could let BAPCtools create data/.gitignore
if and only if the folder generators/
exists?
Specifically, this could be useful when running BAPCtools on problems that have not been created using BAPCtools, and therefore do not have a generators/
folder, or when e.g. running BAPCtools on Kattis-exported problem packages. For creating new problems, I cannot think of a scenario where a user of BAPCtools would not make use of generators, but please share if you think such a scenario does exist :slightly_smiling_face:
yeno, we should only do something with a .gitignore
if the generators.yaml
exists and it should therefore be on problem level, but i would prefer to do this is in the more transparent way and write it in the .gitignore
of the problem directory.
But if we want this to be transparent, I'd rather set ignore-rules at the level of a contest, because checking in .gitignore
files for all problems with (roughly) the same ignore-rules feels redundant to me :stuck_out_tongue: And even though this means we don't automate setting ignore-rules on the problem level based on the existence of generators.yaml
, we could still use the existence of this file to decide whether BAPCtools is allowed to overwrite anything in data/
.
IMO there are only two options since contest level should not handle this problem specific setting:
- write a
.gitignore
in/data
which ignores itself - add something to the problem
.gitignore
And i simply prefer 2. since self ignoring .gitignore files are weird but i can live with that ^^
Fixed by #348.
choices
- A1: we always ignore files
- B2: we add to
problem/.gitignore
- C: Instead of warning about unknown files, we move anything old/unknown to
/tmp
andlog
this see here.