Argus icon indicating copy to clipboard operation
Argus copied to clipboard

Add INCIDENT_RESTART event

Open jladd-geant opened this issue 7 months ago • 8 comments

Scope and purpose

Fixes #1502.

Depends on #1506.

Adds a new INCIDENT_RESTART event to allow source systems to reopen incidents. Described further in linked issue.

This pull request

  • changes the database

Contributor Checklist

Every pull request should have this checklist filled out, no matter how small it is. More information about contributing to Argus can be found in the Development docs.

  • [x] Added a changelog fragment for towncrier

  • [x] Added/amended tests for new/changed code Doesn't break any tests, doesn't introduce any untested lines of code, and is implemented in the same way as the existing REOPEN event. So adding/changing tests didn't seem necessary, happy to do otherwise if needed.

  • [x] Added/changed documentation, including updates to the user manual if feature flow or UI is considerably changed

  • [x] Linted/formatted the code with ruff and djLint, easiest by using pre-commit

  • [x] The first line of the commit message continues the sentence "If applied, this commit will ...", starts with a capital letter, does not end with punctuation and is 50 characters or less long. See our how-to

  • [ ] If this results in changes to the database model: Updated the ER diagram There are no changes to the ER diagram as a result of adding a new event type

jladd-geant avatar Jun 18 '25 15:06 jladd-geant

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 79.46%. Comparing base (ae5eb17) to head (6abdd2a).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1504   +/-   ##
=======================================
  Coverage   79.46%   79.46%           
=======================================
  Files         121      121           
  Lines        5409     5410    +1     
=======================================
+ Hits         4298     4299    +1     
  Misses       1111     1111           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov-commenter avatar Jun 18 '25 15:06 codecov-commenter

Only thing im unsure of is the migration generated seems to be duplicating changes? It seems to be converting the id of some fields to BigAutoField despite that already being in the previous squashed migration? I cant imagine it would cause any issues but not really sure whats going on with that. Let me know if I should make any changes.

jladd-geant avatar Jun 18 '25 15:06 jladd-geant

Code looks good to me :) That migration is a bit strange yeah, maybe @hmpf can shed some light on what's happening there

Can you add one or more tests?

  • RES should reopen a closed incident
  • only source systems can restart incidents
  • Defined behaviour on RES an already open incident?

elfjes avatar Jun 19 '25 05:06 elfjes

Agree about the migrations, the change to BigAutoField should already have happened . On Tuesday I hope to be able to check this properly.

Do python manage.py dbshell against prod and dev database. In dbshell run select * from django_migrations where app = 'argus_incident';.

You should have this line (first column may be different, last column deffo different)

 89 | argus_incident | 0009_use_bigautofield_on_quickly_growing_tables | 2025-05-16 08:23:36.499823+02

If not in prod db I'll see if I can do some experiments (with your help) next week, we might need a release that canonicalizes the squashed migrations (=deletes the old ones, needs manual rewrite of the new migrations). If not in dev: nuke the dev database and migrate to version 1.37.0 and check again before you migrate to 2.0.0.

Also do \d argus_incident_incident and check that the Type of the id field is already bigint:

       Column       |           Type           | Collation | Nullable |                       Default                       
--------------------+--------------------------+-----------+----------+-----------------------------------------------------
 id                 | bigint                   |           | not null | nextval('argus_incident_incident_id_seq'::regclass)

hmpf avatar Jun 19 '25 06:06 hmpf

Can you add one or more tests?

I would also like to see a test for restarting an ended incident and then ending it again.

* Defined behaviour on RES an already open incident?

This should simply post the event without changing anything about the incident itself.

johannaengland avatar Jun 19 '25 08:06 johannaengland

Agree about the migrations, the change to BigAutoField should already have happened . On Tuesday I hope to be able to check this properly.

Do python manage.py dbshell against prod and dev database. In dbshell run select * from django_migrations where app = 'argus_incident';.

You should have this line (first column may be different, last column deffo different)

 89 | argus_incident | 0009_use_bigautofield_on_quickly_growing_tables | 2025-05-16 08:23:36.499823+02

If not in prod db I'll see if I can do some experiments (with your help) next week, we might need a release that canonicalizes the squashed migrations (=deletes the old ones, needs manual rewrite of the new migrations). If not in dev: nuke the dev database and migrate to version 1.37.0 and check again before you migrate to 2.0.0.

Also do \d argus_incident_incident and check that the Type of the id field is already bigint:

       Column       |           Type           | Collation | Nullable |                       Default                       
--------------------+--------------------------+-----------+----------+-----------------------------------------------------
 id                 | bigint                   |           | not null | nextval('argus_incident_incident_id_seq'::regclass)

Ok so have investigated this some more and it actually seems to be a small issue with the squashed migration. Firstly if I just checkout to master with a blank db and make migrations it makes the same changes for the ID's (except what I added of course), i assume there should be no detected changes. What I think has happened is that originally the ID fields were being autocreated (not defined in model) but then explicitly defined as bigints in 10bc007b672c367d4305340127aae8127cbb5488. So in migration 0001_initial it does fields=[("id", models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name="ID")) but after being explicitly defined as bigint the alterfield in 0009 does field=models.BigAutoField(primary_key=True, serialize=False),. But in the squashed migration its treating the id's as autocreated even though they are explicitly defined in the model to be bigints. Im not sure how the squashed migration is made but changing

('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),

to

('id', models.BigAutoField(primary_key=True, serialize=False),

in the squashed migration stops makemigrations from detecting changes. DEFAULT_AUTO_FIELD is just AutoField in base.py so doesnt seem to make sense for a BigAutoField to be autocreated. I suppose this could somehow be an issue on my end or im misunderstanding things haha but this is what I think is going on. I suppose this would need a separate PR to fix the squashed migration?

jladd-geant avatar Jun 20 '25 11:06 jladd-geant

Ok have now

  • Fixed the issues with the code
  • Made a unit test to restart a closed incident then close it again
  • Fixed the migration
  • Small change to docs

Let me know if you want more tests/changes to what I made, the test does have very similar functionality to the test for reopening i'm not sure if that should be combined. fixed, seems ok now

jladd-geant avatar Jun 25 '25 16:06 jladd-geant