Have negative integration tests
Let's have these tests for the 0.5 release.
- [ ] Permission tests
- [ ] Typos and empty arguments
@chemistry-sourabh @apoorvemohan add more stuff to this list that you think is really important and top priority. @pgrosu will be working on these.
@naved001 can you please elaborate more on the tests? On the testing side, I advise having tests that check that the code raises specific exceptions. For example, ceph and hil raise some exceptions if something goes wrong.
Also test giving wrong inputs, like an image that doesn't exist, a node that doesn't exist, duplicates, etc.
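To make the suggestion concrete, here is a minimal sketch of what such negative tests could look like. Everything in it is a hypothetical stand-in: `FakeBMI`, `provision`, and the exception classes are invented for illustration and are not taken from the BMI, ceph, or hil code. The real tests would call the actual BMI API and check for whatever exceptions it really raises.

```python
# Hypothetical stand-ins: none of these names come from the BMI code base.
class ImageNotFound(Exception):
    """Raised when the requested image does not exist."""


class NodeNotFound(Exception):
    """Raised when the requested node does not exist."""


class DuplicateProvision(Exception):
    """Raised when a node that is already provisioned is provisioned again."""


class FakeBMI:
    """In-memory toy standing in for the real service in this sketch."""

    def __init__(self, images, nodes):
        self.images = set(images)
        self.nodes = set(nodes)
        self.provisioned = {}  # node name -> image name

    def provision(self, node, image):
        # Validate the arguments first, then each lookup, then duplicates.
        if not node or not image:
            raise ValueError("node and image must be non-empty")
        if node not in self.nodes:
            raise NodeNotFound(node)
        if image not in self.images:
            raise ImageNotFound(image)
        if node in self.provisioned:
            raise DuplicateProvision(node)
        self.provisioned[node] = image


def raises(exc_type, func, *args):
    """Return True if func(*args) raises exc_type, False otherwise."""
    try:
        func(*args)
    except exc_type:
        return True
    except Exception:
        return False  # a different exception was raised
    return False  # nothing was raised


def run_negative_tests():
    """Run each bad-input case and report whether the expected error occurred."""
    bmi = FakeBMI(images={"centos7"}, nodes={"node-01"})
    results = {
        "missing_image": raises(ImageNotFound, bmi.provision, "node-01", "no-such-image"),
        "missing_node": raises(NodeNotFound, bmi.provision, "no-such-node", "centos7"),
        "empty_argument": raises(ValueError, bmi.provision, "", "centos7"),
    }
    bmi.provision("node-01", "centos7")  # the first, valid call succeeds
    results["duplicate"] = raises(DuplicateProvision, bmi.provision, "node-01", "centos7")
    return results
```

Each case asserts on a specific exception type rather than on "some error", so a test fails if the code starts raising a different error for the same bad input.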
@chemistry-sourabh These smoke tests have to be done as part of the acceptance testing suite I'm building, so that the release documents can state specifically what we are supporting. We want to test race conditions of 2 or more users trying to provision the same node, or mistyping things. If we don't test a behavior and document the result, we will be expected to support it; we need to rewrite BMI this summer anyway, and we don't want to get swamped with 0.5 PRs/issues.
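As a rough sketch of the same-node race test, assuming a toy in-memory provisioner in place of BMI: `FakeProvisioner`, `NodeBusy`, and `race_for_node` are hypothetical names made up for this example. The toy serializes requests with a lock. That is an assumption of the sketch and says nothing about how BMI behaves today.

```python
import threading


class NodeBusy(Exception):
    """Hypothetical error: the node is already owned by another user."""


class FakeProvisioner:
    """Toy stand-in for BMI; it guards its ownership table with a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.owner = {}  # node name -> user name

    def provision(self, user, node):
        # The check and the update happen under one lock, so they are atomic.
        with self._lock:
            if node in self.owner:
                raise NodeBusy(node)
            self.owner[node] = user


def race_for_node(num_users=2, node="node-01"):
    """Start num_users threads that all try to provision the same node."""
    prov = FakeProvisioner()
    barrier = threading.Barrier(num_users)  # release all workers together
    outcomes = []
    outcomes_lock = threading.Lock()

    def worker(user):
        barrier.wait()
        try:
            prov.provision(user, node)
            result = "ok"
        except NodeBusy:
            result = "busy"
        with outcomes_lock:
            outcomes.append(result)

    threads = [
        threading.Thread(target=worker, args=("user-%d" % i,))
        for i in range(num_users)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(outcomes)
```

The property to check is that exactly one request wins and every other request gets a clean error. Run against a service without such a guard, the same harness would show more than one "ok", which is the behavior the release notes would need to call out.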
@pgrosu
> We want to test race conditions of 2 or more users trying to provision the same node, or mistyping things.
- 0.5-release is not intended to support parallel/concurrent BMI operations.
Could you please post a list of unsupported features? Otherwise we will be testing for them, and we want the users to be told about them in the release documents.
I'm sorry, that sounds a bit vague. We need to bound our support, otherwise we might be expected to support it. In any case, so as not to make a mountain out of a molehill, we will write up tests, and based on those we can draft our release notes specifying the support we will provide for 0.5.
Limitations:
- Any features not exposed by the BMI APIs (existing dev branch) are not supported.
- Concurrent execution of BMI operations is not supported.
- BMI services need to be deployed as the "root" user.
- Only RHEL 7.2 based deployments are supported.
- Default dnsmasq for RHEL 7.2 (version 2.66)
- iPXE from the GitHub master branch (https://github.com/ipxe/ipxe - commit id 356f6c1b64d7a97746d1816cef8ca22bdd8d0b5d)
- Only the TGT iSCSI software is supported (GitHub master branch of tgt - https://github.com/fujita/tgt - commit id 3c8c9e96b82d87a334b1d340fa29218b7b94f26d)
- Ceph client version 0.94.9-9 (and any compatible corresponding ceph backend)
@naved001 - can you add the HIL version number running in Engage1 and NEU here?
How should we address the issue of recovery from crashes, failures and/or network partitions? As an example, if the sqlite database gets corrupted, should we mention that rollbacks to ceph would need to be performed manually by the user? Is there a procedure by which they can identify the snapshots and clones created in their most recent session? Ideally the log can be used to get back to a consistent state.
- The current users will either be BMI team members or close collaborators (using/deploying BMI under the supervision of BMI team members). Any failure recovery is supposed to be done manually with the help of BMI team members.
- Once BMI is redesigned (and is ready for public use), the users will either be using a publicly available BMI service maintained by the BMI/MOC team or a private BMI service that they maintain themselves. So fault tolerance should be handled by the owner of the service.
- From what I understand, as code developers, we should be responsible for scalability, fault tolerance and high availability of BMI core components (Picasso, Einstein and DB) and not the "plugin components" that it will use (data store, network isolator, auth server etc. - provided by the BMI service owner). (Open to comments/suggestions - this is an important topic for discussion. We should start a separate discussion thread for it.)
@apoorvemohan the "root" user is not a limitation anymore.