reafactor: simplify XModule serialization/deserialization layer [BD-13]
Description
Most of the methods in XmlMixin act as wrappers for the official API for serialization and deserialization (parse_xml() and add_xml_to_node()). XmlParserMixin contains the code which does the actual serialization and deserialization.
This simplifies the serialization layer by:
- Replacing remaining old
from_xmldeserialization methods withparse_xml, which is used by XBlocks. - Removing the
XmlMixinclass and renamingXmlParserMixintoXmlMixin. This way, the code handling the serialization and deserialization is retained. - Removing the
XmlDescriptorclass, which was used only in tests. - Removing
slugfromurl_namelookup, as we had already removed its translation in https://github.com/openedx/edx-platform/pull/29927.
Sandbox
LMS: https://pr30072.sandbox.opencraft.hosting/ Studio: https://studio.pr30072.sandbox.opencraft.hosting/
The sandbox contains one manual change because the Error XBlock is not loaded by default when importing a course. This line is replaced with:
--- self.load_error_modules = load_error_modules
+++ self.load_error_modules = True
Testing instructions
- Log in as a
staffuser. - Open any sample course or create a new course. Go to Tools > Import and upload file good.tar.gz
- Once the file is imported, verify that all XBlocks were loaded correctly
- Go to Tools > Export and download the file.
- Verify that the export's structure is correct.
- Go to Tools > Import again, and upload file bad.tar.gz.
- In this file, the CAPA module is malformed. Verify that the Error block is rendered in place of the CAPA Multiple choice block.
Other information
XmlParserMixin (renamed to XmlMixin now) and some XBlock classes will continue to retain their custom logic. This logic cannot be removed without breaking backward compatibility. Since this logic does not impact the complexity of the system there are no negatives to retaining it.
Private-ref: BB-5573
Thanks for the pull request, @Agrendalath!
When this pull request is ready, tag your edX technical lead.
@Agrendalath Just checking what the plan here is. It has been a while.
Author will pick this up soon.
@ormsbee, once we wrap up other BD-13 PRs, would you like to review this one too?
@Agrendalath: Yes, I'd like to review this. Just a high level check first: it's intended to keep exactly the same XML serialization that exists today, right (with the exception of the slug stuff which was removed in that other PR)?
@ormsbee, that's correct - it should not change any existing serialization/deserialization behaviors (with the exception you've mentioned).
FYI @pdpinch and @Colin-Fredericks: This is a refactoring of how our oldest XBlocks (the XModules implemented in edx-platform) read and write OLX. There is no expected change to output based on tests currently in edx-platform.
@Agrendalath: If @pdpinch or @Colin-Fredericks have more advanced validation tests they would like to run, is it possible for them to do so in the sandbox provided with this PR? What are the credentials for that?
Thank you.
@ormsbee,
@Agrendalath: If @pdpinch or @Colin-Fredericks have more advanced validation tests they would like to run, is it possible for them to do so in the sandbox provided with this PR? What are the credentials for that?
Of course - the credentials are [email protected]:edx.
Sorry I missed this. I'll try to take a look next week.
I confirmed that I can log into the sandbox. I guess what I want to test is importing some courses with non-standard xBlocks and see how they are modifed in export.
Is there any way to test the import management command? I had thought that followed the same code path as a studio import, but we've found some different behaviors in the nutmeg release.
Thanks, @pdpinch.
Is there any way to test the import management command? I had thought that followed the same code path as a studio import, but we've found some different behaviors in the nutmeg release.
I can SSH to the sandbox and import them if it works for you. If not, I can look into giving you direct access to this instance.
Sorry for dragging this out so long, but I finally got some time to test today and it seems like the sandbox isn't working. I've tried importing two courses, and now I'm trying to export one of the existing ones from studio. All of them are just spinning.
export: stuck on "Exporting: Creating the export data files" import: stuck on "Unpacking: Expanding and preparing folder/file structure"
To nudge things along a little further, this is the course I'd like to test importing. It contains a custom xBlock that isn't installed on the sandbox. I was hoping to see how that was handled during export. If this isn't going to be a fruitful test, let me know.
@pdpinch, I restarted the workers and the imports are passing now (including the ones that were stuck). These sandbox instances are not always perfectly stable.
The test looks fine to me. However, please note that this sandbox has one customization that's not included in this PR - it loads the ErrorBlock when there are errors while importing an XBlock. I can revert it if you'd like to see the standard behavior.
@pdpinch are you able to complete your testing of this now?
Argh, sorry I missed your comment back in October!
My imported course looks good, but I can't export the course from the sandbox again.
Reading through Dave's comments more carefully, it seems like I shouldn't expect any different behavior from the export, so I'm not sure it's worth testing. But if you can get the sandbox working again, I'll check (and I promise to look out for notifications)
@pdpinch, I rebased the PR and re-created the sandbox instance.
Could you please import course-v1:MITx+Test101+3T2022 again before trying to export it? We have recently observed the corrupted GridFS in one of our MongoDB instances, and this course seems to be affected.
Please re-import other existing courses if you notice the Extra chunk found: expected 5 chunks but found chunk with n=5 error. It is not related to this PR.
i was able to import my two test courses without error now. They both include unsupported xBlocks which, as expected, are just dropped on import.
I then tested exporting the test courses, which also worked without error. The exported OLX was missing the unrecognized xBlocks.
So I think it's working as expected?
@pdpinch, correct; this is expected. It is the same behavior as in the current master version of edx-platform.
@ormsbee, is there anything else we should do before scheduling the merge?
@Agrendalath: Nothing else that I can think of. @jristau1984: Please note that while this change has been tested as best as it's practical to do, the support team should immediately flag any new import/export related bugs here.
@Agrendalath By any chance, did you investigate if this change would allow us to get rid of parse_xml_new_runtime and just have one parse_xml method for blocks like Video and HTML? Currently those blocks each have two different parse methods.
See https://github.com/openedx/edx-platform/blob/2df1b60469b6325f43f187fa4f939f4148b37f79/openedx/core/djangoapps/xblock/runtime/blockstore_runtime.py#L66-L77
https://github.com/openedx/edx-platform/blob/d9f54b9f460bed0001e999dac8884d3b8e8a949f/xmodule/html_module.py#L301-L313
https://github.com/openedx/edx-platform/blob/fd2e95f53136d2537657b4b991ee7eab4d0196ea/xmodule/video_module/video_module.py#L623-L638
etc.
@bradenmacdonald, thanks for bringing this up. I didn't check this, but verifying it is a good idea, given that we got rid of from_xml. I created a follow-up ticket to investigate this (internal link: BB-6956). As you have the most context, I tentatively added you as the second reviewer.
@Agrendalath 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.
EdX Release Notice: This PR has been deployed to the staging environment in preparation for a release to production.
EdX Release Notice: This PR has been deployed to the production environment.
@bradenmacdonald, I looked into this a bit, but while this change simplifies the parsing, we still need to maintain some legacy code (as we noted in https://github.com/openedx/edx-platform/pull/31108).
We could add something like new_runtime=False to the parse_xml method in edx-platform to unify parse_xml and parse_xml_new_runtime and determine if we should fall back to the code from the XmlSerializationMixin class.
I added a quick PoC of this approach here. It doesn't simplify the existing codebase too much, though, so we might want to leave it as-is. Let me know what you think.
@Agrendalath Thanks for looking into it! Yeah, it seems like not a big benefit after all, so I don't have a strong opinion on which way to go. If you want to merge your PoC feel free, or I'm totally fine to just leave it.
@bradenmacdonald, I believe that parse_xml_new_runtime shows a much clearer distinction about the purpose of the block-specific parsing methods, so let's leave it as-is.