Improve admin::render_readmes

Open nipunn1313 opened this issue 4 years ago • 3 comments

From @Turbo87 in discord:

I would do it in roughly these steps:

  1. Add a vcs_info JSONB column to the versions table and populate it with the file content from new publishes
  2. Create a script or temporary crates-admin command to backfill the vcs_info and readme columns on versions that don't have them yet
  3. Adjust render-readmes to use the readme column instead of downloading the crate file
  4. Adjust render-readmes to use the vcs_info column to render relative paths correctly for crates in subdirectories
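Step 4 could be sketched roughly as follows, assuming the vcs_info column stores the content of the crate's `.cargo_vcs_info.json` file and that this content includes a `path_in_vcs` field naming the crate's subdirectory inside the repository. The helper name `resolve_readme_link` is hypothetical and not part of the crates.io codebase. It only handles relative links, so absolute URLs would need to be filtered out before calling it.

```rust
/// Hypothetical helper: join the crate's subdirectory inside its repository
/// (the assumed `path_in_vcs` value) with a relative link found in the README,
/// and normalize `.` and `..` segments. The result is a repository-relative path.
fn resolve_readme_link(path_in_vcs: &str, link: &str) -> String {
    let mut segments: Vec<&str> = Vec::new();
    for part in path_in_vcs.split('/').chain(link.split('/')) {
        match part {
            // Skip empty segments (e.g. from a trailing slash) and `.`.
            "" | "." => {}
            // `..` walks up one directory; at the repository root it is a no-op.
            ".." => {
                segments.pop();
            }
            other => segments.push(other),
        }
    }
    segments.join("/")
}
```

For a crate living in `crates/foo`, a README link to `docs/img.png` would resolve to `crates/foo/docs/img.png`, and a link to `../../LICENSE` would resolve to `LICENSE` at the repository root. The renderer would then append that path to the repository base URL as it does today.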

In this strategy:

Ideally admin::render_readmes does not download the crate at all.

Requires a one-off script to backfill columns to the versions and crates table from the downloaded crate.

If we want to support rerunning admin::render_readmes on older versions of crates, we would need to move the readme column to the versions table. It is currently in the crates table, where only the latest version's readme is persisted: it is overwritten every time a new version is uploaded, and it is never used.
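A minimal sketch of how the one-off backfill might pick its work, assuming both readme and vcs_info end up as nullable columns on the versions table. `VersionRow` and `versions_to_backfill` are hypothetical names used for illustration, not existing crates.io types, and the real script would read these rows from the database rather than from a slice.

```rust
/// Hypothetical, simplified view of a row in the `versions` table after the
/// proposed columns have been added. `None` means the column is NULL.
#[derive(Debug, Clone, PartialEq)]
struct VersionRow {
    id: i32,
    readme: Option<String>,
    /// Raw JSON text, assumed to be the content of `.cargo_vcs_info.json`.
    vcs_info: Option<String>,
}

/// A version needs backfilling if either column is still empty.
fn needs_backfill(version: &VersionRow) -> bool {
    version.readme.is_none() || version.vcs_info.is_none()
}

/// Collect the ids of all versions whose crate file would have to be
/// downloaded once to populate the missing columns.
fn versions_to_backfill(rows: &[VersionRow]) -> Vec<i32> {
    rows.iter()
        .filter(|version| needs_backfill(version))
        .map(|version| version.id)
        .collect()
}
```

Some crate files contain no README or no `.cargo_vcs_info.json` at all, so a real backfill would probably need a way to tell "not yet processed" apart from "nothing to store". This sketch does not address that.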

Alternate strategy

We opted against this strategy in the discussion so far, but I'm putting it here for completeness in case we reconsider it. It would enable things like #3971 and #3972.

Don't use the readme column of crates (I've just confirmed that it's completely unused currently in the codebase, afaict).

Download old crates from crates.io to recalculate/find the readme again.

nipunn1313 avatar Oct 28 '21 01:10 nipunn1313

> I've just confirmed that it's completely unused currently in the codebase

It might be unused in the Rust portion of the codebase, but I think the search functionality uses the column inside of a SQL query. 😉

Turbo87 avatar Oct 28 '21 08:10 Turbo87

Aha. Makes sense! Search functionality would make fantastic use of that column.

Moving readme column to the versions table would certainly help us regenerate old readmes. Would it be beneficial for the user to be able to search through past versions of readmes as well?

nipunn1313 avatar Oct 28 '21 16:10 nipunn1313

oh... to be honest, I wasn't aware the readme column was only on the crates table 🙈

that certainly complicates things. and I guess it explains why the render-readmes command does not use the readme column... 😅

> Moving readme column to the versions table would certainly help us regenerate old readmes

Yeah, adding a readme column to the versions table might be a good idea. I'm not sure how much bigger the database would grow if we did this though 🤔

> Would it be beneficial for the user to be able to search through past versions of readmes as well?

It's probably sufficient to only be able to search in the README of the most recent release.

Turbo87 avatar Oct 28 '21 16:10 Turbo87