Link to free text option seems to go to something else

Open maneesha opened this issue 1 year ago • 4 comments

What is the problem?

In the Working with free text section, there is a DOI link to the OCR General Report on the Physiography of Maryland. However, the link actually seems to go to a page with a zip file of 14847 volumes.

Location of problem (optional)

No response

EDITED to fix the link to the lesson episode.

maneesha avatar May 07 '24 14:05 maneesha

I think this is as expected - we have an excerpt of text in our data files that we use in exercise 2, and this is the citation for it:

An example of text captured by an optical character recognition process: General Report on the Physiography of Maryland. A dissertation, etc. (Reprinted from Report of Maryland State Weather Service.) [With maps and illustrations.] 1898 (from https://doi.org/10.21250/db12)

The DOI in the citation takes the user to the full collection where this work was pulled from.

kaitlinnewson avatar May 07 '24 15:05 kaitlinnewson

OK, I see. In that case, maybe we can note that these links are for reference (not for use in the lesson), and explicitly remind users that excerpts for learning purposes are included with the download files for this lesson.

maneesha avatar May 08 '24 10:05 maneesha

Also, since the source file is 44 GB, I think most people would not want to download it. It may be useful and interesting to make the original viewable in some other way, so learners can see the source file that was rendered to plain text using OCR.

maneesha avatar May 08 '24 11:05 maneesha

A couple of thoughts on how we can improve this:

  • Change the text to say "Retrieved from https://doi.org/10.21250/db12"
  • Add "Option 1", "Option 2", "Option 3" to the list, and link to each section so that it's clear that the examples are below and not in the links.

I'll make a PR for this change soon.

kaitlinnewson avatar May 21 '24 19:05 kaitlinnewson