Create Guides Markdown Converter (Google Apps Script)

Open abenipa3 opened this issue 2 years ago • 14 comments

Overview

As developers, we want a Google Apps Script to convert the current guides in Google Docs to markdown files, so that the markdown files can be used to display the guides on our website.

List of Features Required for the MD Converter

The following are features of what we need the MD Converter to do:

  • Finds the file (guide) within the folder or drive.
    • File can be found by reference or name if it exists.
  • Reads the contents of the file, including:
    • Headings
    • Bulleted lists
    • Numbered lists
    • Multileveled lists
    • Indentations and spaces
    • Font styles (bold, italics, underline)
    • URLs
    • Images
    • Text blocks, or areas of code nested inside of the document.
  • Formats the contents of the file accordingly.
  • Returns image paths where images were specifically inserted in the document.
  • Creates a folder as an output with the following contents:
    • File as a Markdown, generated by name. (Example: “how-to-set-reminders-in-slack.md”)
    • A folder to save images that were inserted within the Google Doc.
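
As a rough illustration of the "generated by name" feature, the output file name could be derived from the guide title like this. The function name toMarkdownFileName is hypothetical and the slug rules (lowercase, hyphens for anything non-alphanumeric) are an assumption based on the example file name above, not something the existing script is known to do.

```javascript
// Hypothetical helper: turn a guide title into a Markdown file name.
// Assumed slug rules: lowercase, runs of non-alphanumerics become one hyphen.
function toMarkdownFileName(title) {
  const slug = title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse spaces and punctuation into a hyphen
    .replace(/^-+|-+$/g, "");    // drop leading and trailing hyphens
  return slug + ".md";
}
```

With these rules, "How to Set Reminders in Slack" maps to the example name given in the feature list.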

Action Item

Continue building Hack for LA's customizable Markdown Converter using Apps Script, including features that would make the conversion from Docs to MD feasible.

  • [x] Research and identify additional Google Docs to Markdown converters that could be used
  • [x] Research and identify whether there is any open-source code that we can use in our own Markdown converter by crediting its authors. This will require you to check the open-source license and decide whether it allows reuse.
  • [x] Find out if this needs to be done formally. From your research, do a competitive and comparative analysis and fill out the HfLA-Website: Google Docs to Markdown Converter Competitive or Comparative Analysis spreadsheet
  • [x] Create a method that will search and return all bulleted and numbered lists in MD syntax.
  • [x] Review links from Bonnie added July 11 at the bottom, especially Figma
  • [x] Review Wins to Review folder (must ask for permission from one of the HfLA Website Team's technical leads or merge team members for access)
  • [x] Submit to 100 Automations
  • [x] Revert Google Script to original
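
For the action item about returning bulleted and numbered lists in MD syntax, a minimal sketch of the formatting step might look like the following. The input shape ({ text, level, ordered }) is an assumption: in Apps Script those values could be collected from each ListItem (for example via getText() and getNestingLevel()), but that collection step is not shown here, and listItemsToMarkdown is a hypothetical name.

```javascript
// Hypothetical formatter: items is an array of { text, level, ordered }.
// Nested items are indented four spaces per level (an assumed convention).
function listItemsToMarkdown(items) {
  const counters = []; // counters[level] = last number used at that level
  return items
    .map((item) => {
      // Returning to a shallower level discards the deeper counters,
      // so a later nested numbered list starts again at 1.
      counters.length = item.level + 1;
      let marker = "-";
      if (item.ordered) {
        counters[item.level] = (counters[item.level] || 0) + 1;
        marker = counters[item.level] + ".";
      }
      return "    ".repeat(item.level) + marker + " " + item.text;
    })
    .join("\n");
}
```

This keeps numbering per nesting level, which is one possible way to handle multileveled lists; it does not address numbered lists that are split by paragraphs.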

Findings

  • Read DR: Use an existing add on converter called Docs to Markdown to learn about the add-on we were using and why we stopped using it.
  • https://github.com/oazabir/GoogleDoc2Html is an open-source project that looks like a promising starting point based on initial testing. Bullets, headings, fonts, images, and color were all preserved. No license is specified. It was the top hit on Google.
Front Matter
layout: guide-pages
title: How to Set Reminders in Slack
provider-link: "/how-to-set-reminders-in-slack"
overview: "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum." 
guide-author:
  - name: "Jane Doe"
    links:
      linked-in: "https://www.linkedin.com/in/jane-doe/"
      github: "https://github.com/jane-doe"
    picture: https://avatars.githubusercontent.com/jane-doe
  • [x] If any updates in the script are needed to make the converter more feasible, please feel free to edit.
  • [x] All outputs must match the original Google Docs.
  • [x] If there is another option (such as a different Docs to MD extension) that (at the very least) returns the same output in MD, please feel free to let us know as well. If the new option returns the same output in MD, then the next step is to fulfill the rest of the tasks list to complete the issue.

Tasks

  • [ ] Create directory with some Google doc guides
  • [ ] Create node script that calls npm package to convert google docs to markdown from specified Google dir and its subdirs to specified output location
  • [ ] Inspect results for Google doc guides
  • [ ] Create script to copy converted documents where build process would pick them up
  • [ ] Manually create a page for listing the guides
  • [ ] Create separate directories for in progress. Add a new document and revised document to this directory.
  • [ ] Manually create a page that shows original and in progress guides.
  • [ ] Create script to auto generate a TOC for these pages
  • [ ] Add auto generate TOC to build process
  • [ ] Add hook to automatically generate new markdown and TOC when a file is created, modified, or deleted.

Resources/Instructions

  • See Resources
  • Message @ethanstrominger via Slack or Email for clarification if needed.

Blockers

  • [ ] Unable to add additional pages to the toolkit. I modified 2FA.html title, description, and short description and none of the changes show up in the toolkit. I also copied 2FA.html to 2FAx.html and made similar changes.
  • [ ] Need info on image sizing. The HTML for images in existing guides specifies a unique class for each image. The classes determine the size.
  • [ ] Need discussion on how/where to store images. The current converter copies images that are pasted into Google Docs to Google storage. Images for the existing HTML pages are stored separately, so they must have been created separately.
  • [ ] Need discussion on syntax for elements that are associated with a class. For example, the guide pages html includes the class "content-section". One potential syntax:

    class-start: content-section

    A bunch of text ...

    class-end
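
To make the proposed marker syntax concrete, here is one way a post-processing step could turn those markers into HTML wrappers. This is only a sketch of the idea under discussion: wrapClassSections is a hypothetical name, the markers are assumed to sit on their own lines, and the markdown="1" attribute assumes the site is rendered with kramdown (Jekyll's default), which uses it to keep parsing Markdown inside the div.

```javascript
// Hypothetical post-processor for the proposed class-start / class-end markers.
// Assumption: each marker is alone on its line (leading spaces are ignored).
function wrapClassSections(markdown) {
  return markdown
    .split("\n")
    .map((line) => {
      const trimmed = line.trim();
      const start = trimmed.match(/^class-start:\s*([\w-]+)$/);
      if (start) {
        // markdown="1" is a kramdown attribute; other renderers may ignore it.
        return '<div class="' + start[1] + '" markdown="1">';
      }
      if (trimmed === "class-end") return "</div>";
      return line; // ordinary content passes through unchanged
    })
    .join("\n");
}
```

Unbalanced markers are not detected here; a real implementation would probably want to report a class-start without a matching class-end.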

abenipa3 avatar Mar 13 '22 21:03 abenipa3

Hi @alyssabenipayo.

Please don't forget to add the proper labels to this issue. Currently, the labels for the following are missing: Size, Role, Feature

To add a label, take a look at Github's documentation here.

Also, don't forget to remove the "missing labels" afterwards. To remove a label, the process is similar to adding a label, but you select a currently added label to remove it.

After the proper labels are added, the merge team will review the issue and add a "Ready for Milestone" label once it is ready for prioritization.

Additional Resources:

github-actions[bot] avatar Mar 13 '22 21:03 github-actions[bot]

Next steps:

  • [x] Discuss this issue with Bonnie.
  • [x] Once we move the Google Apps Script and any relevant file(s) to Markdown Converter folder, the following links need to be updated:
    • [x] Action Items section

      Alyssa has started building the Apps Script as seen in this link. (Link subject to change once we determine where to move the file in HfLA folder)

    • [x] Resources/Instructions section

JessicaLucindaCheng avatar Mar 15 '22 20:03 JessicaLucindaCheng

@alyssabenipayo

  • [x] Include a list of features. In other words, what exactly do we need the md converter to do.
  • [x] Identify existing converter in the issue that is currently used.

@JessicaLucindaCheng

  • [x] Add an action item to identify additional converters that might work.
  • [x] Add an action item to identify whether there is any open-source code that we can use in our own converter by crediting its authors. This will depend on the source code license and whether it allows reuse.
  • [x] Add an action item to do a C&C analysis on it.
  • [x] Make a copy of this C&C: https://docs.google.com/spreadsheets/d/1ePxxsLpdC4MMvJICyMYljHawhs1j-hpFHZFOtB0fdDc/edit#gid=1259306930
  • [x] add a link to the copy of the C&C to this issue.
  • [x] Move Markdown Converter folder to Feature Branches folder

JessicaLucindaCheng avatar Mar 19 '22 22:03 JessicaLucindaCheng

List of Features Required

The following are features of what we need the MD Converter to do:

  • Finds the file (guide) within the folder or drive.
    • File can be found by reference or name if it exists.
  • Reads the contents of the file, including:
    • Headings
    • Bulleted lists
    • Numbered lists
    • Multileveled lists
    • Indentations and spaces
    • Font styles (bold, italics, underline)
    • URLs
    • Images
    • Text blocks, or areas of code nested inside of the document.
  • Formats the contents of the file accordingly.
  • Returns image paths where images were specifically inserted in the document.
  • Creates a folder as an output with the following contents:
    • File as a Markdown, generated by name. (Example: “how-to-set-reminders-in-slack.md”)
    • A folder to save images that were inserted within the Google Doc.

Existing Converter

Docs to Markdown

  • Docs to Markdown - Free, open-source Drive add-on that converts a Google Doc to simple, readable Markdown or HTML. (Available in Google Workspace Marketplace)
  • Link to Extension: https://workspace.google.com/marketplace/app/docs_to_markdown/700168918607
  • The next sections are also mentioned in this issue: https://github.com/hackforla/guides/issues/10#issuecomment-1066190743

Summary:

As of January 8, 2022, the Docs to Markdown Converter released an update, called reckless mode, that removes error/warning messages.

However, a message as seen below still appears at the top of the converted MD file and there are more issues encountered (mentioned in the next section) after converting the document. image

New Issues Encountered

Images are not in the correct order

Expected Output right-order

Actual Output wrong-order

Backslashes appear randomly in the file

Screenshot 2022-03-12 174359

abenipa3 avatar Mar 20 '22 04:03 abenipa3

Progress: Alyssa was working on this and I think I am now the only developer working on it.

  • Reviewed issue description and followed all links.
  • Ran script developed by Alyssa
  • Tried out Google Docs to HTML converter (see findings)
  • Planned out next steps
  • Modified this issue description with actions. Some of the actions are questions.

Blockers: None
Availability: Next week 8 hours.
ETA: 4 - 8 weeks? Will have better sense at end of next week.

ethanstrominger avatar Jun 24 '22 19:06 ethanstrominger

Hi @ethanstrominger, thank you for taking up this issue! Hfla appreciates you :)

Do let fellow developers know about your:

  i. Availability: (When are you available to work on the issue/answer questions other programmers might have about your issue?)
  ii. ETA: (When do you expect this issue to be completed?)

You're awesome!

P.S. - You may not take up another issue until this issue gets merged (or closed). Thanks again :)

github-actions[bot] avatar Jun 26 '22 11:06 github-actions[bot]

Progress: Created a script that recursively converts Google Docs to HTML and saves the HTML to Google Drive. Source and target directories are currently hardcoded. I am currently storing it in a repository on my GitHub and will be meeting with Tamara Snyder, tech lead, to figure out where to store it on HackforLA. Repository is here.

ethanstrominger avatar Jun 26 '22 11:06 ethanstrominger

@ethanstrominger Since you are working on this issue, please move it to the "In Progress" column for the "Project Board". You can do this by clicking on the dropdown arrow (as circled below) and selecting "In Progress". Thanks.

JessicaLucindaCheng avatar Jun 26 '22 18:06 JessicaLucindaCheng

@ethanstrominger

Please add update using the below template (even if you have a pull request). Afterwards, remove the 'To Update !' label and add the 'Status: Updated' label.

  1. Progress: "What is the current status of your project? What have you completed and what is left to do?"
  2. Blockers: "Difficulties or errors encountered."
  3. Availability: "How much time will you have this week to work on this issue?"
  4. ETA: "When do you expect this issue to be completed?"
  5. Pictures (optional): "Add any pictures of the visual changes made to the site so far."

If you need help, be sure to either: 1) place your issue in the developer meeting discussion column and ask for help at your next meeting, 2) put a "Status: Help Wanted" label on your issue and pull request, or 3) put up a request for assistance on the #hfla-site channel.

You are receiving this comment because your last comment was before Tuesday, July 19, 2022 at 12:21 AM PST.

github-actions[bot] avatar Jul 22 '22 07:07 github-actions[bot]

@ethanstrominger

Please add update using the below template (even if you have a pull request). Afterwards, remove the 'To Update !' label and add the 'Status: Updated' label.

  1. Progress: "What is the current status of your project? What have you completed and what is left to do?"
  2. Blockers: "Difficulties or errors encountered."
  3. Availability: "How much time will you have this week to work on this issue?"
  4. ETA: "When do you expect this issue to be completed?"
  5. Pictures (optional): "Add any pictures of the visual changes made to the site so far."

If you need help, be sure to either: 1) place your issue in the developer meeting discussion column and ask for help at your next meeting, 2) put a "Status: Help Wanted" label on your issue and pull request, or 3) put up a request for assistance on the #hfla-site channel.

You are receiving this comment because your last comment was before Tuesday, July 26, 2022 at 12:20 AM PST.

github-actions[bot] avatar Jul 29 '22 07:07 github-actions[bot]

.

ethanstrominger avatar Jul 29 '22 19:07 ethanstrominger

ARCHIVED FOLLOWING TASKS AS NO LONGER APPLICABLE: ARCHIVE: New approach - see

  • [] Call from node script and write to directory
  • [] Create subdirectories that match phase
  • [] Identify where to store Google Docs used for testing.
  • [] Document process for importing and exporting from Google Apps Script and doing source control
  • [] Create automated test using "Gold Data" created when a conversion for a Google doc is deemed successful.
  • [] Create samples to convert
  • [] Write to directory within web site
  • [] Create express server and route for kicking off job
  • [] Make it possible to convert from a shared drive (currently, only user's folders can be accessed)
  • [] Try existing manuals
  • [] Add support for images
  • [] Add support for numbered lists with paragraphs interspersed between items (requires start with number)
  • [] Create a method that replaces double-quoted text in the markdown with backtick-delimited copy-and-paste text, e.g. Enter "/Remind" into the field becomes Enter `/Remind` into the field. More information here.
  • [] Write up proposal for how this will work in proposal.md
  • [] Figure out and write up how to run using clasp from terminal.
  • [] Correct numbering for when lists split
  • [] Figure out where to put documents for converting
  • [] Create a method that would automatically set the full image path for images in generated image folder.
    • The current method returns ![image alt text](/image-1.png)
    • Expected output: ![image alt text](../assets/images/guides/how-to-set-reminders-in-slack/image2.png "image_tooltip")
  • [] Create a method that will automatically generate the front matter as seen below:
  • [] Once the pull request associated with this issue is approved and merged, please update and edit issue #1515 by
    • [] Checking off the dependency for this issue
    • [] If all dependencies are checked off, please move issue #1515 to the New Issue Approval column and remove the Dependency label
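
The double-quote replacement item in the list above could be sketched as a single string transformation. quotesToBackticks is a hypothetical name, and treating both straight and curly double quotes as delimiters is an assumption, since Google Docs often substitutes curly quotes.

```javascript
// Hypothetical helper: turn "quoted" or “quoted” runs into `inline code`.
// Assumption: a quoted run never spans a line break.
function quotesToBackticks(markdown) {
  return markdown.replace(
    /["“]([^"“”\n]+)["”]/g,
    (match, inner) => "`" + inner + "`"
  );
}
```

For example, quotesToBackticks('Enter "/Remind" into the field') gives Enter `/Remind` into the field. Note that this would also rewrite quoted image titles such as "image_tooltip", so it would need to run before image links are generated or skip them explicitly.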

Alyssa built the Apps Script as seen in this link. Alyssa is moving off the project and @ethanstrominger is now working on it.

For testing purposes, "How to Set Reminders in Slack" was used in development of that script.

Work left to be done on the script (note: may use an existing open source project - see findings below).

Initial steps for Ethan (completed):

  • [X] Review work done by Alyssa documented in this issue
  • [X] Investigate existing open source (see findings below)
  • [X] Check with Alyssa if any additional resources created and anything checked into GitHub
  • [X] Find out where to store collateral:
    • [X] Is separate repository an option?
    • [X] If existing repository, where does the collateral get stored
  • [X] Find out if converting to HTML is acceptable. There is an open source project that already does this (see Findings) that does a great job.
    consist of google docs to be converted and HTML or markdown equivalent.
  • [X] Create sample Google docs with one or two features to be tested and one document with all the features.
  • [x] Investigate how to import and export from Google Apps Script. The scripts have to be written and run in Google Apps Script. Import/export is needed to be able to check in to github.
  • [x] Investigate existing markdown to Google Docs apps. This will help in creating scripts.
  • [x] Create initial script either based on Alyssa's script or an open source project for converting a single document to markdown.
  • [x] Modify script to convert all documents in a specified directory

ethanstrominger avatar Jul 31 '22 15:07 ethanstrominger

Progress: Archived the previous repository and started a new repository, https://github.com/ethanstrominger/hfla-googledocs-converter. The previous repository could take the files in a folder and its subfolders in Google Docs and convert them to HTML. However, that approach is now being abandoned in favor of an already completed npmjs package. The new repository has this package installed without any other changes as a starting point.
Blockers: None.
Availability: 10 hours
Plan: Create a node program that calls the npmjs package and document how to use it.
ETA: End of August

ethanstrominger avatar Aug 01 '22 12:08 ethanstrominger

Blockers: See bottom of issue description for info on blockers.

ethanstrominger avatar Aug 02 '22 20:08 ethanstrominger

Progress: Created a document with info about requirements, deployment, "jekyllifying", how to package for npmjs, and more.
Info needed: Bonnie thought Alyssa had made code that converted How to Set Reminders in Slack to a version that looks good on the website and replicates the screenshot on Figma. I sent Alyssa a Slack message.
Blockers: None.
Availability: 0 this week (on vacation), 8 the following week
Plan: When I get back, review the Slack message from Alyssa and potentially adjust my approach; otherwise continue with items on my To Do list.
ETA: Mid Sep. I will either have a working version or have enough info to more accurately predict.

ethanstrominger avatar Aug 19 '22 16:08 ethanstrominger

***** SEE https://docs.google.com/document/d/1Tx17ewpeI8r7u_-nSazJtUPjKsFeiSYWu-qmJAsbV3I/edit# *****

@ethanstrominger Thanks for reaching out to me with this! Posting the following steps here for our team's reference.

Added this comment/documentation under this issue's Resources/Instructions as well.

Here are the steps for producing the MD file with the Apps Script:

  • You must have access to the HfLA Google Drive to view the Google Docs to Markdown Converter folder.
  • Click on the GuidesMarkdownConverter - HfLA. In the first file (Main.gs), we see the following: image
  • Two important parts of the function to keep in mind are:
    • var file = findFileByName("How to Set Reminders in Slack");: For testing purposes (at least in the meantime), let's not change the name here. This locates the file within the folder by name, which in this case is "How to Set Reminders in Slack." image
    • var newFolder = getFolder("MDConverted", true); : After running the command, it creates a new folder called MDConverted in the ROOT directory ("My Drive"/your personal drive).
  • To produce, we click on Run. Note that we need to review permissions to do so. image
  • Click on "Review Permissions" and proceed with granting the script permission to view your drive.
  • At the bottom of the screen, we see the Execution Log, which will tell us when it starts and finishes generating the files. image
  • Then, head to "My Drive" and locate a new folder called "MDConverted" image
  • Click on the folder, and then click on the latest "Output" Folder. image
    • Side Note: Feel free to change the name of the "Output" folder in line 14 of the script here: var subFolder = getFolder("Output-"+getTimeStamp(),true, newFolder);
  • Within the "Output" Folder, we have the following contents:
    • The MD file for "How to Set Reminders in Slack".
    • A new folder that contains all of the downloaded images from the original Google Doc. image
  • The next thing to do from here is download the "Output" folder. image
  • Locate the downloaded "Output" zipped folder and extract it.
  • Move the newly MD-converted file into the website folder (or whichever directory needed to view/edit the file) image
  • Move the images folder into the directory.
  • Jekyll will automatically convert the new MD file into HTML as long as there's a Front Matter as seen below. image
  • As a result, the output will look like this: image
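
For readers without Drive access, the two folder helpers named in the steps above could look roughly like the following. This is a hypothetical reconstruction, not the actual contents of Main.gs: only the names getFolder and getTimeStamp and their call sites come from the steps. The timestamp format and the default for the third parameter are assumptions. The Drive calls used (DriveApp.getRootFolder, getFoldersByName, createFolder) are part of the Apps Script DriveApp service.

```javascript
// Hypothetical reconstruction of the helpers referenced in Main.gs.

// Assumed format: ISO date and time with colons replaced, e.g. 2022-08-20T01-08-00.
function getTimeStamp(date = new Date()) {
  return date.toISOString().slice(0, 19).replace(/:/g, "-");
}

// Returns the folder called `name` inside `parent`, optionally creating it.
// When no parent is passed, the root of "My Drive" is used (Apps Script only).
function getFolder(name, createIfMissing, parent = DriveApp.getRootFolder()) {
  const matches = parent.getFoldersByName(name); // FolderIterator
  if (matches.hasNext()) return matches.next();
  return createIfMissing ? parent.createFolder(name) : null;
}
```

Under these assumptions, getFolder("MDConverted", true) reuses or creates the folder in the root directory, and getFolder("Output-" + getTimeStamp(), true, newFolder) creates a fresh, uniquely named output folder inside it on each run.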

Additional Notes - TOC

  • The _includes\toc.html generates the section and subsection titles for the sticky navigation of the Guide Pages.
    • This is the same method used for 100Automations' Guide Pages as seen [here].
  • Additional Source: [100 Automation All Guides - Live Site].

Caveats

As mentioned earlier, the MD file will automatically be converted into HTML.

However, we had to input/edit the following manually:

  1. Write out the Front Matter.
    • This will automatically convert the MD to HTML.
  2. Arrange the images in the right order.
  3. Edit the image paths so that the MD file will grab the appropriate image from its respective directory.

Please see images below.

Images - Current Output

image image

Images - Expected Output

image image
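
The manual image-path edit listed under Caveats might be automated along these lines. This is a sketch, not the script's current behavior: rewriteImagePaths and the guideSlug parameter are hypothetical, and the target directory is taken from the expected-output example in this issue.

```javascript
// Hypothetical helper: point local image links at the guide's image folder.
// Assumed target layout: ../assets/images/guides/<guide-slug>/<file name>
function rewriteImagePaths(markdown, guideSlug) {
  const base = "../assets/images/guides/" + guideSlug + "/";
  return markdown.replace(
    /!\[([^\]]*)\]\(([^)\s]+)(\s+"[^"]*")?\)/g,
    (match, alt, path, title) => {
      if (/^https?:\/\//.test(path)) return match; // leave remote images alone
      const fileName = path.split("/").pop();      // keep only the file name
      return "![" + alt + "](" + base + fileName + (title || "") + ")";
    }
  );
}
```

Any title such as "image_tooltip" is kept as written. Reordering the images is a separate problem and is not handled here.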

Previous Action Items

It seems a lot of changes were made to this issue's description, but here were the Action Items to continue building this Apps Script:

  • [ ] Create a method that will search and return all bulleted and numbered lists in MD syntax.
  • [ ] Update the following variable var file = findFileByName("How to Set Reminders in Slack"); in Main.gs so that it acts as an add-on for any open documents rather than the name of the file.
  • [ ] The current method only searches for the file name. We want to use this script as an add-on so we can apply it to any guide.
  • [ ] For extra references, ask for permission from HfLA Website Team to access the Apps Script for Wins to Review folder.
  • [ ] Create a method that replaces double-quoted text in the markdown with backtick-delimited copy-and-paste text, e.g. Enter "/Remind" into the field becomes Enter `/Remind` into the field. More information here.
  • [ ] Create a method that would automatically set the full image path for images in generated image folder.
    • The current method returns ![image alt text](/image-1.png)
    • Expected output: ![image alt text](../assets/images/guides/how-to-set-reminders-in-slack/image2.png "image_tooltip")
  • [ ] Create a method that will automatically generate the front matter as seen below:
Front Matter
layout: guide-pages
title: How to Set Reminders in Slack
provider-link: "/how-to-set-reminders-in-slack"
overview: "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum." 
guide-author:
  - name: "Jane Doe"
    links:
      linked-in: "https://www.linkedin.com/in/jane-doe/"
      github: "https://github.com/jane-doe"
    picture: https://avatars.githubusercontent.com/jane-doe
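
A minimal sketch of a front matter generator for the fields shown above follows. buildFrontMatter and the shape of its guide argument are hypothetical. The --- delimiter lines are added because Jekyll only treats a leading block as front matter when it sits between triple-dashed lines. All string values are double-quoted here, which YAML accepts, even though the sample leaves title and picture unquoted.

```javascript
// Hypothetical generator for the guide front matter.
// guide: { title, slug, overview, authors: [{ name, linkedIn, github, picture }] }
function buildFrontMatter(guide) {
  // JSON.stringify yields a double-quoted, escaped string that YAML can read.
  const q = (value) => JSON.stringify(value);
  const lines = [
    "---",
    "layout: guide-pages",
    "title: " + q(guide.title),
    "provider-link: " + q("/" + guide.slug),
    "overview: " + q(guide.overview),
    "guide-author:",
  ];
  for (const author of guide.authors) {
    lines.push("  - name: " + q(author.name));
    lines.push("    links:");
    lines.push("      linked-in: " + q(author.linkedIn));
    lines.push("      github: " + q(author.github));
    lines.push("    picture: " + q(author.picture));
  }
  lines.push("---");
  return lines.join("\n") + "\n";
}
```

The returned string would be prepended to the converted Markdown before the file is written. Where the title, overview, and author details come from (the Google Doc itself or a separate source) is still an open question.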

Please feel free to let me know if you have any further questions. Thanks again for taking on this issue.

abenipa3 avatar Aug 20 '22 01:08 abenipa3