center-randomize
center-randomize copied to clipboard
[TODO] adapt to run in google collab with google sheets as datasource
The script runs in collab but requires manually uploading files to collab vm. Adapt the script such that it takes input from google sheet and outputs to sheet
The goal is to make this easy to run for non technical folks directly from a browser without having to setup anything
I have created a google colab document based on the current code. Google codelab doesn't support GUI based file selection from the google drive. Hence, files cannot be selected unless url is pasted manually. That is more confusing. Hence I implemented local file upload support. A copy of timestamped output is saved in google drive and downloaded locally at the end: Google Colab
Removed
- Removed argparse
Implemented
- google drive authentication for drive directory access
- local tsv file selection prompts
- google drive timestamped folder creation and output upload
- local download of final output
Need to Implement
- tsv file validation
- graceful handling of keyboard interruptions
This looks promising. Would like to retain the CLI interface as well. Is there a way to share code between the CLI interface and the collab one, it seems we are copying code to ipynb. we could refactor to separate core logic from CLI stuff to separate files. Is is possible to source collab code block directly from github raw url ?
We could have a single google spreadsheet file with three sheets - schools, centers, prefs. This would be mean the entire workflow is browser based. Having to copy a single url to a variable to achieve this may be an easy enough solution
#19
@sumanashrestha It's possible to share the code between the two. Instead of copying the code to the notebook, you can simply clone the github repo on the colab notebook and run the cli directly.
I have implemented a version, where the data is present in the Google sheet here and the notebook automatically downloads the tsv files from the sheet and runs the code from this repo. If the google sheet is modified, you just run the code again.
Here is the colab notebook
You just have to make sure that for the sheet you use, you turn on sharing to "Anyone with the link" can view.
This doesn't mean the sheet is public to everyone. Only those with the link can view it. As such, you can keep a copy of the sheet privately in the ministry's account if that's the intention. Alternatively, you can also make the sheet public, but only give edit access to authorized personnel in the ministry. With that, people can suggest changes via comments, but only authorized personnel can make final changes.
Use gspread without API Authentication
You can also use gspread
without API authentication if you're working with your own Google account and do not need to access private spreadsheets.
Pre-requisites
- Save files in Google Drive (Python library will search the entire Google Drive for sheet names)
- file type: Google Sheets
Code Considerations
-
Configure Google Sheet Access
import gspread
from google.auth import default
def authorize_google_sheets():
"""Authorize and return the gspread client."""
#Authenticate google sheet
creds, _ = default()
return gspread.authorize(creds)
-
Reading School and Center data
def read_google_sheet(sheet_name: str) -> List[Dict[str, str]]:
"""Read data from a Google Sheet."""
gc = authorize_google_sheets()
sheet = gc.open(sheet_name).sheet1
return sheet.get_all_records()
-
Reading Preference data
def read_google_sheet_prefs(sheet_name: str) -> Dict[str, Dict[str, int]]:
"""Read preference data from a Google Sheet."""
prefs = {}
gc = authorize_google_sheets()
sheet = gc.open(sheet_name).sheet1
data = sheet.get_all_records()
for row in data:
scode = row.get('scode', '')
cscode = row.get('cscode', '')
pref = int(row.get('pref', 0))
if scode:
prefs.setdefault(scode, {}).setdefault(cscode, 0)
prefs[scode][cscode] += pref
return prefs
-
Other Code Sections
schools = sorted(read_google_sheet('schools_grade12_2081'), key= school_sort_key)
centers = read_google_sheet('centers_grade12_2081')
prefs = read_google_sheet_prefs('prefs')
We can offer the option to customize sheet names dynamically by either prompting the user or utilizing environment variables. When using 'gspread' with API authentication, you will need to create a service account and obtain a JSON key file from the Google Cloud Console. This may be more complicated for non-techies.