Ch2: Feedback

Open g4brielvs opened this issue 4 years ago • 2 comments

I am thankful for the opportunity to share our feedback as part of the final review (#476), and I appreciate the effort DIME is putting into disseminating these valuable guidelines and resources.

The chapter addresses a crucial part of a project's success: collaboration. Here are some ideas, especially coming from the angle of the Data Partnership. I'd be more than happy to collaborate.

Ideas

  • More often than not, relying on absolute paths causes trouble. It almost guarantees that your code won't run on any computer other than yours. https://github.com/worldbank/dime-data-handbook/blob/ba0105d6a9a3f779abbb7026e723db8bdecaf792/chapters/2-collaboration.tex#L79

  • I recommend checking the Bank's stance on Dropbox. Alternatively, the Bank supports OneDrive, which, besides being official and offering up to 5TB per account, ensures data classification (Official Only, Confidential, Strictly Confidential). https://github.com/worldbank/dime-data-handbook/blob/ba0105d6a9a3f779abbb7026e723db8bdecaf792/chapters/2-collaboration.tex#L98-L101

  • It would be beneficial to have additional step-by-step examples of how to set up the many recommendations in the chapter. More can be found on the DIME Wiki, but the intended audience might find it helpful to have quick guides or more references to tutorials.

  • The book touches on a very important point about team communication and decisions. However, the section might need elaboration. Using tools like GitHub or Dropbox won't help much unless the team adopts an effective approach to project management, for example Agile, Agile-like, Scrum, or Kanban. Of course, GitHub does support features like GitHub Projects that can dramatically improve the team's performance (and sanity). In a nutshell, what's important here is not the tool but the process.

https://github.com/worldbank/dime-data-handbook/blob/ba0105d6a9a3f779abbb7026e723db8bdecaf792/chapters/2-collaboration.tex#L140-L187

  • Probably out of scope, but it would be great to have a section on cloud computational environments and resources, such as JupyterHub, AWS Sagemaker or Google Colab.
  • Probably out of scope, but Python is an indispensable part of a modern analytics stack, and there are considerations that might be useful when using Python or, more specifically, when working on a data science project.
  • Probably out of scope, same goes for containerization with Docker.
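
To illustrate the first idea about absolute paths, here is a minimal sketch in Python (used only for illustration; the chapter itself is not Python-specific). It assumes the project lives in a Git repository, so a `.git` folder marks the project root. The helper names `project_root` and `data_path` are hypothetical and do not come from the handbook.

```python
from pathlib import Path


def project_root(start: Path, marker: str = ".git") -> Path:
    """Walk upward from `start` until a folder containing `marker` is found.

    Assumption: the project root is identified by a `.git` entry.
    Any other marker file or folder name can be passed instead.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No '{marker}' found in or above {start}")


def data_path(root: Path, *parts: str) -> Path:
    """Build a path relative to the project root instead of hard-coding it."""
    return root.joinpath(*parts)
```

A script would then call something like `data_path(project_root(Path.cwd()), "data", "raw", "survey.csv")`, where the `data/raw/survey.csv` location is only an example. Because nothing depends on a user-specific prefix such as a home directory, the same code should run unchanged on any collaborator's machine.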

g4brielvs · Dec 08 '20 01:12

Sorry, I don't have permission to assign a label to the issue as CONTRIBUTING asks.

g4brielvs · Dec 08 '20 01:12

Also, I am opening #532 against develop to fix a typo.

g4brielvs · Dec 08 '20 01:12