
Standardize nomenclature

Open javicid opened this issue 6 years ago • 0 comments

Standardize nomenclature in the source code and documentation according to:

  1. dislib.Array instances are referred to as ds-array.
  2. NumPy array instances are referred to as ndarray.
  3. x and y should be used for ds-arrays representing samples and target values.
  4. x_np and y_np should be used for ndarrays representing samples and target values.
  5. A NumPy array or csr_matrix that is a part of a ds-array should be named block.
  6. When iterating ds-arrays horizontally and vertically, hblock and vblock should be used to refer to sets of blocks.
  7. Tasks that receive a set of blocks as an input parameter should name this parameter blocks.
  8. In the documentation, variable x should be described as 'Training samples'.
  9. In the documentation, variable y should be described as 'Target values'.
  10. Optional arguments should be documented using the format: "int, optional (default=0)"
  11. Input ds-arrays should be documented using the format: "ds-array, shape=(n_samples, n_features)"
  12. The name of functions that are tasks should start with _.
  13. The name of files and functions that are not supposed to be accessed by users should start with _.
  14. Estimators need to be implemented in a base.py file in a separate sub-folder inside the appropriate submodule. Additional files can be included in the same subfolder named with a leading _.
  15. Other typical variable names:
  • number of something = n_something
  • max_iter
  • arity
  • tol = tolerance criterion
  • random_state
  • verbose
  • check_convergence (whether to check for convergence)

javicid Sep 03 '19 09:09