cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
This pull request adds automated parameter calculation for all Hugging Face models. Expected behaviour:
```
python transformer_mem.py --hf_model_name_or_path meta-llama/Llama-2-7b-hf --num-gpus 8 --zero-stage 3 --batch-size-per-gpu 2...
```
Stas Bekman had the idea of supporting a HuggingFace model as input so that the model architecture settings don't need to be dug up manually. We'd like something like:
```
python transformer_mem.py...
```
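A minimal sketch of how that lookup could work, assuming `transformers` is installed. The field names below are the llama-style ones and vary by architecture, and the `hparams_from_hf` helper is hypothetical, not the PR's actual code:

```python
# Hypothetical sketch: pull architecture settings from a Hugging Face config
# instead of asking the user to supply them by hand. Not the PR's code.
from transformers import AutoConfig

def hparams_from_hf(name_or_path: str) -> dict:
    # Downloads only config.json (gated repos like Llama-2 need HF auth).
    cfg = AutoConfig.from_pretrained(name_or_path)
    return {
        "vocab_size": cfg.vocab_size,
        "hidden_size": cfg.hidden_size,
        "num_layers": cfg.num_hidden_layers,
        "num_attention_heads": cfg.num_attention_heads,
        # Not all architectures define intermediate_size; fall back to 4*h.
        "ffn_size": getattr(cfg, "intermediate_size", 4 * cfg.hidden_size),
    }

print(hparams_from_hf("meta-llama/Llama-2-7b-hf"))
```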
Addresses https://github.com/EleutherAI/cookbook/issues/36. Before:
```
$ python calc/calc_transformer_mem.py --infer --high-prec-bytes-per-val 4 --low-prec-bytes-per-val 1 --num-gpus 2 --zero-stage 3 -ca -b 1 -s 1024 -v 152064 -hs 8192 -a 64 -l...
```
Running `calc_transformer_mem.py` with the hyperparameters for Qwen1.5-72B reports 56.19 billion parameters, while the real number is around 72 billion: `python calc_transformer_mem.py --infer --high-prec-bytes-per-val 4 --low-prec-bytes-per-val 1...`
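A plausible source of such an undercount is a hard-coded 4·hidden, non-gated FFN and tied embeddings; counting with the model's actual shapes recovers roughly 72B. A back-of-the-envelope sketch, where the vocab/hidden/head values come from the command above and the layer count and FFN width are my assumptions about the Qwen1.5-72B config (verify against the model card):

```python
# Back-of-the-envelope count for a llama-style (gated-MLP, untied-embedding)
# model. L=80 and f=24576 are assumptions about the Qwen1.5-72B config,
# not values taken from this issue -- verify against the model card.
V, h, L, f = 152_064, 8192, 80, 24_576  # vocab, hidden, layers, ffn width

embeddings = 2 * V * h   # untied input and output embeddings
attn = 4 * h * h         # Q, K, V, O projections (full MHA, no GQA)
mlp = 3 * h * f          # gate, up, and down projections (SwiGLU)
total = embeddings + L * (attn + mlp)  # norms and biases omitted (tiny)

print(f"{total / 1e9:.2f}B parameters")  # -> 72.28B
```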
As per the title. This arg exists in the other two scripts but was missing from `calc_transformer_flops.py`.
Would be good to add I/O benchmarks in the style of existing communication and computation benchmarks.
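For example, a minimal sequential-write throughput probe could look like the sketch below. The path and sizes are placeholders, and a real benchmark would also cover reads, random access, and page-cache effects:

```python
# Minimal sketch of a sequential-write throughput benchmark, in the spirit
# of the existing communication/computation benchmarks. Illustrative only.
import os
import time

def write_throughput(path: str, total_bytes: int, block_bytes: int = 1 << 20) -> float:
    block = os.urandom(block_bytes)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(total_bytes // block_bytes):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # make sure data actually reaches disk
    elapsed = time.perf_counter() - start
    os.remove(path)
    return total_bytes / elapsed / 1e9  # GB/s

print(f"{write_throughput('/tmp/io_bench.bin', 1 << 30):.2f} GB/s")
```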
Currently the model directory webpage at https://github.com/EleutherAI/cookbook/tree/main/model-directory isn't live and is entirely undocumented.
- [ ] Make model directory webpage live
- [ ] Add model hparam setting html page and...
Would be good to model the communication volume in bytes of a given parallelism setup (see the sketch after this list for a starting point). Situations to model:
- Different parallelism schemes
  - ZeRO-1/2/3, ZeRO++
  - 3D parallelism
- Activation...
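As a starting point, here is a back-of-the-envelope sketch of per-GPU data-parallel volume following the ZeRO paper's accounting (an all-reduce moves roughly 2x the message size; ZeRO-3 adds parameter all-gathers for roughly 3x). The multipliers ignore interconnect topology, overlap, and ZeRO++ style compression:

```python
# Back-of-the-envelope data-parallel communication volume per GPU per step,
# following the ZeRO paper's accounting. Ignores topology, overlap, and
# message-size effects; ZeRO++ compression is not modeled.
def dp_comm_volume_bytes(num_params: int, bytes_per_elem: int = 2, zero_stage: int = 0) -> int:
    if zero_stage in (0, 1, 2):
        # gradient all-reduce (reduce-scatter + all-gather): ~2 * psi
        multiplier = 2
    elif zero_stage == 3:
        # plus parameter all-gathers in forward and backward: ~3 * psi
        multiplier = 3
    else:
        raise ValueError(f"unknown ZeRO stage {zero_stage}")
    return multiplier * num_params * bytes_per_elem

for stage in (0, 1, 2, 3):
    gb = dp_comm_volume_bytes(7_000_000_000, zero_stage=stage) / 1e9
    print(f"ZeRO-{stage}: {gb:.0f} GB moved per step (7B params, bf16)")
```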
As recently pointed out in https://arxiv.org/abs/2401.00448, inference FLOPs are also important, and it would be easy to add a flag to https://github.com/EleutherAI/cookbook/blob/main/calc/calc_transformer_flops.py for the inference and training+inference cases.
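A sketch of what such a flag could compute, using the standard first-order approximations (~6·N·D FLOPs for training, ~2·N·D for forward-only inference); the function names and token counts below are hypothetical, not the script's actual interface:

```python
# Standard first-order FLOP approximations from the scaling-law literature:
# a forward pass costs ~2*N FLOPs per token, a training step ~6*N
# (forward + backward). Names and values here are illustrative.
def training_flops(n_params: float, n_train_tokens: float) -> float:
    return 6.0 * n_params * n_train_tokens

def inference_flops(n_params: float, n_infer_tokens: float) -> float:
    return 2.0 * n_params * n_infer_tokens

N, D_train, D_infer = 7e9, 2e12, 1e12  # hypothetical model and token budgets
total = training_flops(N, D_train) + inference_flops(N, D_infer)
print(f"train+inference: {total:.3e} FLOPs")
```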
While the calc scripts are correct for llama-style models, their implementation is inflexible (see https://github.com/EleutherAI/cookbook/issues/36 and https://github.com/EleutherAI/cookbook/pull/35). It'd be nice to clean this up a bit.