Luke Friedrichs

Results: 10 issues by Luke Friedrichs

Causes an assertion error:

```
=========================== short test summary info ============================
FAILED tests/test_example_concept_learning_neural_evaluation.py::TestConceptLearningCV::test_cv - AssertionError: Search tree is empty. Ensure that there is at least one owl:Class or owl:ObjectProperty definitions
============= 1...
```

https://github.com/dice-group/dice-embeddings/blame/674e9f5e521e304691ef063f9f79b23e0a5f8ef2/retrieval_aug_predictors/models/RALP.py#L59C2-L59C64 Why do we use the gpt-3.5-turbo tokenizer here? Is this the one used by the LM we are currently using? Also, shouldn't the tokenizer vary depending on the model used?...

question
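A minimal sketch of what "variable depending on the model" could look like: a lookup from model name to tokenizer encoding with an explicit fallback, instead of hardcoding the gpt-3.5-turbo tokenizer. The helper name `encoding_for` and the fallback choice are assumptions for illustration; the model-to-encoding names follow tiktoken's published mapping.

```python
# Hypothetical helper (not in RALP.py): pick the tokenizer encoding from the
# model name rather than hardcoding gpt-3.5-turbo. The returned name can be
# passed to tiktoken.get_encoding(...).
MODEL_TO_ENCODING = {
    "gpt-3.5-turbo": "cl100k_base",
    "gpt-4": "cl100k_base",
    "text-davinci-003": "p50k_base",
}

def encoding_for(model_name: str, default: str = "cl100k_base") -> str:
    """Return the tokenizer encoding name for a model, with a fallback default."""
    return MODEL_TO_ENCODING.get(model_name, default)
```

Unknown or locally hosted LMs fall through to the default, which keeps token counting approximate but well-defined.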

WIP: it works for negative sampling, but inference is extremely slow (see below): https://github.com/dice-group/dice-embeddings/tree/BET Instead of learning one embedding per entity, BET encodes the raw bytes of the entity and relation...

enhancement
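A minimal sketch of the byte-level idea described above, under stated assumptions: this is not the BET branch's actual code, and the function name `byte_encode`, the padding id, and the length cap are illustrative. The point is that entity and relation labels become fixed-length integer sequences over their raw UTF-8 bytes, so no per-entity embedding table is needed.

```python
# Sketch (assumed, not from the BET branch): turn an entity or relation label
# into a padded sequence of its raw UTF-8 bytes (ids 0-255, pad id 256),
# ready for a byte-level encoder instead of a per-entity embedding lookup.
def byte_encode(label: str, max_len: int = 32, pad: int = 256) -> list[int]:
    """UTF-8 byte ids of the label, truncated/padded to max_len."""
    ids = list(label.encode("utf-8"))[:max_len]
    return ids + [pad] * (max_len - len(ids))
```

Since every entity shares one byte-level encoder, memory no longer scales with the number of entities, which also hints at why inference is slow: each score requires running the encoder rather than a table lookup.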

We should run benchmarks for the CoKE model and add its metrics (MRR, Hits@1/3/10, datasets, training setup) to the benchmarking tables in the README.

enhancement
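For the README tables above, MRR and Hits@1/3/10 can be computed from the 1-based ranks of the true entities. A minimal sketch, with the helper name `mrr_and_hits` chosen for illustration:

```python
# Sketch: MRR and Hits@k from the 1-based ranks assigned to the true
# entities in link prediction, as typically reported in benchmark tables.
def mrr_and_hits(ranks, ks=(1, 3, 10)):
    """Return (MRR, {k: Hits@k}) for a list of 1-based ranks."""
    n = len(ranks)
    mrr = sum(1.0 / r for r in ranks) / n          # mean reciprocal rank
    hits = {k: sum(r <= k for r in ranks) / n for k in ks}  # fraction ranked in top k
    return mrr, hits
```

Whether ranks are filtered (known true triples removed from the candidate list) should be stated alongside the numbers, since it changes the metrics.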

We could add wandb for tracking loss curves, hyperparameters, evaluation results, ...: https://github.com/wandb/wandb

enhancement
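One way to wire this in without making wandb a hard dependency is a small logging shim: when wandb is requested it delegates to `wandb.init(...)`/`run.log(...)`, otherwise it records metrics in memory. The factory name `get_logger` and the fallback behaviour are assumptions for illustration.

```python
# Sketch: optional wandb logging. With use_wandb=True this would call the
# real wandb API (wandb.init / run.log); otherwise metrics are kept locally
# so training code can log unconditionally.
def get_logger(use_wandb: bool = False, project: str = "dice-embeddings"):
    """Return a callable log(metrics_dict); wandb-backed only when requested."""
    if use_wandb:
        import wandb  # imported lazily so wandb stays an optional dependency
        run = wandb.init(project=project)
        return run.log
    history = []
    def log(metrics):
        history.append(dict(metrics))
        return history
    return log
```

Training code then calls `log({"loss": loss, "lr": lr})` per step regardless of whether wandb is enabled.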

Otherwise you cannot use the DeepSpeed trainer, for instance (via the --strategy argument). Also, the deepspeed package is not installed by default right now; maybe we want to add it...

For different batch_sizes I observed a quadratic memory increase, i.e.:

```
256 -> CUDA out of memory. Tried to allocate 37.61 GiB and
512 -> ... 150.42 GiB and
1024...
```
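A quick sanity check on the reported numbers: doubling the batch size from 256 to 512 roughly quadruples the requested allocation, which matches an O(B²) term (e.g. a B×B in-batch score matrix, an assumption on my part) rather than the linear growth one would expect from activations alone.

```python
# Reported allocations from the CUDA OOM messages above (GiB).
mem = {256: 37.61, 512: 150.42}

# Doubling the batch size multiplies the allocation by ~4 == (512/256)**2,
# i.e. memory grows quadratically in the batch size, not linearly.
ratio = mem[512] / mem[256]
print(round(ratio, 2))
```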

This: https://github.com/deepspeedai/DeepSpeed/blob/53e91a098d0a0666ac8cb8025a5b36e5af172d08/.gitignore#L61C1-L61C6 ignores the [multi_tensor_apply.cuh](https://github.com/deepspeedai/DeepSpeed/blob/master/csrc/adam/multi_tensor_apply.cuh) file, which prevents FusedAdam from working with a cloned copy of DeepSpeed, since the file is never pushed to the remote.

**Describe the bug** Importing `deepspeed.layer.moe` raises this ValueError: `ValueError: Target parameter "qkv_w" not found in this layer. Valid targets are []` from: https://github.com/deepspeedai/DeepSpeed/blob/e993fea38efe654592b956d1ab52e340bfbf9714/deepspeed/inference/v2/model_implementations/layer_container_base.py#L97-L99 and this ValueError: `...

bug
training

I have implemented some custom logic in the deepspeed_moe classes, and having "expert" in any parameter name breaks the checkpoint saving function. The warning triggers because the code finds...
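A minimal reproduction of the failure mode described above, under the assumption (mine, based on the symptom) that the checkpointing code classifies expert parameters by substring-matching "expert" in the parameter name. Any unrelated custom parameter containing that substring then gets swept into the expert branch.

```python
# Sketch of the assumed classification: a plain substring test on parameter
# names, so a custom parameter like "my_expert_score" is misclassified as an
# MoE expert weight even though it is not one.
def split_params(names):
    """Partition parameter names into (assumed) expert and non-expert groups."""
    experts = [n for n in names if "expert" in n]
    others = [n for n in names if "expert" not in n]
    return experts, others
```

A more robust check would match the actual module path (e.g. a dedicated experts submodule) instead of a bare substring.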