Ruhollah Majdoddin
@sgugger @lewtun Why are the inputs truncated at 128 tokens, although the model can take 512 tokens?

```python
max_input_length = 128
model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)
```

And this is...
I got this error while compiling on Debian 12 (bookworm) with kernel 6.1.0-12-amd64:

```
/var/lib/dkms/rtl88x2bu/5.13.1/build/os_dep/linux/wifi_regd.c:409:36: error: ‘REGULATORY_IGNORE_STALE_KICKOFF’ undeclared (first use in this function)
  409 |         wiphy->regulatory_flags |= REGULATORY_IGNORE_STALE_KICKOFF;
      |                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```

I commented...
For example, you want to count 2 (not 4) occurrences of the pair 'aa' in the text 'aaaaa', because merge() can replace it only 2 times. In other words, the counted...
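To make the counting rule concrete, here is a minimal sketch of non-overlapping pair counting; `count_mergeable` is a hypothetical helper for illustration, not the PR's code:

```python
def count_mergeable(ids, pair):
    """Count occurrences of `pair` that merge() could actually replace."""
    count = 0
    i = 0
    while i < len(ids) - 1:
        if (ids[i], ids[i + 1]) == pair:
            count += 1
            i += 2  # skip past the matched pair: occurrences must not overlap
        else:
            i += 1
    return count

ids = list("aaaaa")
print(count_mergeable(ids, ("a", "a")))  # 2, not the 4 overlapping matches
```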
## **User description** This PR is based on the observation that ChatGPT is not able to do the computation necessary to calculate the output of the whole code for a...
This device seems to have the Realtek RTL8812BU chipset, not mt7921aun. Source: https://alfa-network.eu/awus036acu
This PR reimplements RegexTokenizer._encode_chunk() using dynamic programming to return a guaranteed-optimal tokenization of a chunk, i.e. one with the minimum number of tokens. After training, the vocabulary is fixed. During encoding, _encode_chunk() is...
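A minimal sketch of the dynamic-programming idea, assuming a reverse vocabulary map from byte strings to token ids (the `vocab` dict and function name here are hypothetical, not the PR's code):

```python
def encode_chunk_optimal(chunk: bytes, vocab: dict[bytes, int]) -> list[int]:
    """Return a minimum-token encoding of `chunk` over a fixed vocabulary."""
    n = len(chunk)
    best = [None] * (n + 1)   # best[i] = fewest tokens covering chunk[:i]
    back = [None] * (n + 1)   # back[i] = start index of the token ending at i
    best[0] = 0
    for i in range(1, n + 1):
        for j in range(i):
            piece = chunk[j:i]
            if piece in vocab and best[j] is not None:
                if best[i] is None or best[j] + 1 < best[i]:
                    best[i] = best[j] + 1
                    back[i] = j
    # With BPE every single byte is in the vocab, so a solution always exists.
    # Reconstruct the token sequence by walking the backpointers.
    ids, i = [], n
    while i > 0:
        j = back[i]
        ids.append(vocab[chunk[j:i]])
        i = j
    return ids[::-1]

vocab = {b"a": 0, b"aa": 1, b"aaa": 2, b"b": 3, b"ab": 4}
print(encode_chunk_optimal(b"aaaab", vocab))  # [2, 4]: "aaa" + "ab", 2 tokens
```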
In `RegexTokenizer`, the training text is initially split into chunks, and further processing is performed on individual chunks. This PR optimizes the process by retaining only unique chunks and their...
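A minimal sketch of the deduplication idea, using a GPT-2-style split pattern for illustration (the pattern and names here are assumptions, not the PR's code). Weighting pair counts by chunk frequency yields the same statistics as processing every repetition of a chunk:

```python
from collections import Counter
import regex as re

# Simplified GPT-2-style split pattern, for illustration only.
SPLIT_PATTERN = r"'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+"

def unique_chunks(text: str) -> Counter:
    """Split text into chunks, keeping each distinct chunk once with its count."""
    return Counter(re.findall(SPLIT_PATTERN, text))

chunk_counts = unique_chunks("the cat sat on the mat, the cat sat")
# Pair statistics can then be accumulated as pair_count += chunk_count,
# instead of revisiting every repetition of the same chunk.
print(chunk_counts.most_common(3))
```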
This PR realizes the wish expressed in the current code to use the faster `Regex`. Before tokenization, the text is split into pieces according to regular expression patterns. This PR drops...
This PR introduces a caching mechanism in `_encode_ordinary_native()`, which stores the tokens for each "piece" of text. When a piece of text is repeated, its tokens are retrieved from the...
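A minimal Python sketch of the caching idea (the actual change targets `_encode_ordinary_native()`; `encode_piece` here is a hypothetical stand-in for the expensive per-piece BPE merge loop):

```python
def encode_with_cache(pieces, encode_piece, cache=None):
    """Encode pieces in order, reusing the tokens of pieces seen before."""
    if cache is None:
        cache = {}
    tokens = []
    for piece in pieces:
        if piece not in cache:
            cache[piece] = encode_piece(piece)  # computed once per unique piece
        tokens.extend(cache[piece])
    return tokens

# Repeated pieces (common in natural text) hit the cache:
print(encode_with_cache(["he", "llo", "he"], lambda p: [ord(c) for c in p]))
```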
… for beam search or best-of-n sampling. If run with a batch of size 1, the current code does not raise an error, because of PyTorch's broadcasting. But with a bigger batch...
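A minimal, hypothetical illustration of how broadcasting can hide a shape bug at batch size 1 and only surface at larger batch sizes (the shapes here are invented for demonstration, not taken from the PR):

```python
import torch

beams, vocab = 2, 5

# Bug: a per-beam tensor is built with shape (beams, 1) instead of
# (batch * beams, 1). At batch == 1 the two shapes coincide.
offset = torch.arange(beams).unsqueeze(1)   # (2, 1)

scores = torch.zeros(1 * beams, vocab)      # batch = 1 -> (2, 5)
print((scores + offset).shape)              # broadcasts silently: (2, 5)

scores = torch.zeros(3 * beams, vocab)      # batch = 3 -> (6, 5)
# (6, 5) + (2, 1) cannot broadcast; uncommenting this raises a RuntimeError:
# scores + offset
```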