rust-bert
GPTNeo and LanguageGenerator with long prompt text resulting in a panic
In an effort to build up a generated document part by part, I am using a prompt to generate some text, then concatenating the generated text and feeding it back in to keep building the document up.
At some point in the process I hit a panic:
panicked at 'called `Result::unwrap()` on an `Err` value: Torch("index out of range in self\nException raised from operator() at /Users/runner/work/_temp/anaconda/conda-bld/pytorch_1670525682339/work/aten/src/ATen/native/TensorAdvancedIndexing.cpp:1189 (most recent call first):\nframe #0: at::native::index_select_out_cpu_(at::Tensor const&, long long, at::Tensor const&, at::Tensor&)::$_8::operator()(long long, long long) const + 592 (0x10d7f1084 in libtorch_cpu.dylib)\nframe #1: std::__1::__function::__func<at::internal::invoke_parallel(long long, long long, long long, std::__1::function<void (long long, long long)> const&)::$_1, std::__1::allocator<at::internal::invoke_parallel(long long, long long, long long, std::__1::function<void (long long, long long)> const&)::$_1>, void (int, unsigned long)>::operator()(int&&, unsigned long&&) + 148 (0x10d0c44c8 in libtorch_cpu.dylib)\nframe #2: std::__1::__function::__func<at::(anonymous namespace)::_run_with_pool(std::__1::function<void (int, unsigned long)> const&, unsigned long)::$_3, std::__1::allocator<at::(anonymous namespace)::_run_with_pool(std::__1::function<void (int, unsigned long)> const&, unsigned long)::$_3>, void ()>::operator()() + 48 (0x10d0c08d0 in libtorch_cpu.dylib)\nframe #3: c10::ThreadPool::main_loop(unsigned long) + 576 (0x1037b98c8 in libc10.dylib)\nframe #4: void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, c10::ThreadPool::ThreadPool(int, int, std::__1::function<void ()>)::$_0> >(void*) + 72 (0x1037b9e8c in libc10.dylib)\nframe #5: _pthread_start + 148 (0x1b9f3026c in libsystem_pthread.dylib)\nframe #6: thread_start + 8 (0x1b9f2b08c in libsystem_pthread.dylib)\n")', /Users/jason/.cargo/registry/src/github.com-1ecc6299db9ec823/tch-0.9.0/src/wrappers/tensor_generated.rs:7612:87
stack backtrace:
0: rust_begin_unwind
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
1: core::panicking::panic_fmt
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
2: core::result::unwrap_failed
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
3: tch::wrappers::tensor_generated::<impl tch::wrappers::tensor::Tensor>::embedding
4: rust_bert::gpt_neo::gpt_neo_model::GptNeoModel::forward_t
5: <rust_bert::gpt_neo::gpt_neo_model::GptNeoForCausalLM as rust_bert::pipelines::generation_utils::LMHeadModel>::forward_t
6: rust_bert::pipelines::generation_utils::private_generation_utils::PrivateLanguageGenerator::generate_beam_search
7: tch::wrappers::tensor::no_grad
8: rust_bert::pipelines::generation_utils::LanguageGenerator::generate_from_ids_and_past
9: rust_bert::pipelines::generation_utils::LanguageGenerator::generate
10: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
11: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
12: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
13: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
14: tokio::runtime::task::core::Core<T,S>::poll
15: tokio::runtime::task::harness::Harness<T,S>::poll
16: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
17: tokio::runtime::scheduler::multi_thread::worker::Context::run
18: tokio::macros::scoped_tls::ScopedKey<T>::set
19: tokio::runtime::scheduler::multi_thread::worker::run
20: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
21: tokio::runtime::task::core::Core<T,S>::poll
22: tokio::runtime::task::harness::Harness<T,S>::poll
23: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Hoping for a pointer on whether this is a bug or whether I need to manage max_new_tokens and max_tokens better.
Currently I am only setting max_new_tokens.
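For reference, the loop looks roughly like the sketch below. This is not a verbatim copy of my code: the GenerateConfig fields and the generate signature are from memory and vary a little between rust-bert versions, and the GPT-Neo resource setup is elided.

```rust
use rust_bert::gpt_neo::GptNeoGenerator;
use rust_bert::pipelines::generation_utils::{GenerateConfig, LanguageGenerator};

fn main() -> anyhow::Result<()> {
    // GPT-Neo model/config/vocab/merges resources omitted for brevity;
    // `Default::default()` alone would point at the GPT-2 resources.
    let generate_config = GenerateConfig {
        max_new_tokens: Some(64),
        num_beams: 2,
        ..Default::default()
    };
    let generator = GptNeoGenerator::new(generate_config)?;

    let mut document = String::from("An opening prompt for the document.");
    loop {
        // Feed the whole accumulated document back in as the next prompt.
        let output = generator.generate(Some(&[document.as_str()]), None);
        document = output[0].text.clone();
        // Once `document` grows past the model's context window,
        // the forward pass panics with "index out of range in self".
        if document.len() > 50_000 {
            break;
        }
    }
    Ok(())
}
```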
Hello @jschnitzer,
GPTNeo (and most autoregressive language models) has a limited input context length (2048 tokens for GPTNeo by default). This is because these models rely on positional embeddings implemented as an Embedding module: passing a sequence longer than the maximum input length causes an indexing error (out of bounds for the embedding matrix). The issue is therefore not caused by max_tokens or max_new_tokens, but by the total length of the generated document.
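For illustration, the underlying failure can be reproduced directly with tch, independent of rust-bert. This is a minimal sketch: the table sizes are made up, and Tensor::embedding here stands in for the positional embedding lookup inside GptNeoModel::forward_t.

```rust
use tch::{Device, Kind, Tensor};

fn main() {
    // A stand-in for GPT-Neo's positional embedding table:
    // 2048 positions (the default maximum context length) x hidden size.
    let weight = Tensor::randn(&[2048, 64], (Kind::Float, Device::Cpu));
    // Position ids for a 2100-token sequence: ids 2048..2099 have no row
    // in the table, so the lookup panics with "index out of range in self".
    let positions = Tensor::arange(2100, (Kind::Int64, Device::Cpu));
    let _embedded = Tensor::embedding(&weight, &positions, -1, false, false);
}
```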
Ideally this would get caught during generation for more graceful error handling. Could you please share a reproducible example?
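In the meantime, a workaround is to keep the accumulated document within the context window before feeding it back in. A rough sketch of a hypothetical truncate_prompt helper is below; it uses whitespace-separated words as a crude proxy for tokens, so in practice you would want to count tokens with the model's tokenizer and leave headroom for max_new_tokens.

```rust
/// Keep only the tail of the accumulated document so that the next
/// prompt stays within the model's context window.
/// NOTE: word count is only a rough proxy for token count.
fn truncate_prompt(document: &str, max_words: usize) -> String {
    let words: Vec<&str> = document.split_whitespace().collect();
    let start = words.len().saturating_sub(max_words);
    words[start..].join(" ")
}

fn main() {
    let document = "a very long generated document ...";
    // Stay well under the 2048-token context window to leave room
    // for the newly generated tokens.
    let prompt = truncate_prompt(document, 1500);
    println!("{prompt}");
}
```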