rust-bert
GPTNeo and LanguageGenerator with long prompt text resulting in a panic
In an effort to build up a generated document part by part, I am using a prompt to generate some text, then concatenating the generated text and feeding it back in to keep building the document up.
At some point in the process I hit a panic:
panicked at 'called `Result::unwrap()` on an `Err` value: Torch("index out of range in self\nException raised from operator() at /Users/runner/work/_temp/anaconda/conda-bld/pytorch_1670525682339/work/aten/src/ATen/native/TensorAdvancedIndexing.cpp:1189 (most recent call first):\nframe #0: at::native::index_select_out_cpu_(at::Tensor const&, long long, at::Tensor const&, at::Tensor&)::$_8::operator()(long long, long long) const + 592 (0x10d7f1084 in libtorch_cpu.dylib)\nframe #1: std::__1::__function::__func<at::internal::invoke_parallel(long long, long long, long long, std::__1::function<void (long long, long long)> const&)::$_1, std::__1::allocator<at::internal::invoke_parallel(long long, long long, long long, std::__1::function<void (long long, long long)> const&)::$_1>, void (int, unsigned long)>::operator()(int&&, unsigned long&&) + 148 (0x10d0c44c8 in libtorch_cpu.dylib)\nframe #2: std::__1::__function::__func<at::(anonymous namespace)::_run_with_pool(std::__1::function<void (int, unsigned long)> const&, unsigned long)::$_3, std::__1::allocator<at::(anonymous namespace)::_run_with_pool(std::__1::function<void (int, unsigned long)> const&, unsigned long)::$_3>, void ()>::operator()() + 48 (0x10d0c08d0 in libtorch_cpu.dylib)\nframe #3: c10::ThreadPool::main_loop(unsigned long) + 576 (0x1037b98c8 in libc10.dylib)\nframe #4: void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, c10::ThreadPool::ThreadPool(int, int, std::__1::function<void ()>)::$_0> >(void*) + 72 (0x1037b9e8c in libc10.dylib)\nframe #5: _pthread_start + 148 (0x1b9f3026c in libsystem_pthread.dylib)\nframe #6: thread_start + 8 (0x1b9f2b08c in libsystem_pthread.dylib)\n")', /Users/jason/.cargo/registry/src/github.com-1ecc6299db9ec823/tch-0.9.0/src/wrappers/tensor_generated.rs:7612:87
stack backtrace:
0: rust_begin_unwind
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
1: core::panicking::panic_fmt
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
2: core::result::unwrap_failed
at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:1791:5
3: tch::wrappers::tensor_generated::<impl tch::wrappers::tensor::Tensor>::embedding
4: rust_bert::gpt_neo::gpt_neo_model::GptNeoModel::forward_t
5: <rust_bert::gpt_neo::gpt_neo_model::GptNeoForCausalLM as rust_bert::pipelines::generation_utils::LMHeadModel>::forward_t
6: rust_bert::pipelines::generation_utils::private_generation_utils::PrivateLanguageGenerator::generate_beam_search
7: tch::wrappers::tensor::no_grad
8: rust_bert::pipelines::generation_utils::LanguageGenerator::generate_from_ids_and_past
9: rust_bert::pipelines::generation_utils::LanguageGenerator::generate
10: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
11: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
12: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
13: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
14: tokio::runtime::task::core::Core<T,S>::poll
15: tokio::runtime::task::harness::Harness<T,S>::poll
16: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
17: tokio::runtime::scheduler::multi_thread::worker::Context::run
18: tokio::macros::scoped_tls::ScopedKey<T>::set
19: tokio::runtime::scheduler::multi_thread::worker::run
20: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
21: tokio::runtime::task::core::Core<T,S>::poll
22: tokio::runtime::task::harness::Harness<T,S>::poll
23: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Hoping for a pointer on whether this is a bug or whether I need to manage max_new_tokens and max_tokens better.
Currently I am only setting max_new_tokens.
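For reference, the loop looks roughly like the sketch below. This is not a verbatim copy of my code: the GenerateConfig fields and the generate signature are from memory and vary a little between rust-bert versions, and the GPT-Neo resource setup is elided.

```rust
use rust_bert::gpt_neo::GptNeoGenerator;
use rust_bert::pipelines::generation_utils::{GenerateConfig, LanguageGenerator};

fn main() -> anyhow::Result<()> {
    // GPT-Neo model/config/vocab/merges resources omitted for brevity;
    // `Default::default()` alone would point at the GPT-2 resources.
    let generate_config = GenerateConfig {
        max_new_tokens: Some(64),
        num_beams: 2,
        ..Default::default()
    };
    let generator = GptNeoGenerator::new(generate_config)?;

    let mut document = String::from("An opening prompt for the document.");
    loop {
        // Feed the whole accumulated document back in as the next prompt.
        let output = generator.generate(Some(&[document.as_str()]), None);
        document = output[0].text.clone();
        // Once `document` grows past the model's context window,
        // the forward pass panics with "index out of range in self".
        if document.len() > 50_000 {
            break;
        }
    }
    Ok(())
}
```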
Hello @jschnitzer,
GPTNeo (and most autoregressive language models) has a limited input context length (2048 tokens for GPTNeo by default). This is because these models rely on positional embeddings implemented as an Embedding module: passing a sequence longer than the maximum input length causes an indexing error (out of bounds for the embedding matrix). The issue is therefore not caused by max_tokens or max_new_tokens, but by the total length of the generated document.
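For illustration, the underlying failure can be reproduced directly with tch, independent of rust-bert. This is a minimal sketch: the table sizes are made up, and Tensor::embedding here stands in for the positional embedding lookup inside GptNeoModel::forward_t.

```rust
use tch::{Device, Kind, Tensor};

fn main() {
    // A stand-in for GPT-Neo's positional embedding table:
    // 2048 positions (the default maximum context length) x hidden size.
    let weight = Tensor::randn(&[2048, 64], (Kind::Float, Device::Cpu));
    // Position ids for a 2100-token sequence: ids 2048..2099 have no row
    // in the table, so the lookup panics with "index out of range in self".
    let positions = Tensor::arange(2100, (Kind::Int64, Device::Cpu));
    let _embedded = Tensor::embedding(&weight, &positions, -1, false, false);
}
```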
Ideally this would get caught during generation for more graceful error handling. Could you please share a reproducible example?
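In the meantime, a workaround is to keep the accumulated document within the context window before feeding it back in. A rough sketch of a hypothetical truncate_prompt helper is below; it uses whitespace-separated words as a crude proxy for tokens, so in practice you would want to count tokens with the model's tokenizer and leave headroom for max_new_tokens.

```rust
/// Keep only the tail of the accumulated document so that the next
/// prompt stays within the model's context window.
/// NOTE: word count is only a rough proxy for token count.
fn truncate_prompt(document: &str, max_words: usize) -> String {
    let words: Vec<&str> = document.split_whitespace().collect();
    let start = words.len().saturating_sub(max_words);
    words[start..].join(" ")
}

fn main() {
    let document = "a very long generated document ...";
    // Stay well under the 2048-token context window to leave room
    // for the newly generated tokens.
    let prompt = truncate_prompt(document, 1500);
    println!("{prompt}");
}
```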