Dmitry Labazkin issues

Results: 18 issues of Dmitry Labazkin

Question about implementation of CausalAttention class (3.5.3 Implementing a compact causal self-attention class)

Hi @rasbt, this [notebook](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch03/01_main-chapter-code/ch03.ipynb) contains the following implementation of CausalAttention:

```python
class CausalAttention(nn.Module):
    def __init__(self, d_in, d_out, block_size, dropout, qkv_bias=False):
        super().__init__()
        self.d_out = d_out
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key...
```

Solution for Exercise 3.2 is included in the notebook with main code (3.6.1 Stacking multiple single-head attention layers)

Hi @rasbt, it seems that cell [36] in the [notebook](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch03/01_main-chapter-code/ch03.ipynb) with the main code contains the solution to Exercise 3.2. Thank you.

Probably a typo in multi-head attention description (3.6.1 Stacking multiple single-head attention layers)

Hi @rasbt, I found the following statement in the mentioned section: > Figure 3.24 illustrates the structure of a multi-head attention module, which consists of multiple single-head attention modules, as...

Inconsistencies in output for dropout section (3.5.2 Masking additional attention weights with dropout)

Hi @rasbt, I am trying to explore and reproduce Chapter 3 and found that I can't reproduce results that you specified in the notebook and the book, even if I...

Inconsistencies between the code in the book and the notebooks (2.6 Data sampling with a sliding window)

Hi @rasbt, I noticed that in the book you provide the following code with the function name `create_dataloader` and the argument `stride = max_length + 1` to avoid overlap in data...
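
The excerpt above is cut off, so the following is only a minimal sketch of the general sliding-window idea it refers to, not the book's or the notebook's actual code. The helper name `sliding_windows` and the toy token list are hypothetical; the sketch assumes each target window is the input window shifted by one token, and it compares `stride = max_length` with `stride = max_length + 1`.

```python
def sliding_windows(token_ids, max_length, stride):
    """Return (input, target) pairs; each target is its input shifted by one token.

    Hypothetical helper for illustration only.
    """
    pairs = []
    # Stop early enough that every target window is full length.
    for i in range(0, len(token_ids) - max_length, stride):
        inputs = token_ids[i:i + max_length]
        targets = token_ids[i + 1:i + max_length + 1]
        pairs.append((inputs, targets))
    return pairs


tokens = list(range(10))  # toy "token IDs" 0..9
for stride in (4, 5):
    windows = sliding_windows(tokens, max_length=4, stride=stride)
    print(stride, [inputs for inputs, _ in windows])
```

Under these assumptions, `stride = max_length` produces input windows that sit back to back, while `stride = max_length + 1` leaves one token between consecutive input windows; that token still appears as the last target of the preceding window.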

BooleanOutputParser error when non-English prompt is used

There is an [`LLMChainFilter`](https://github.com/ai-forever/gigachain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter.py#L30) in the `gigachain` legacy API which can be used as an additional filter for the chunks after they are retrieved from the vectorstore. In the original version (`langchain`)...

MemorySaver doesn't store checkpoints in descending order by timestamp

### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the [LangGraph](https://langchain-ai.github.io/langgraph/)/LangChain documentation with the integrated search.
- [X] I used...

DOC: <Issue related to /tutorials/Developers/backtesting>

Hi, I tried to reproduce the example from the documentation for backtesting and also [Backtesting | LangSmith Evaluations - Part 19](https://www.youtube.com/watch?v=3cDtDI2W-xA), but each time I get the following error during...