Notes-for-Stanford-CS224N-NLP-with-Deep-Learning
Minor formula notation issue for lecture 1 note
Dear Jack, Thank you so much for this wonderful knowledge share. My encounters with you have made me better in this field!
There are a few very minor notation issues in the lecture 1 note. Everything else is just perfect!
- Formula (1) should be conditioned on the center word $w_t$, so the likelihood should be $$Likelihood = L(\theta) = \prod_{t=1}^{T} \prod_{\substack{-m \leq j \leq m \\ j \neq 0}} P(w_{t+j} \mid w_{t}; \theta)$$
The reason for using $j \neq 0$ rather than $j \neq m$ is that $j = 0$ refers to the center word $w_t$ itself. The same issue applies to the other formulas.
- For formula (3), the description of the prediction function should read: the function that predicts the context word $v_{o}$, given the center word $v_{c}$ and the vocabulary $V$. Similarly, formula (5) should be $$J(\mathbf{u_{o} \;|\; v_{c}})$$
- In the section on the partial derivative with respect to $v_c$, step 3 is $$\frac{\partial}{\partial \mathbf{v_c}}\exp(\mathbf{u_{x}^{T} v_{c}}) = \exp(\mathbf{u_{x}^{T} v_{c}}) \cdot \frac{\partial}{\partial \mathbf{v_c}}\mathbf{u_{x}^{T} v_{c}} = \exp(\mathbf{u_{x}^{T} v_{c}}) \cdot \mathbf{u_{x}}\tag{Step 3}$$ I think $u_x$ here should be written in bold, $\mathbf{u_{x}}$, to indicate that it is a vector.
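
As a quick sanity check of the points above, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function names (`context_offsets`, `exp_dot`, `analytic_grad`, `numeric_grad`) and the example vectors are made up and do not come from the lecture note. It lists the window offsets with $j \neq 0$ and compares the Step 3 gradient against a finite-difference estimate, which shows that the result is a vector with the same shape as $\mathbf{u_{x}}$.

```python
import math


def context_offsets(m):
    """Offsets j with -m <= j <= m and j != 0 (j = 0 is the center word itself)."""
    return [j for j in range(-m, m + 1) if j != 0]


def exp_dot(u, v):
    """exp(u^T v) for two equal-length vectors given as plain lists."""
    return math.exp(sum(ui * vi for ui, vi in zip(u, v)))


def analytic_grad(u, v):
    """Step 3: d/dv exp(u^T v) = exp(u^T v) * u, a vector with the shape of u."""
    scale = exp_dot(u, v)
    return [scale * ui for ui in u]


def numeric_grad(u, v, eps=1e-6):
    """Central finite differences, perturbing one coordinate of v at a time."""
    grad = []
    for i in range(len(v)):
        v_plus = v[:i] + [v[i] + eps] + v[i + 1:]
        v_minus = v[:i] + [v[i] - eps] + v[i + 1:]
        grad.append((exp_dot(u, v_plus) - exp_dot(u, v_minus)) / (2 * eps))
    return grad


# Hypothetical example vectors, chosen only for illustration.
u_x = [0.5, -1.0, 0.25]
v_c = [0.1, 0.2, -0.3]

print(context_offsets(2))        # [-2, -1, 1, 2]
print(analytic_grad(u_x, v_c))   # one entry per coordinate of u_x
print(numeric_grad(u_x, v_c))    # should agree closely with the line above
```

The two gradient lists should match to several decimal places, and the offsets never include 0, so the center word is not predicted from itself.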
Let me know whether my understanding is correct. Again, thank you for generously sharing your time with the community!