mxnet-the-straight-dope
[optimization-intro.ipynb] problematic explanation of worse error than machine precision
In the current explanation (quoted below), the error being larger than machine precision is blamed on the O(epsilon^2) term, but epsilon^2 will never be larger than epsilon itself (1e-8). The real reason is error magnification, e.g. due to a potentially large f'(x), among other effects.
Current explanation
This means that a small change of order $\epsilon$ in the optimum solution $x^*$ will change the value of $f(x^*)$ in the order of $\epsilon^2$. In other words, if there is an error in the function value, the precision of solution value is constrained by the order of the square root of that error. For example, if the machine precision is $10^{-8}$, the precision of the solution value is only in the order of $10^{-4}$, which is much worse than the machine precision.
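For reference, the quoted square-root relation itself is easy to reproduce numerically. The sketch below uses a hypothetical quadratic (not from the notebook) with its minimum at $x^* = 1$ and $f(x^*) = 1$: in float64 (machine epsilon $\approx 2.2 \times 10^{-16}$), perturbing $x$ by less than roughly $\sqrt{\epsilon} \approx 10^{-8}$ produces a change of order $h^2$ in $f$ that is invisible at this floating-point resolution, so function values alone cannot locate the minimizer more precisely than that.

```python
import math

# Hypothetical test function with minimum at x* = 1.0 and f(x*) = 1.0.
# Near the optimum the gradient vanishes, so f(x* + h) - f(x*) ~ h**2.
def f(x):
    return (x - 1.0) ** 2 + 1.0

eps = 2.0 ** -52  # float64 machine epsilon, ~2.2e-16

# h**2 = 1e-18 is far below eps: the change in f is rounded away,
# so f cannot distinguish x* from x* + 1e-9.
print(f(1.0 + 1e-9) == f(1.0))   # True

# h**2 = 1e-14 exceeds eps: the change in f is now visible.
print(f(1.0 + 1e-7) == f(1.0))   # False

# Best attainable precision in x from function values alone:
print(math.sqrt(eps))            # ~1.5e-8
```

This matches the notebook's claim only in the special case where evaluating $f$ is itself accurate to machine precision; as the issue argues, a large $f'(x)$ away from the optimum can magnify errors well beyond this bound.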