jupyter
Executing blocks with malformed results section erases org file contents
In some scenarios (it does not happen often), executing a src block erases the content that follows it in the org-mode file. The result is that everything, at least until the next src block, gets erased.
My guess is that in the #+RESULTS: section the :RESULTS: drawer is not properly closed with :END:. Re-executing the src block then erases everything up to the next :END:. If I am lucky and the next block has been executed, the erasure stops at the :END: of that block. If not, it will most likely erase everything. Usually this can be reverted with an undo, provided I notice the behavior in time.
But this can lead to some nasty situations. Is there a way to introduce a fail-safe for this?
I regularly face a similar symptom, but only with code that fails and triggers errors or warnings.
In that case org-mode complains. Here is an excerpt from my *Warnings* buffer:
■ Warning (org-element-cache): org-element--cache: Unregistered buffer modifications detected (312536 != 312301). Resetting.
If this warning appears regularly, please report the warning text to Org mode mailing list (M-x org-submit-bug-report).
The buffer is: scratch.org
Current command: nil
Backtrace:
nil
So a good starting point for any investigation would be:
- how emacs-jupyter puts the error message ([goto-error]?) into the org buffer
  - via jupyter-handle-error
  - via
- how org-element--cache-sync expects buffers to be edited.
Sorry that I do not have time right now to look further into this, but I will investigate it later since it affects me a lot. I used to just ignore it, but lately I found that huge sections of my org file had been deleted without my noticing. Undo also won't work, but luckily I had made a backup a few weeks ago.
Could you give an example where Emacs-Jupyter causes the :END: line to be missing from the results of a source block? There shouldn't be any such cases happening, if there are let me know.
It would be great if you could provide a minimum working example Org file with example source blocks that can reproduce your problem. I am aware of the scenario that you mention, but evaluating source blocks with Emacs-Jupyter should always create valid Org documents.
I have also been seeing these issues but haven't had a chance to debug further and create a minimum working example. I suspect the org-element--cache issues may be related to the way jupyter handles ANSI colors, as I still see inconsistent behavior for error outputs that have escape characters to color them.
@nnicandro
For me, I am not sure whether the :END: line is missing every time contents get erased.
I am sure that:
- if the code fails and triggers errors or warnings, content MIGHT be erased.
- if it triggers an error, THEN doing an undo will DEFINITELY erase content, due to the corrupted cache shown in my previous comment (https://github.com/emacs-jupyter/jupyter/issues/486#issuecomment-1736906290)
I can reproduce this quite reliably in my current setup, so I'll try to look into a minimal setup, maybe at the weekend.
I have also seen this happen, e.g.,
* before
#+begin_src jupyter-python :kernel py_base :session emacs_py_1 :async yes :exports both
8
#+end_src
#+RESULTS:
#+begin_src jupyter-python :kernel py_base :session emacs_py_1 :async yes :exports both
9
#+end_src
* after
#+begin_src jupyter-python :kernel py_base :session emacs_py_1 :async yes :exports both
8
#+end_src
#+RESULTS:
: 8
Here the problem is that I have deleted the blank line after #+RESULTS, but I think emacs-jupyter should be robust to such user misbehaviors. Can't it just check for the next #+begin_src and always stop there? After all, all results are prepended with : , so there should never be a line in the results that starts with #+begin_src. We can even just stop at the first #+.
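For illustration only, the stopping rule proposed above can be modelled outside Emacs as a plain line scan. This is a sketch of the proposal, not of what Org or emacs-jupyter actually do; the function name result_extent and the sample lines are hypothetical, and the rule assumes every result line starts with ":".

```python
def result_extent(lines, results_idx):
    """Return the half-open range [start, end) of result lines.

    Models the proposed rule only: the result is the run of lines that
    start with ':' directly after #+RESULTS:, so the scan stops at the
    first other line, which includes any '#+' keyword line.
    """
    start = results_idx + 1
    end = start
    while end < len(lines) and lines[end].lstrip().startswith(":"):
        end += 1
    return start, end


# The "before" situation: no blank line between #+RESULTS: and the next block.
doc = [
    "#+begin_src jupyter-python",
    "8",
    "#+end_src",
    "#+RESULTS:",
    "#+begin_src jupyter-python",
    "9",
    "#+end_src",
]
start, end = result_extent(doc, doc.index("#+RESULTS:"))
print(doc[start:end])  # lines that would be replaced under this rule
```

Under this rule the second source block is left alone because the scan stops before it. The rule is deliberately narrow: it would not recognise any result that is not made of ":" lines.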
@nnicandro Hmm, beware that mine might be a different issue. My issue has a strong connection with undoing and org-element--cache.
@NightMachinery
Have you ever seen the warning messages from org-element--cache?
@ed9w2in6 You’re right, the issue I raised is distinct from yours, but perhaps the solution I proposed would work for your case, as well?
@NightMachinery In Org, src blocks can also be the results of execution of some other source block, e.g.
#+begin_src shell :wrap "src jupyter-python"
echo 9
#+end_src
Running the above yields
#+begin_src shell :wrap "src jupyter-python"
echo 9
#+end_src
#+RESULTS:
#+begin_src jupyter-python
9
#+end_src
Org is the one that removes the source block with the #+RESULTS keyword attached to it when a new result is inserted, not Emacs-Jupyter; see org-babel-insert-result. There is really no way to work around this unless we mess with Org's internals.
@nnicandro I think this problem is severe enough that it warrants forking org-babel-insert-result. I am pretty sure I have lost code to this behavior.
This org feature of inserting a source block as a result of another thing is not that useful (I would personally use eval instead in such a situation, which will work even if I migrate to a script).
We can of course check :wrap and do delete begin_src if such a :wrap is present. This new behavior can even be sent upstream, no?
Just to confirm that I can reliably reproduce this. Evaluate buffer with error. An example is:
#+BEGIN_SRC jupyter-python :exports both
import numpy as np
import xarray as xr
import pandas as pd
# foo = xr.Dataset({'foo': xr.DataArray(data=[100,200], dims=['dim'], coords={'dim':['a','b']})})
# bar = xr.Dataset({'bar': xr.DataArray(data=[200,300], dims=['dim'], coords={'dim':['b','c']})})
# baz = xr.Dataset({'baz': xr.DataArray(data=[200,100], dims=['dim'], coords={'dim':['b','a']})})
# print((foo['foo']-bar['bar']).values)
# print((foo['foo'] - baz['baz']).values)
times = pd.date_range(start='2000-01-01',freq='1D',periods=3)
dims = np.array(['aa','ab']).astype(np.object)
foo = xr.Dataset({'foo': xr.DataArray(data=[[1,2],[3,4],[5,6]], dims=['time','dim'], coords={'time':times, 'dim':dims})})
bar = xr.Dataset({'bar': xr.DataArray(data=[[2,1],[4,3],[6,5]], dims=['time','dim'], coords={'time':times, 'dim':dims[::-1]})})
# print((foo['foo']-bar['bar']).values)
ds = xr.Dataset()
ds['time'] = (('time'), times)
ds['dim'] = (('dim'), dims)
ds['baz'] = (('time','dim'), foo['foo']-bar['bar'])
print(ds)
#+END_SRC
Which errors with:
#+RESULTS:
:RESULTS:
: /tmp/ipykernel_521850/3839449454.py:12: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
: dims = np.array(['aa','ab']).astype(np.object)
# [goto error]
#+begin_example
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[13], line 12
5 # foo = xr.Dataset({'foo': xr.DataArray(data=[100,200], dims=['dim'], coords={'dim':['a','b']})})
6 # bar = xr.Dataset({'bar': xr.DataArray(data=[200,300], dims=['dim'], coords={'dim':['b','c']})})
7 # baz = xr.Dataset({'baz': xr.DataArray(data=[200,100], dims=['dim'], coords={'dim':['b','a']})})
8 # print((foo['foo']-bar['bar']).values)
9 # print((foo['foo'] - baz['baz']).values)
11 times = pd.date_range(start='2000-01-01',freq='1D',periods=3)
---> 12 dims = np.array(['aa','ab']).astype(np.object)
14 foo = xr.Dataset({'foo': xr.DataArray(data=[[1,2],[3,4],[5,6]], dims=['time','dim'], coords={'time':times, 'dim':dims})})
15 bar = xr.Dataset({'bar': xr.DataArray(data=[[2,1],[4,3],[6,5]], dims=['time','dim'], coords={'time':times, 'dim':dims[::-1]})})
File ~/local/mambaforge/envs/ds/lib/python3.10/site-packages/numpy/__init__.py:305, in __getattr__(attr)
300 warnings.warn(
301 f"In the future `np.{attr}` will be defined as the "
302 "corresponding NumPy scalar.", FutureWarning, stacklevel=2)
304 if attr in __former_attrs__:
--> 305 raise AttributeError(__former_attrs__[attr])
307 # Importing Tester requires importing all of UnitTest which is not a
308 # cheap import Since it is mainly used in test suits, we lazy import it
309 # here to save on the order of 10 ms of import time for most users
310 #
311 # The previous way Tester was imported also had a side effect of adding
312 # the full `numpy.testing` namespace
313 if attr == 'testing':
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
#+end_example
:END:
Note that there is both an #+end_example and :END:.
However, if I then type C-/ which is (undo), it deletes the next ~100 lines of the buffer. Sections, babel blocks with BEGIN_SRC, etc. (a comment above said to the next source block but I am seeing this bug consume multiple source blocks).
I can reliably recreate this bug with my (large) config. I cannot create an MWE.
@mankoff The symptoms that you just described are exactly the same as mine.
Do you see the org-element--cache warning messages?
@nnicandro Do you prefer me and @mankoff to open another issue? At least the symptoms are different, if it turns out to be the same cause we can just close one of them.
@ed9w2in6 I'm running latest org from git and updated yesterday. I cannot recreate this bug today. The git log has a lot of entries on org-element and the cache. Can you check if you still have this bug on the latest commit? I'm at
- 098f08159 - (HEAD -> main, origin/main, origin/HEAD) org-open-at-point: Preserve point unless opening link moves the point (2023-10-23)
@ed9w2in6 I'm running latest org from git and updated yesterday. I cannot recreate this bug today. The git log has a lot of entries on org-element and the cache. Can you check if you still have this bug on the latest commit?
Nevermind, Still here :(.
I posted about this issue on the Org list in case it is related to org-cache.
Their suggestion:
This is most likely a problem with emacs-jupyter. It does something that bypasses `after-change-functions', which is not allowed in Org mode.
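To make the mechanism in that reply concrete, here is a toy Python model, not Org's code. It assumes that a cache learns about edits only through change notifications, and that the two numbers in the warning are a recorded and an actual buffer property such as size; both are assumptions. Buffer, SizeCache and inhibit_hooks are hypothetical names standing in for the Emacs buffer, the element cache and inhibit-modification-hooks.

```python
class Buffer:
    """Toy text buffer that notifies listeners after each insertion."""

    def __init__(self, text=""):
        self.text = text
        self.after_change = []      # callbacks receiving the inserted length
        self.inhibit_hooks = False  # toy analogue of inhibit-modification-hooks

    def insert(self, s):
        self.text += s
        if not self.inhibit_hooks:
            for hook in self.after_change:
                hook(len(s))


class SizeCache:
    """Tracks the buffer size through change notifications only."""

    def __init__(self, buffer):
        self.buffer = buffer
        self.expected_size = len(buffer.text)
        buffer.after_change.append(self._on_change)

    def _on_change(self, delta):
        self.expected_size += delta

    def sync(self):
        """Return a warning string if an edit was missed, else None."""
        actual = len(self.buffer.text)
        if actual == self.expected_size:
            return None
        message = (
            "Unregistered buffer modifications detected "
            f"({self.expected_size} != {actual}). Resetting."
        )
        self.expected_size = actual  # the toy "reset"
        return message


buf = Buffer("* heading\n")
cache = SizeCache(buf)
buf.insert(": 8\n")            # notified: the cache stays consistent
print(cache.sync())
buf.inhibit_hooks = True
buf.insert("#+begin_example\n")  # not notified: the cache is now stale
buf.inhibit_hooks = False
print(cache.sync())
```

In this model the second sync reports a mismatch because one insertion never reached the cache. Any operation that relies on the stale bookkeeping in between would act on the wrong region.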
@mankoff The symptoms that you just described are exactly the same as mine. Do you see the org-element--cache warning messages?
Yes, I get the same cache warning. I see
Warning (org-element-cache): org-element--cache: Unregistered buffer modifications detected (57640 != 55418). Resetting.
If this warning appears regularly, please report the warning text to Org mode mailing list (M-x org-submit-bug-report).
The buffer is: misc.org
Current command: nil
Backtrace:
"
backtrace-to-string(nil)
org-element--cache-sync(#37660)
org-element-at-point()
(progn (org-element-at-point))
(unwind-protect (progn (org-element-at-point)) (set-match-data save-match-data-internal 'evaporate))
(let ((save-match-data-internal (match-data))) (unwind-protect (progn (org-element-at-point)) (set-match-data save-match-data-internal 'evaporate)))
(let ((element (let ((save-match-data-internal (match-data))) (unwind-protect (progn (org-element-at-point)) (set-match-data save-match-data-internal 'evaporate))))) (and (eq (org-element-type element) 'src-block) (>= (line-beginning-position) (let* ((parray (and t (let* ... ...)))) (if parray (let* ((val ...)) (if (eq val ...) 'nil (let ... val))) (let* ((val ...)) (cond (... ...) (... ...) (t ...)))))) ())
run-hook-with-args-until-success(org-eldoc-documentation-function #f(compiled-function (string &rest plist) # ))
eldoc-documentation-default()
eldoc--invoke-strategy(nil)
eldoc-print-current-symbol-info()
#f(compiled-function () # )()
apply(#f(compiled-function () # ) nil)
timer-event-handler([t 0 0 500000 nil #f(compiled-function () # ) nil idle 0 nil])
"
This org feature of inserting a source block as a result of another thing is not that useful
It is useful in some scenarios. But can be disabled, if you wish to (:results none). If you find something missing, feel free to write a feature request.
Hi @yantar92 - thanks for following up here. It does appear that we may be discussing two different problems in this GitHub issue. The error in the latter (more recent above here) comments and that I reported on the Org mailing list is not about code generating a source block. It happens with 'normal' #+RESULTS:.
See comment here: https://github.com/emacs-jupyter/jupyter/issues/486#issuecomment-1775830252
However, if I then type C-/ which is (undo), it deletes the next ~100 lines of the buffer. Sections, babel blocks with BEGIN_SRC, etc. (a comment above said to the next source block but I am seeing this bug consume multiple source blocks). I can reliably recreate this bug with my (large) config. I cannot create an MWE.
I am unable to reproduce with my config.
I was unable to reproduce with my config yesterday either :). But I was today. Same config! :(.
I am seeing the following in ob-jupyter:
;; KLUDGE: Remove the file result-parameter so that
;; `org-babel-insert-result' doesn't attempt to handle it while
;; async results are pending. Do the same in the synchronous
;; case, but not if link or graphics are also result-parameters,
;; only in Org >= 9.2, since those in combination with file mean
;; to interpret the result as a file link, a useful meaning that
;; doesn't interfere with Jupyter style result insertion.
Do note that async evaluation API is in place in the latest Org mode. There is no need to write custom async code that might indeed be prone to various errors. Check out org-babel-comint-async-register (if it is not sufficient, consider writing a feature request).
A package with similar (same?) issue: https://github.com/nobiot/org-transclusion/issues/105 Their fix: https://github.com/nobiot/org-transclusion/commit/eb3ff3c83fee6edf45229eb570f5e6ca560851ee
I believe our issue here may be similar to theirs, in which the culprit is inhibit-modification-hooks.
usage in jupyter-org-client:
https://github.com/emacs-jupyter/jupyter/blob/3a31920d48dc5e0d1028fb676cf20d13ea9f78ad/jupyter-org-client.el#L563-L568
usage in jupyter-repl (may not need to change but maybe change to for consistency?):
https://github.com/emacs-jupyter/jupyter/blob/3a31920d48dc5e0d1028fb676cf20d13ea9f78ad/jupyter-repl.el#L748-L755
A quick search in org-mode mailing list reveals a few similar issues: https://list.orgmode.org/?q=inhibit-modification-hooks
I am not sure about a right fix, maybe just doing a (org-element-cache-reset)?
After reading the long mailing list threads for a bit, it seems insert-file-contents also needs a cache reset dance.
I am not sure about a right fix, maybe just doing a (org-element-cache-reset)?
Why do you need to set inhibit-modification-hooks to start with? Running org-element-cache-reset will cause performance degradation in the whole Org file.
I ran into a similar problem. I am using Doom with the latest emacs-jupyter. When code runs for a long time and then hits an error, the stdout produced during the run gets erased. A minimal example:
#+begin_src jupyter-python :session test :async yes
from time import sleep
n_all = 10
for i in range(n_all):
sleep(0.05)
print(i)
1/0
#+end_src
The results block then looks like this:
#+RESULTS:
:RESULTS:
#+begin_example
ter-python :session
#+end_example
# [goto error]
: ---------------------------------------------------------------------------
: ZeroDivisionError Traceback (most recent call last)
: Cell In[46], line 7
: 5 sleep(0.05)
: 6 print(i)
: ----> 7 1/0
:
: ZeroDivisionError: division by zero
:END:
Note that without 1/0 on the last line, the results are 0-9 wrapped inside an example block. Now they are replaced with some text from elsewhere in my org file (ter-python :session). Either commenting out sleep(0.05) or changing the header args to :async no fixes it.
If I change n_all to 9, the stdout is not wrapped inside an example block and it works fine. Any idea which function/variable triggers this wrapping?
For anyone here experiencing the org-element-cache warning issue, you can try out the fix in #515 and see if that helps. I'm not sure about the other issue with erasing org file contents. I've seen this happen before but never reproducibly, so I've always attributed it to accidentally deleting :END: or something else in the org buffer while long-running async code is working, such that the org document it tries to insert its results into is itself malformed.
One thing I have just noticed concerns the output: if it spans several lines, it is formatted as an example block. If a #+begin_example block is not properly closed with #+end_example, Org will ignore the :END: line and delete everything until it comes across the next #+end_example. So it may be that in blocks that error out, this sequence of formatting steps is not happening properly.
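As a stop-gap for spotting such malformed results before re-executing a block, a heuristic check can be run on the file outside Emacs. The following is a minimal sketch, not an Org parser: find_unclosed is a hypothetical helper that scans lines for #+begin_/#+end_ pairs and :RESULTS:/:END: pairs, and it can be fooled by literal ":END:" text inside a block.

```python
import re

BEGIN = re.compile(r"^\s*#\+begin_(\S+)", re.IGNORECASE)
END = re.compile(r"^\s*#\+end_(\S+)", re.IGNORECASE)
DRAWER = "RESULTS"  # uppercase marker; block names are stored lowercased


def find_unclosed(text):
    """Return (line_number, description) for constructs left open."""
    problems = []
    stack = []  # (name, line_number) of constructs currently open
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip().upper()
        begin, end = BEGIN.match(line), END.match(line)
        if begin:
            stack.append((begin.group(1).lower(), number))
        elif end:
            name = end.group(1).lower()
            if stack and stack[-1][0] == name:
                stack.pop()
            else:
                problems.append((number, f"#+end_{name} without matching begin"))
        elif stripped == ":RESULTS:":
            stack.append((DRAWER, number))
        elif stripped == ":END:" and any(n == DRAWER for n, _ in stack):
            # Blocks still open when the drawer closes were never terminated.
            while stack[-1][0] != DRAWER:
                name, opened = stack.pop()
                problems.append((opened, f"#+begin_{name} not closed before :END:"))
            stack.pop()
    for name, opened in stack:
        what = ":RESULTS: drawer" if name == DRAWER else f"#+begin_{name}"
        problems.append((opened, f"{what} never closed"))
    return problems


malformed = "#+RESULTS:\n:RESULTS:\n#+begin_example\n0\n1\n:END:\n"
for line_number, description in find_unclosed(malformed):
    print(line_number, description)
```

An empty list means the scan found nothing suspicious. It does not prove the document is valid Org.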