Long output lines in org results slow Emacs to a halt
I keep being bitten by Emacs' inability to gracefully handle long lines whenever a jupyter babel src block outputs a single line of something like over 9000 characters.
Ideally Org Babel itself would have an option for this. However, since I primarily use :async t, where ob-jupyter inserts its own Org-formatted strings, I think an option is needed here. There's already jupyter-repl-maximum-size for the REPL, as I imagine the same long-line issue would affect users there.
Perhaps a warning could be displayed in the minibuffer, or even inserted into the results block, to say that the complete output wasn't inserted. I've patched this myself for the time being so I can stop having to kill Emacs every time I accidentally print something I probably shouldn't have:
(cl-defmethod jupyter-org--insert-result (_req context result)
  (let ((str
         (org-element-interpret-data
          (jupyter-org--wrap-result-maybe
           context (if (jupyter-org--stream-result-p result)
                       (thread-last result
                         jupyter-org-strip-last-newline
                         jupyter-org-scalar)
                     result)))))
    (if (< (length str) 4000)
        (insert str)
      (insert (format ": Result was too long! Length was %d" (length str)))))
  (when (/= (point) (line-beginning-position))
    ;; Org objects such as file links do not have a newline added when
    ;; converting to their string representation by
    ;; `org-element-interpret-data' so insert one in these cases.
    (insert "\n")))
Thank you for this great package!
@akirakyle thank you for posting this! I'm running into the same issue myself.
For now I'm thinking about writing the result to a file and suppressing the output:
#+begin_src jupyter-python :async yes :session py :results none
from json import dumps
from my_async_stuff import get
with open("./data.json", "w") as fout:
fout.write(dumps(get("Something that returns a really large output")))
#+end_src
But that's not a great solution. I'd really like to see the result in the minibuffer, just with an option to hide the contents entirely (only show an ID).
I've not been able to figure that out yet, though. Perhaps you have a solution?
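One way to get closer to this from the Python side is to write the full data to a file and return only a one-line summary as the block's value. This is a minimal sketch, not something ob-jupyter provides: `dump_and_summarize` is a hypothetical helper name, and it assumes the data is JSON-serializable.

```python
import json
from pathlib import Path


def dump_and_summarize(obj, path):
    """Write obj to path as JSON and return a short summary string.

    Hypothetical helper: the full data goes to the file, and only the
    summary would end up in the Org results.
    """
    text = json.dumps(obj)
    Path(path).write_text(text, encoding="utf-8")
    return f"wrote {len(text)} characters to {path}"
```

Making `dump_and_summarize(get(...), "./data.json")` the last expression of the src block should then yield only the summary line instead of the raw output.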
A solution to this issue would probably make this much better.
If you want to see output in the minibuffer, try using :results silent. I think having both a :file header and an equivalent to jupyter-repl-maximum-size for org outputs would be ideal. Sometimes I'll run something which unexpectedly generates a long output line, and I wrote that small patch because I kept "shooting myself in the foot" this way, which would really slow me down. If a warning were printed, it would be convenient to simply add a :file header to redirect the output to a file, where I can open it with something else without having to write code that does that every time.
Yep, for me it's also the accidental cases where this appears.
What I've found to help quite a bit is to never deal with the raw data (line data), but rather process it to something else, such as a class which contains the result.
In this way Emacs displays some class information but not the full data set:
#+BEGIN_SRC jupyter-python :session py
class MyClass:
    def __init__(self, data):
        self.data = """
This is a long long long long
long long long long long long
long long long long long long
long long long long long line
"""

print(MyClass("This is where data would go"))
#+END_SRC
#+RESULTS:
: <__main__.MyClass object at 0x11e1ca950>
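The wrapper idea can be made a little more informative by giving the class a `__repr__` that shows a bounded preview instead of the default object address. This is only a sketch: the class name `Summarized` and the `max_chars` parameter are made up for illustration.

```python
class Summarized:
    """Hypothetical wrapper: keeps the full data, prints only a short preview."""

    def __init__(self, data, max_chars=60):
        self.data = data
        self.max_chars = max_chars

    def __repr__(self):
        full = repr(self.data)
        preview = full
        if len(full) > self.max_chars:
            # Keep only the first max_chars characters of the representation.
            preview = full[: self.max_chars] + "..."
        return (
            f"<Summarized {type(self.data).__name__}, "
            f"repr length {len(full)}: {preview}>"
        )
```

Printing or returning a `Summarized` instance then produces one short line, while the complete data stays available through its `data` attribute.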
pandas has been exceptionally helpful here, as most of my data is numerical arrays. Converting the data to pandas objects means pandas handles the truncation of the output when it is printed to the console for debugging.
To avoid accidental cases, I just never allow the block to return the raw output. Instead I use an autoreload header to ensure that the external py helper file, which wraps my data, is reloaded.
#+BEGIN_SRC jupyter-python :session py :async yes :exports results
# Prevent caching of our external python file
%reload_ext autoreload
%autoreload 2
from my_module import my_stuff
#+END_SRC
N.b. :exports results only as I don't want my script to be included in PDF generation.
Unfortunately such a technique won't work for my case, which involves using sympy to generate symbolic expressions. Sometimes an operation such as matrix inversion doesn't simplify, and the resulting LaTeX output can be thousands of characters wide. LaTeX itself has no trouble typesetting this, but Emacs slows down seriously just trying to insert the result into the buffer. (Interestingly enough, if the typeset image is overlaid on the text, Emacs speeds back up a bit.)
I modified jupyter-org--do-insert-result instead, because jupyter-org--insert-result is called afterwards and only checks the last result, whereas this way print statements and errors are formatted as well. I also made it print colors as per #197.
Long lines are truncated so that the beginning and the end of each line are displayed and fill the window width.
The color function should also take care of the case where the truncation removes control codes in between.
(after! ob-jupyter
  (defun jupyter-org--do-insert-result (req result)
    (if (and (stringp result) (> (length result) (window-total-width)))
        (let* ((max-width (- (window-total-width) 2))
               (half-width (/ max-width 2))
               (result-split (split-string result "[\n]"))
               (result-len (length result)))
          (setq result (concat
                        ;; (format "Result was truncated, because it was too long! (%d)\n"
                        ;;         result-len)
                        (mapconcat #'identity
                                   (mapcar
                                    #'(lambda (str)
                                        (if (> (length str) max-width)
                                            (concat (substring str 0 half-width) " ... "
                                                    (substring str (- half-width)))
                                          str))
                                    result-split)
                                   "\n")))))
    (org-with-point-at (jupyter-org-request-marker req)
      (let ((res-begin (org-babel-where-is-src-block-result 'insert)))
        (goto-char res-begin)
        (let* ((indent (current-indentation))
               (context (jupyter-org--normalized-insertion-context))
               (pos (jupyter-org--append-stream-result-p context result)))
          (cond
           (pos
            (goto-char pos)
            (jupyter-org-indent-inserted-region indent
              (jupyter-org--append-stream-result result)))
           (t
            (forward-line 1)
            (unless (bolp) (insert "\n"))
            (jupyter-org--prepare-append-result context)
            (jupyter-org-indent-inserted-region indent
              (jupyter-org--insert-result req context result))))
          (when (jupyter-org--stream-result-p result)
            (let ((end (point-marker)))
              (unwind-protect
                  (jupyter-org--handle-control-codes
                   (if pos (save-excursion
                             (goto-char pos)
                             ;; Go back one line to account for an edge case
                             ;; where a control code is at the end of a line.
                             (line-beginning-position 0))
                     res-begin)
                   end)
                (set-marker end nil)))
            (jupyter-org--mark-stream-result-newline result))
          (ansi-color-apply-on-region (plist-get (car (cdr context)) ':end) (point)))))))
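For anyone who would rather guard against long lines on the kernel side, the same keep-the-head-and-tail truncation can be sketched in Python. This is an illustration, not part of the patch above: `truncate_middle`, `max_width` and `marker` are hypothetical names, and unlike the Emacs version it knows nothing about the window width or ANSI control codes.

```python
def truncate_middle(text, max_width=80, marker=" ... "):
    """Cut the middle out of every line longer than max_width.

    Assumes max_width is comfortably larger than len(marker).
    """
    keep = max(max_width - len(marker), 2)  # characters kept from each long line
    head = (keep + 1) // 2                  # kept from the start of the line
    tail = keep // 2                        # kept from the end of the line
    lines = []
    for line in text.split("\n"):
        if len(line) > max_width:
            line = line[:head] + marker + line[len(line) - tail:]
        lines.append(line)
    return "\n".join(lines)
```

Wrapping a print call, as in `print(truncate_middle(some_long_string))`, keeps each printed line at or below `max_width` characters under the assumption stated in the docstring.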