pythontex
Extracting code from a document into a .py file
Hi. Is there a way to extract all Python code from a document into a separate Python file in PythonTeX? AFAIK, Pweave provides ptangle, which does just that.
The reason I am asking is as follows: I recently realized that it is really hard to copy and paste Python code from the code boxes of PDF documents produced by PythonTeX. In my case, the document contains line numbers, and I use Evince for viewing PDFs (not sure whether the problem is viewer-dependent). My planned workaround is to provide users with raw text code files.
Note that I tried to parse the *.pytxcode file with this two-stage sed pipeline:
sed -n '/^=>PYTHONTEX:SETTINGS/q;p' "$1" | sed 's/^=>PYTHONTEX.*//g'
but my sed skills are not enough to deal with indentation like this:
...
build_rref:RRef = evaluate(stage_build)
print(build_rref)
=>PYTHONTEX#py#stdout#default#3#block#####367#
print(mklens(build_rref).fetch_ref.rref) # Unlucky snippet happened to be inside the itemized list of the original document
print(mklens(build_rref).fetch_ref.url.val)
=>PYTHONTEX#py#stdout#default#4#block#####385#
from subprocess import run, PIPE
print(run([mklens(build_rref).bin.syspath], stdout=PIPE).stdout.decode('utf-8'))
=>PYTHONTEX:SETTINGS#
...
You could try something like this on the *.pytxcode:
import collections
import pathlib
import sys

sources = collections.defaultdict(list)
pytxcode = sys.argv[1]
with open(pytxcode, encoding='utf8') as f:
    in_source = False
    source_name = None
    for line in f:
        if line.startswith('=>PYTHONTEX#'):
            in_source = True
            source_name = f'source_{line.split("#")[1]}_{line.split("#")[2]}.py'
        elif line.startswith('=>PYTHONTEX') or line.startswith('=>DEPYTHONTEX'):
            in_source = False
        elif in_source:
            sources[source_name].append(line)
source_path = pathlib.Path('pythontex_sources')
if not source_path.is_dir():
    source_path.mkdir()
for source, lines in sources.items():
    with open(source_path / source, 'w', encoding='utf8') as f:
        f.write(''.join(lines))
Usage: python ./extract_source.py ./test.pytxcode
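To illustrate how the script above names its output files: it splits each delimiter line on `#` and builds the name from the second and third fields. Here is a minimal sketch using a delimiter line from the question. The helper name `source_name_for` is made up for this illustration, and no claim is made here about what the individual fields mean.

```python
def source_name_for(delimiter_line):
    """Build the output file name from fields 1 and 2 of a '#'-separated delimiter line."""
    fields = delimiter_line.split("#")
    return f"source_{fields[1]}_{fields[2]}.py"


# Delimiter line copied from the question's *.pytxcode excerpt
line = "=>PYTHONTEX#py#stdout#default#3#block#####367#"
print(source_name_for(line))  # source_py_stdout.py
```

All code chunks whose delimiter lines share the same two fields end up appended to the same output file.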
This feature has been on my list of features to add to PythonTeX and Codebraid for a while, so hopefully I'll have time to add built-in support at some point.
Thank you, this code helped. Here is the updated version, which trims the leading indentation of each chunk and accepts an optional destination path:
#!/usr/bin/env python3
import collections
import pathlib
import sys

sources = collections.defaultdict(list)
pytxcode = sys.argv[1]
# Optional second argument: destination file (single source) or directory
dstsource = sys.argv[2] if len(sys.argv) == 3 else "pythontex_sources"
with open(pytxcode, encoding='utf8') as f:
    in_source = False
    spaces_to_trim = None
    source_name = None
    for line in f:
        if line.startswith('=>PYTHONTEX#'):
            in_source = True
            spaces_to_trim = None
            source_name = f'source_{line.split("#")[1]}_{line.split("#")[2]}.py'
        elif line.startswith('=>PYTHONTEX') or line.startswith('=>DEPYTHONTEX'):
            in_source = False
        elif in_source:
            if not line.strip():
                # Keep blank lines as-is so trimming cannot swallow the newline
                sources[source_name].append('\n')
                continue
            if spaces_to_trim is None:
                # Detect the number of leading spaces to trim using the first non-blank line
                spaces_to_trim = 0
                for c in line:
                    if c != ' ':
                        break
                    spaces_to_trim += 1
            if len(line[:spaces_to_trim].strip()) != 0:
                print(f"Can't find {spaces_to_trim} spaces at the beginning of line '{line}'")
            else:
                line = line[spaces_to_trim:]
            sources[source_name].append(line)
if len(sources.keys()) == 1:
    # Only one source: write it directly to the destination file
    with open(dstsource, 'w', encoding='utf8') as f:
        f.write(''.join(sources[list(sources.keys())[0]]))
else:
    # Several sources: write one file per source into the destination directory
    source_path = pathlib.Path(dstsource)
    if not source_path.is_dir():
        source_path.mkdir()
    for source, lines in sources.items():
        with open(source_path / source, 'w', encoding='utf8') as f:
            f.write(''.join(lines))
Usage: python ./extract_source.py ./test.pytxcode [destination]
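As a possible alternative to counting leading spaces by hand, the standard library's `textwrap.dedent` removes the whitespace common to all lines of a string. The sketch below assumes each chunk has first been collected as its own list of lines. The `dedent_chunk` helper is hypothetical and not part of PythonTeX.

```python
import textwrap


def dedent_chunk(lines):
    """Strip the indentation shared by every non-blank line of one code chunk."""
    return textwrap.dedent("".join(lines))


# A chunk as it might appear when the code block sat inside an itemized list
chunk = [
    "    for i in range(2):\n",
    "\n",
    "        print(i)\n",
]
print(dedent_chunk(chunk), end="")
```

Note that this behaves slightly differently from the script above: `textwrap.dedent` uses the smallest indentation shared by all non-blank lines, while the script takes the indentation of the first line of each chunk as the amount to trim.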