How to optimize reading long tuples from Julia to Python
Is your feature request related to a problem? Please describe.
The following code takes around 7 s on a modern i9 the first time it's executed; subsequent executions are immediate. Executing (zeros(20_000)...,) directly in Julia is also immediate.
I have Python 3.12 and Julia 1.10.5.
import juliacall

v = juliacall.Main.seval("(zeros(20_000)...,)")
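For reference, the one-off cost can be isolated with a small timing harness; this is a minimal sketch using only the standard-library time module (the ~7 s is the figure reported above, not a fresh measurement):

import time
import juliacall

t0 = time.perf_counter()
juliacall.Main.seval("(zeros(20_000)...,)")  # first call: Julia compiles code specialised on the 20_000-element tuple type
print("first call: ", time.perf_counter() - t0)

t0 = time.perf_counter()
juliacall.Main.seval("(zeros(20_000)...,)")  # second call: the compiled method is reused, so this is immediate
print("second call:", time.perf_counter() - t0)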
Describe the solution you'd like
Is there any way to speed up the initial compilation?
Describe alternatives you've considered
I tried to use this code to speed up reading data from Julia to Python by going through a tuple. Looping over a long Julia vector from Python is much slower than looping over a tuple (around 4 times slower to read a 100_000-element float vector than a tuple), as in the sketch below.
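For concreteness, the tuple workaround might look something like this (get_t is a hypothetical name). The first call pays the compilation cost because Julia specialises on the tuple length; after that, JuliaCall hands the result back as a plain Python tuple, so the summation loop never calls back into Julia, which is presumably where the ~4x speedup comes from:

from juliacall import Main as jl

jl.seval("get_t(n) = (rand(n)...,)")  # splatting yields an NTuple{n, Float64}

t = jl.get_t(100_000)  # first call: huge one-off compilation overhead
total = 0.0
for x in t:  # t is an ordinary Python tuple, so iteration stays in Python
    total += x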
I can't explain the speed difference (seval directly calls Julia to execute that code; there is essentially zero overhead from JuliaCall), but you shouldn't be creating tuples of thousands of elements in Julia anyway. What are you actually trying to do?
I tried this code to speed up reading long vectors from Julia to Python by going through a tuple. Looping over a long Julia vector from Python is much slower than looping over a tuple (around 4 times slower to read a 100_000-element float vector than a tuple), so I tried converting the vector to a tuple in Julia before reading it. It is indeed much faster, but the compilation overhead is huge. An alternative I found: instead of reading the long Julia vector from Python, I pass a Python list to Julia and have Julia populate it. This is also around 3x faster.
Can you give some code to demonstrate what you're doing, i.e. some code that works but is slow that you'd like to be faster?
get_v is what I'd like to optimize (reading a long Julia vector).
set_v is the workaround: by passing a Python list to Julia and having Julia populate it, reading the vector becomes around 3 times faster.
from juliacall import Main as jl
import timeit

# get_v allocates and returns a Julia vector (v is ignored; it is only there
# so both functions share the same signature).
jl.seval("get_v(v, n) = rand(n)")

# setPyVector copies a Julia vector element by element into the Python list.
# The list is seen from Julia as a 1-based vector wrapper, so enumerate's
# indices line up.
jl.seval("""
function setPyVector(py_v, jl_v)
    @inbounds for (i, x) in enumerate(jl_v)
        py_v[i] = x
    end
end
""")

# set_v generates the data in Julia and writes it into the preallocated Python list.
jl.seval("function set_v(v, n) setPyVector(v, get_v(v, n)); return v end")

def benchmark(f, vector_len, retries) -> None:
    def workload():
        total = 0.0
        v = vector_len * [None]
        result = f(v, vector_len)
        for x in result:
            total += x
    workload()  # warmup: triggers Julia compilation
    print("running ", f)
    print("elapsed ", timeit.timeit(workload, number=retries))

benchmark(jl.get_v, vector_len=100_000, retries=100)
benchmark(jl.set_v, vector_len=100_000, retries=100)
This is what I get on a modern i9:
running get_v
elapsed 6.038048821996199
running set_v
elapsed 1.921932346012909
The slow part won't be the call to get_v but the summation you perform afterwards. This is because get_v returns a lazy wrapper around the rand(n) array, so when you do the sum, each element is materialised one by one by calling into Julia. In your case it will be faster to materialise the whole array as a Python list in one go, such as by defining get_v(v, n) = pylist(rand(n)).
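Applied to the benchmark above, that suggestion is a one-line change (get_v_list is a hypothetical name; pylist is PythonCall's eager conversion to a Python list):

jl.seval("get_v_list(v, n) = pylist(rand(n))")  # materialise the whole array as a Python list in one call
benchmark(jl.get_v_list, vector_len=100_000, retries=100)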