AOT recording produces an empty file.

Open aespielberg opened this issue 2 years ago • 8 comments

Working on Ubuntu 20.04, CUDA 11.7, Taichi 1.0.4.

When running the example described here, the record.yml file is empty.

Full code:

import taichi as ti


ti.aot.start_recording('record.yml')
ti.init(arch=ti.cc)
loss = ti.field(float, (), needs_grad=True)
x = ti.field(float, 233, needs_grad=True)

@ti.kernel
def compute_loss():
   for i in x:
       loss[None] += x[i]**2

@ti.kernel
def do_some_works():
   for i in x:
       x[i] -= x.grad[i]

with ti.ad.Tape(loss):
   compute_loss()
do_some_works()

Here is the output:

[Taichi] version 1.0.4, llvm 10.0.0, commit 2827db2c, linux, python 3.9.7
[I 08/04/22 23:20:05.565 3135938] [action_recorder.cpp:start_recording@26] ActionRecorder: start recording to [record.yml]
[W 08/04/22 23:20:05.569 3135938] [misc.py:adaptive_arch_select@747] Arch=[<Arch.cc: 3>] is not supported, falling back to CPU

And the yml file is empty. There is a comment about arch.cc not being supported for some reason?

aespielberg avatar Aug 05 '22 03:08 aespielberg

Hi aespielberg, I guess you'll have to add a ti.aot.stop_recording() which triggers the serialization.
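One way to make sure the stop call is never forgotten is to pair start and stop in a context manager. The sketch below is only an illustration: `recording` is a hypothetical helper, not part of Taichi, and it takes the start and stop functions as arguments so that it runs without Taichi installed.

```python
from contextlib import contextmanager


@contextmanager
def recording(start, stop, path):
    """Call start(path) on entry and stop() on exit, even if the body raises."""
    start(path)
    try:
        yield path
    finally:
        stop()


# Stand-in callables so the sketch runs without Taichi installed.
calls = []
with recording(lambda p: calls.append(("start", p)),
               lambda: calls.append(("stop",)),
               "record.yml"):
    calls.append(("body",))  # kernels would be defined and launched here
```

With Taichi 1.0.4 this would presumably be used as `with recording(ti.aot.start_recording, ti.aot.stop_recording, 'record.yml'):`, with `ti.init` and the kernel calls inside the block.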

jim19930609 avatar Aug 05 '22 05:08 jim19930609

In addition, we also have a more up-to-date AOT interface which is well maintained:

@ti.kernel
def run():
    ......

m = ti.aot.Module(ti.cpu)
m.add_kernel(run, template_args={'arr': arr})
m.save(dir_name, 'whatever')

Was wondering if there's any specific reason that you're using ti.aot.start_recording('record.yml')? Otherwise it would be strongly recommended to switch to our latest AOT interface.

jim19930609 avatar Aug 05 '22 08:08 jim19930609

I tried stop_recording which did not help; I will try the new interface soon and I will post the output of stop_recording.

aespielberg avatar Aug 05 '22 13:08 aespielberg

Okay, adding ti.aot.stop_recording() to the end of that script yields the output:

[Taichi] version 1.0.4, llvm 10.0.0, commit 2827db2c, linux, python 3.9.7
[I 08/05/22 13:43:05.420 3301647] [action_recorder.cpp:start_recording@26] ActionRecorder: start recording to [record.yml]
[W 08/05/22 13:43:05.421 3301647] [misc.py:adaptive_arch_select@747] Arch=[<Arch.cc: 3>] is not supported, falling back to CPU
[Taichi] Starting on arch=x64
[I 08/05/22 13:43:05.731 3301647] [action_recorder.cpp:stop_recording@33] ActionRecorder: stop recording

If I change the code instead to:

import taichi as ti



ti.init(arch=ti.cc)
loss = ti.field(float, (), needs_grad=True)
x = ti.field(float, 233, needs_grad=True)

@ti.kernel
def compute_loss():
   for i in x:
       loss[None] += x[i]**2

@ti.kernel
def do_some_works():
   for i in x:
       x[i] -= x.grad[i]

with ti.ad.Tape(loss):
   compute_loss()



m = ti.aot.Module(ti.cpu)
m.add_kernel(do_some_works)
m.save('.', 'record')

Then I get three metadata files and a .ll file. I haven't tried to compile them yet to see if they work correctly. Is there documentation on how to use these files and this new interface? I am sorry; I am having trouble finding it.

aespielberg avatar Aug 05 '22 19:08 aespielberg

Hi aespielberg, apologies that there's no official documentation for the serialized AOT files yet, since AOT has not been officially released.

In simple words:

  1. Kernels (i.e. do_some_works() and compute_loss()) and some internal helper functions are compiled into LLVM IR and written to .ll files.
  2. Data structures (like ti.field) and some descriptive information are stored in metadata files. Personally, I always read and analyze the metadata files for debugging or validation purposes; they're fairly human-friendly.
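To see which of the saved files are LLVM IR and which are metadata, a small script can group the output directory by file extension. This is a minimal sketch using only the standard library; `summarize_aot_dir` is a hypothetical helper, and the file names in the demo are placeholders, not the names Taichi actually writes.

```python
import os
import tempfile
from collections import defaultdict


def summarize_aot_dir(path):
    """Group the file names in `path` by extension, e.g. {'.ll': [...]}."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(path)):
        if os.path.isfile(os.path.join(path, name)):
            groups[os.path.splitext(name)[1]].append(name)
    return dict(groups)


# Demo on a temporary directory with placeholder file names
# (hypothetical; the real names depend on the Taichi version).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("kernels.ll", "meta_a.json", "meta_b.json"):
        open(os.path.join(tmp, name), "w").close()
    summary = summarize_aot_dir(tmp)
```

Pointing `summarize_aot_dir` at the directory passed to `m.save(...)` should list the .ll file separately from the metadata files.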

jim19930609 avatar Aug 08 '22 02:08 jim19930609

Thank you, I understand the basic idea now; this is helpful. Overall, I am also wondering if the following is currently possible in Taichi (even if not documented yet):

  1. Compile kernels and call them in Python later without having to re-compile.
  2. Compile code that calls multiple kernels in succession.
  3. Compute gradients of kernels or save gradient computations.

I know it's not officially released yet, but if there is any example code anywhere of just using the output files of simple functions, that would be super helpful.

aespielberg avatar Aug 08 '22 18:08 aespielberg

Hi aespielberg, the answer is yes! We do support all three features, although some of them haven't been released yet.

  1. We have an offline-cache mechanism, to be released soon. For early access, you can turn it on via: ti.init(arch=...., offline_cache=True). By default, the cached files are stored under ~/.cache/taichi. (Note that the offline cache is only implemented on the CPU and CUDA backends; other backends such as Vulkan and OpenGL do not have it yet.)

  2. For AOT purposes, we have the compute graph, which allows you to trace the order of kernel execution and then execute the kernels the same way in C++ code. A small example you can probably start with is: https://github.com/taichi-dev/taichi-aot-demo/tree/master/comet . You can ignore most of the code, especially the guiHelper, and focus on comet_run().

  3. Our autodiff feature is designed for this! @erizmr is an expert on this topic; I was wondering if you could provide some simple examples for @aespielberg to start with?

jim19930609 avatar Aug 09 '22 02:08 jim19930609

Hi @aespielberg, Taichi supports computing gradients of kernels using reverse mode (forward mode will be supported in the v1.1 release). The computed gradients are stored in .grad, which is a field attached to the primal field; e.g., in your example, the gradients d loss / dx are stored in x.grad. There are some starting examples and guidance in the docs: https://docs.taichi-lang.org/docs/differentiable_programming which might help.
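For reference, the arithmetic behind the two kernels in the example can be written out in plain Python. This is not Taichi code and does not use autodiff; it is a hand-derived sketch of what x.grad should hold for loss = sum of x[i]**2, with arbitrary sample values for x.

```python
def compute_loss(x):
    """Same reduction as the compute_loss kernel: sum of squares."""
    return sum(v ** 2 for v in x)


def grad_of_loss(x):
    """Hand-derived gradient: d(sum x_i^2) / dx_i = 2 * x_i."""
    return [2.0 * v for v in x]


def do_some_works(x, grad):
    """Same update as the do_some_works kernel: x[i] -= x.grad[i]."""
    return [v - g for v, g in zip(x, grad)]


x = [1.0, -2.0, 0.5]            # arbitrary sample values
loss = compute_loss(x)          # 1 + 4 + 0.25
grad = grad_of_loss(x)          # plays the role of x.grad
x_new = do_some_works(x, grad)  # each entry becomes x_i - 2 * x_i = -x_i
```

Comparing x.grad from the Taichi run against `grad_of_loss` on the same values is a quick sanity check that the tape recorded the kernel.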

erizmr avatar Aug 09 '22 03:08 erizmr

Hi @erizmr sorry, maybe there was a misunderstanding - I know how to use backward computation, I was simply wondering if there was an example on using it with AOT. (Very excited about forward mode btw.)

@jim19930609 Thank you for the cache info and comet example, those are very useful, and this looks like very cool functionality. If I may ask a few follow-up questions:

  1. I see the .ll files created in the cache folder, with name mangling. I am not sure how to navigate this. If I want to just call a particular compiled kernel again from python, in a different file (without the original definition), what is the best way to do this? Also, I am going to guess that this will throw errors if the kernels refer to globals that are not present?
  2. Is there a way to specify the file that this cache is set to? I see a reference to get_repo_dir() in https://github.com/taichi-dev/taichi/blob/master/taichi/program/compile_config.h but I'm not sure how to set this.

By the way, I know the conversation has diverged and this ticket was closed, but was the original issue resolved (and how)?

aespielberg avatar Aug 16 '22 03:08 aespielberg

Hi @aespielberg, for the original issue, can you give the following code a try? I was able to get "record.yml" locally and was wondering if you can reproduce this result:

import taichi as ti


ti.aot.start_recording('record.yml')
ti.init(arch=ti.cc)
loss = ti.field(float, (), needs_grad=True)
x = ti.field(float, 233, needs_grad=True)

@ti.kernel
def compute_loss():
   for i in x:
       loss[None] += x[i]**2

@ti.kernel
def do_some_works():
   for i in x:
       x[i] -= x.grad[i]

with ti.ad.Tape(loss):
   compute_loss()
do_some_works()

ti.aot.stop_recording()

The file obtained locally: record.zip

Let me re-open this issue until it's verified to work.

jim19930609 avatar Aug 16 '22 04:08 jim19930609

As for the other 2 issues:

  1. Loading the cached kernel in a separate file and then executing it sounds more like AOT, which is slightly different from OfflineCache (AOT has more flexibility). @ailzhang, please correct me if I am wrong, but I don't think we have exposed the AOT runtime interfaces to Python. I feel this would be a fairly useful feature; could you file a feature request with us?
  2. You can set the cache directory in the following manner: ti.init(arch=ti.cpu, offline_cache_file_path="/tmp/aot/")
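The two cache options mentioned in this thread can be collected in one place before calling ti.init. The sketch below uses only the standard library so that it runs without Taichi; `offline_cache_options` is a hypothetical helper, and creating the directory up front is a precaution, not a documented requirement.

```python
import tempfile
from pathlib import Path


def offline_cache_options(cache_dir):
    """Create cache_dir if needed and return keyword arguments for ti.init."""
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)
    return {"offline_cache": True, "offline_cache_file_path": str(path)}


# Demo on a temporary directory instead of /tmp/aot/.
with tempfile.TemporaryDirectory() as tmp:
    options = offline_cache_options(Path(tmp) / "aot")
    created = (Path(tmp) / "aot").is_dir()

# With Taichi installed, the options would be passed as:
#   ti.init(arch=ti.cpu, **options)
```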

jim19930609 avatar Aug 16 '22 04:08 jim19930609

As for autodiff, I haven't tried any AOT demo with autodiff so far, and I feel like this is another feature request for AOT. @ailzhang @erizmr, do we have any thoughts or future plans regarding autodiff with AOT?

jim19930609 avatar Aug 16 '22 04:08 jim19930609

I get an empty record.yml and the following output:

$ python test.py
[Taichi] version 1.0.4, llvm 10.0.0, commit 2827db2c, linux, python 3.9.7
[I 08/16/22 01:32:38.688 2259730] [action_recorder.cpp:start_recording@26] ActionRecorder: start recording to [record.yml]
[W 08/16/22 01:32:38.700 2259730] [misc.py:adaptive_arch_select@747] Arch=[<Arch.cc: 3>] is not supported, falling back to CPU
[Taichi] Starting on arch=x64
[I 08/16/22 01:32:39.009 2259730] [action_recorder.cpp:stop_recording@33] ActionRecorder: stop recording

aespielberg avatar Aug 16 '22 05:08 aespielberg

Interesting, I was using a Linux machine with a nightly Taichi wheel locally.

May I know what OS and Taichi version you are using, so that I can reproduce this?

jim19930609 avatar Aug 16 '22 05:08 jim19930609

This is Taichi 1.0.4; Ubuntu 20.04.3 LTS; running in Anaconda Python 3.9.7.

aespielberg avatar Aug 16 '22 06:08 aespielberg

Verified that ti.aot.start_recording() and ti.aot.stop_recording() generate an empty YAML file with Taichi 1.0.4 and 1.1.0. This is because we turned off the TI_WITH_CC option when building the release package. One possible solution is to build Taichi from source: https://docs.taichi-lang.org/docs/dev_install.
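Both symptoms reported in this thread can be checked mechanically: the fallback warning in the log and the empty output file. The following is a minimal sketch with hypothetical helper names; the marker string is taken from the log output posted above and may differ in other Taichi versions.

```python
import os
import tempfile

# Substring of the warning printed when the requested arch is unavailable.
FALLBACK_MARKER = "is not supported, falling back to CPU"


def fell_back_to_cpu(log_text):
    """Return True if the captured log contains the fallback warning."""
    return FALLBACK_MARKER in log_text


def recording_is_empty(path):
    """Return True if the recording file is missing or has zero bytes."""
    return not os.path.exists(path) or os.path.getsize(path) == 0


log = ("[W 08/16/22 01:32:38.700 2259730] [misc.py:adaptive_arch_select@747] "
       "Arch=[<Arch.cc: 3>] is not supported, falling back to CPU")
fallback = fell_back_to_cpu(log)

# Demo with an empty stand-in file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "record.yml")
    open(empty_path, "w").close()
    empty = recording_is_empty(empty_path)
```

If both checks return True for a real run, the installed wheel was most likely built without the C backend, which matches the explanation above.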

Or, to make your life easier, I can send you a working Python wheel by email (it is too large for GitHub's file size limit). If you'd prefer a pre-built wheel, please let me know the Python version you are using.

jim19930609 avatar Aug 16 '22 06:08 jim19930609