DojoInitError: Cannot find the *.ast.json file for Theorem
While I am trying to use LeanDjo to interact with LeanDojo programatically, below issue occurred. I think I don't have the *.ast.json file. How to get those files?
Run Code:
from lean_dojo import *
from ReProver.common import *
URL = "https://github.com/leanprover-community/mathlib4"
COMMIT = "20c73142afa995ac9c8fb80a9bb585a55ca38308"
repo = LeanGitRepo(URL, COMMIT)
theorem = Theorem(repo, thm.theorem.file_path, thm.theorem.full_name)
with Dojo(theorem) as (dojo, init_state):
dojo.run_tac()
Error:
During handling of the above exception, another exception occurred:
DojoInitError Traceback (most recent call last)
Cell In[15], [line 25](vscode-notebook-cell:?execution_count=15&line=25)
[21](vscode-notebook-cell:?execution_count=15&line=21) unproof_theorems.append((thm_name, id))
[23](vscode-notebook-cell:?execution_count=15&line=23) return proof_theorems, unproof_theorems
---> [25](vscode-notebook-cell:?execution_count=15&line=25) proof_theorems, unproof_theorems = verified_all_theorems(dojo_matching_theorems[116:200])
Cell In[15], [line 16](vscode-notebook-cell:?execution_count=15&line=16)
[13](vscode-notebook-cell:?execution_count=15&line=13) proof_ls = split_proof(commented_proof, pf_pattern)
[14](vscode-notebook-cell:?execution_count=15&line=14) proof_steps = get_single_step_proof(proof_ls, pattern, tac_pattern)
---> [16](vscode-notebook-cell:?execution_count=15&line=16) thm_name, proof_finished = verified_single_theorem(theorem, proof_steps)
[18](vscode-notebook-cell:?execution_count=15&line=18) if proof_finished:
[19](vscode-notebook-cell:?execution_count=15&line=19) proof_theorems.append((thm_name, id))
Cell In[10], [line 6](vscode-notebook-cell:?execution_count=10&line=6)
[5](vscode-notebook-cell:?execution_count=10&line=5) def verified_single_theorem(theorem: Theorem, proof_steps: List[str]):
----> [6](vscode-notebook-cell:?execution_count=10&line=6) with Dojo(theorem) as (dojo, init_state):
[7](vscode-notebook-cell:?execution_count=10&line=7) result = init_state
[8](vscode-notebook-cell:?execution_count=10&line=8) for t in proof_steps:
File ~/miniconda3/envs/ReProver/lib/python3.11/site-packages/lean_dojo/interaction/dojo.py:165, in Dojo.__enter__(self)
[163](https://file+.vscode-resource.vscode-cdn.net/home/user/~/miniconda3/envs/ReProver/lib/python3.11/site-packages/lean_dojo/interaction/dojo.py:163) traced_file = self._locate_traced_file(traced_repo_path)
[164](https://file+.vscode-resource.vscode-cdn.net/home/user/~/miniconda3/envs/ReProver/lib/python3.11/site-packages/lean_dojo/interaction/dojo.py:164) except FileNotFoundError:
--> [165](https://file+.vscode-resource.vscode-cdn.net/home/user/~/miniconda3/envs/ReProver/lib/python3.11/site-packages/lean_dojo/interaction/dojo.py:165) raise DojoInitError(
[166](https://file+.vscode-resource.vscode-cdn.net/home/user/~/miniconda3/envs/ReProver/lib/python3.11/site-packages/lean_dojo/interaction/dojo.py:166) f"Cannot find the *.ast.json file for {self.entry} in {traced_repo_path}."
[167](https://file+.vscode-resource.vscode-cdn.net/home/user/~/miniconda3/envs/ReProver/lib/python3.11/site-packages/lean_dojo/interaction/dojo.py:167) )
[169](https://file+.vscode-resource.vscode-cdn.net/home/user/~/miniconda3/envs/ReProver/lib/python3.11/site-packages/lean_dojo/interaction/dojo.py:169) self._modify_file(traced_file)
[171](https://file+.vscode-resource.vscode-cdn.net/home/user/~/miniconda3/envs/ReProver/lib/python3.11/site-packages/lean_dojo/interaction/dojo.py:171) # Run the modified file in a container.
DojoInitError: Cannot find the *.ast.json file for Theorem(repo=LeanGitRepo(url='https://github.com/leanprover-community/mathlib4', commit='20c73142afa995ac9c8fb80a9bb585a55ca38308'), file_path=PosixPath('src/lean/Init/Data/List/Erase.lean'), full_name='List.eraseP_cons_of_pos') in projects/.cache/lean_dojo/leanprover-community-mathlib4-20c73142afa995ac9c8fb80a9bb585a55ca38308/mathlib4.
System Version:
Lake version 5.0.0-306f361 (Lean version 4.17.0)
Lean (version 4.17.0, x86_64-unknown-linux-gnu, commit 306f36116535, Release)
lean-dojo v2.2.0
Python 3.11.8
Another thing, how could we dojo.run_tac() on multiple tactics or single line contain ;? Some seems to work while other doesn't.
Example:
- theorem
Finset.map_nsmul_piAntidiagcan run tactic\n classical rw [map_eq_image]; exact nsmul_piAntidiag _ _ hnbut if we split based on;and run_tac separately, it going to run into error. - Conversely, theorem
CommSemiring.toSemiring_injectivecannot run tactic\n rintro ⟨⟩ ⟨⟩ _; congr. However, if we split it and run separately, it going to lead to Proof_Finished. - For theorem
dist_le_Ico_sum_dist, how to run multiple line tactics like below:
induction n, h using Nat.le_induction with
| base => rw [Finset.Ico_self, Finset.sum_empty, dist_self]
| succ n hle ihn =>
calc
dist (f m) (f (n + 1)) ≤ dist (f m) (f n) + dist (f n) (f (n + 1)) := dist_triangle _ _ _
_ ≤ (∑ i ∈ Finset.Ico m n, _) + _ := add_le_add ihn le_rfl
_ = ∑ i ∈ Finset.Ico m (n + 1), _ := by
{ rw [Nat.Ico_succ_right_eq_insert_Ico hle, Finset.sum_insert, add_comm]; simp }
The code example is not runnable. I can take a look if you have a minimal runnable example to reproduce the error.
Usually the "Cannot find the *.ast.json file" error is caused by repos that are not properly traced (e.g., some processes were killed due to out-of-memory error when tracing the repo). Delete the traced repo from the cache and re-trace it (with larger memory if possible) may help.
Re
Another thing, how could we
dojo.run_tac()on multiple tactics or single line contain;? Some seems to work while other doesn't.Example:
- theorem
Finset.map_nsmul_piAntidiagcan run tactic\n classical rw [map_eq_image]; exact nsmul_piAntidiag _ _ hnbut if we split based on;and run_tac separately, it going to run into error.- Conversely, theorem
CommSemiring.toSemiring_injectivecannot run tactic\n rintro ⟨⟩ ⟨⟩ _; congr. However, if we split it and run separately, it going to lead to Proof_Finished.
A concrete, runnable piece of code would be helpful!
- For theorem
dist_le_Ico_sum_dist, how to run multiple line tactics like below:induction n, h using Nat.le_induction with | base => rw [Finset.Ico_self, Finset.sum_empty, dist_self] | succ n hle ihn => calc dist (f m) (f (n + 1)) ≤ dist (f m) (f n) + dist (f n) (f (n + 1)) := dist_triangle _ _ _ _ ≤ (∑ i ∈ Finset.Ico m n, _) + _ := add_le_add ihn le_rfl _ = ∑ i ∈ Finset.Ico m (n + 1), _ := by { rw [Nat.Ico_succ_right_eq_insert_Ico hle, Finset.sum_insert, add_comm]; simp }
This is a calculational proof in Lean (the calc keyword), which LeanDojo does not support.
The code example is not runnable. I can take a look if you have a minimal runnable example to reproduce the error.
Usually the "Cannot find the *.ast.json file" error is caused by repos that are not properly traced (e.g., some processes were killed due to out-of-memory error when tracing the repo). Delete the traced repo from the cache and re-trace it (with larger memory if possible) may help.
Thanks for your reply! Below is the code to reproduce.
from lean_dojo import *
from ReProver.common import *
URL = "https://github.com/leanprover-community/mathlib4"
COMMIT = "20c73142afa995ac9c8fb80a9bb585a55ca38308"
repo = LeanGitRepo(URL, COMMIT)
traced_repo = trace(repo)
thm = next(tt for tt in traced_repo.get_traced_theorems() if tt.theorem.full_name == "List.eraseP_cons_of_pos")
theorem = Theorem(repo, thm.theorem.file_path, thm.theorem.full_name)
with Dojo(theorem) as (dojo, init_state):
dojo.run_tac(init_state, "simp [eraseP_cons, h]")
Initially, I suspected about the memory bottleneck as well. But the problem is still persisted even thought scaling to 220GB RAM.
Looking at the theorem name, it should belonged to lean 4 repo itself but not the target traced_repo.
Re
Another thing, how could we
dojo.run_tac()on multiple tactics or single line contain;? Some seems to work while other doesn't. Example:
- theorem
Finset.map_nsmul_piAntidiagcan run tactic\n classical rw [map_eq_image]; exact nsmul_piAntidiag _ _ hnbut if we split based on;and run_tac separately, it going to run into error.- Conversely, theorem
CommSemiring.toSemiring_injectivecannot run tactic\n rintro ⟨⟩ ⟨⟩ _; congr. However, if we split it and run separately, it going to lead to Proof_Finished.A concrete, runnable piece of code would be helpful!
- For theorem
dist_le_Ico_sum_dist, how to run multiple line tactics like below:induction n, h using Nat.le_induction with | base => rw [Finset.Ico_self, Finset.sum_empty, dist_self] | succ n hle ihn => calc dist (f m) (f (n + 1)) ≤ dist (f m) (f n) + dist (f n) (f (n + 1)) := dist_triangle _ _ _ _ ≤ (∑ i ∈ Finset.Ico m n, _) + _ := add_le_add ihn le_rfl _ = ∑ i ∈ Finset.Ico m (n + 1), _ := by { rw [Nat.Ico_succ_right_eq_insert_Ico hle, Finset.sum_insert, add_comm]; simp }This is a calculational proof in Lean (the
calckeyword), which LeanDojo does not support.
Below is concrete code of above two cases.
First:
from lean_dojo import *
from ReProver.common import *
URL = "https://github.com/leanprover-community/mathlib4"
COMMIT = "20c73142afa995ac9c8fb80a9bb585a55ca38308"
repo = LeanGitRepo(URL, COMMIT)
traced_repo = trace(repo)
thm = next(tt for tt in traced_repo.get_traced_theorems() if tt.theorem.full_name == "Finset.map_nsmul_piAntidiag")
theorem = Theorem(repo, thm.theorem.file_path, thm.theorem.full_name)
tac = [\n classical rw [map_eq_image], exact nsmul_piAntidiag _ _ hn]
with Dojo(theorem) as (dojo, init_state):
state = dojo.run_tac(init_state, tac[0])
state = dojo.run_tac(state, tac[1])
Second:
thm = next(tt for tt in traced_repo.get_traced_theorems() if tt.theorem.full_name == "CommSemiring.toSemiring_injective")
theorem = Theorem(repo, thm.theorem.file_path, thm.theorem.full_name)
with Dojo(theorem) as (dojo, init_state):
state = dojo.run_tac(init_state, "\n rintro ⟨⟩ ⟨⟩ _; congr")
LeanDojo cannot be used to interact with theorems from the Lean 4 repo itself. See https://leandojo.readthedocs.io/en/latest/limitations.html#limitations
The classical in classical rw [map_eq_image]; exact nsmul_piAntidiag _ _ hn is a scoping tactic (see https://github.com/haruhisa-enomoto/mathlib4-all-tactics/blob/main/all-tactics.md#classical). It's equivalent to classical (rw [map_eq_image]; exact nsmul_piAntidiag _ _ hn) and should decompose to classical rw [map_eq_image] and classical exact nsmul_piAntidiag _ _ hn
In [4]: dojo, init_state = Dojo(theorem).__enter__()
2025-03-14 17:56:51.803 | DEBUG | lean_dojo.interaction.dojo:__enter__:241 - Initializing Dojo for Theorem(repo=LeanGitRepo(url='https://github.com/leanprover-community/mathlib4', commit='20c73142afa995ac9c8fb80a9bb585a55ca38308'), file_path=PosixPath('Mathlib/Algebra/Order/Antidiag/Pi.lean'), full_name='Finset.map_nsmul_piAntidiag')
2025-03-14 17:56:51.807 | DEBUG | lean_dojo.data_extraction.trace:get_traced_repo_path:219 - The traced repo is available in the cache.
2025-03-14 17:56:51.965 | DEBUG | lean_dojo.interaction.dojo:_modify_file:342 - Modifying `Mathlib/Algebra/Order/Antidiag/Pi.lean` into `/private/home/kaiyuy/.cache/lean_dojo/leanprover-community-mathlib4-20c73142afa995ac9c8fb80a9bb585a55ca38308/mathlib4/Mathlib/Algebra/Order/Antidiag/Piptgezrgv.lean`
In [5]: dojo.run_tac(init_state, "classical rw [map_eq_image]; exact nsmul_piAntidiag _ _ hn")
2025-03-14 17:57:19.177 | DEBUG | lean_dojo.interaction.dojo:_submit_request:461 - {"sid": 0, "cmd": "classical rw [map_eq_image]; exact nsmul_piAntidiag _ _ hn"}
Out[5]: ProofFinished(tactic_state_id=1, message='')
In [6]: dojo.run_tac(init_state, "classical rw [map_eq_image]")
2025-03-14 17:58:41.056 | DEBUG | lean_dojo.interaction.dojo:_submit_request:461 - {"sid": 0, "cmd": "classical rw [map_eq_image]"}
Out[6]: TacticState(pp="ι : Type u_1\ninst✝ : DecidableEq ι\ns : Finset ι\nm n : ℕ\nhn : n ≠ 0\n⊢ image (⇑{ toFun := fun x => n • x, inj' := ⋯ }) (s.piAntidiag m) =\n filter (fun f => ∀ i ∈ s, n ∣ f i) (s.piAntidiag (n * m))", id=2, message='')
In [7]: s1 = _
In [8]: dojo.run_tac(s1, "exact nsmul_piAntidiag _ _ hn")
2025-03-14 17:59:12.129 | DEBUG | lean_dojo.interaction.dojo:_submit_request:461 - {"sid": 2, "cmd": "exact nsmul_piAntidiag _ _ hn"}
Out[8]: LeanError(error='failed to synthesize\n DecidableEq (ι → ℕ)\nAdditional diagnostic information may be available using the `set_option diagnostics true` command.')
In [9]: dojo.run_tac(s1, "classical exact nsmul_piAntidiag _ _ hn")
2025-03-14 17:59:21.942 | DEBUG | lean_dojo.interaction.dojo:_submit_request:461 - {"sid": 2, "cmd": "classical exact nsmul_piAntidiag _ _ hn"}
Out[9]: ProofFinished(tactic_state_id=3, message='')
I'm not sure what's happening for the CommSemiring.toSemiring_injective example and will take a more detailed look when I have a chance (likely a few weeks later).