leanprover-community.github.io icon indicating copy to clipboard operation
leanprover-community.github.io copied to clipboard

stub page on NL translation

Open robertylewis opened this issue 2 years ago • 1 comments

This will be linked to from https://github.com/leanprover-community/doc-gen/pull/168

cc @zhangir-azerbayev , any suggestions for the wording?

robertylewis avatar Aug 21 '22 21:08 robertylewis

Here's a rough suggestion. Let me know if you want it to be shorter/longer or if there are parts that are unclear.

The mathlib docs is testing a beta feature that generates informal translation of mathlib theorem statements using OpenAI's Codex language model. To see the informal version of a theorem, click on the "informal statement" drop down below each lemma and theorem. These informal translations are automatically generated by a machine learning algorithm and should not be trusted.

We would greatly appreciate your feedback on the quality of informal translations so that we can improve this system in the future.

Codex is a causal language model, which means that given a string of text, its objective is to predict the next word. The model is trained to maximize next word prediction accuracy on the corpus of all Github source code. By iterating the next-word prediction task on a user's input, Codex can generate coherent text and code. To instruct Codex to perform the informal translation task, we used a technique called few shot prompting. Suppose we want the model to translate your favorite informal theorem into natural language. We would show the model a few (in our case, four) examples of formal-informal pairs, followed by the formal statement of your favorite theorem as:

Lean mathlib version: theorem exists_le_sylow {p : ℕ} {G : Type*} [group G] {P : subgroup G} (hP : is_p_group p P) : ∃ (Q : sylow p G), P ≤ Q := Translate the Lean mathlib version to a natural language version: "Let $P$ be a $p$-subgroup of $G$. Then $P$ is contained in a Sylow $p$-subgroup of $G$."

Lean mathlib version: theorem subset_of_open_subset_is_open (X : Type*) [topological_space X] (A : set X) (hA : ∀ x ∈ A, ∃ U : set X, is_open U ∧ x ∈ U ∧ U ⊆ A): is_open A := Translate the Lean mathlib version to a natural language version: "Let $X$ be a topological space; let $A$ be a subset of $X$. Suppose that for each $x\in A$ there is an open set $U$ containing $x$ such that $U\subset A$. Then $A$ is open in $X$."

-- A few more examples go here

Lean mathlib version: -- Your favorite formal theorem goes here

Picking up on the pattern, Codex will try to complete the text with an informal translation of your favorite formal theorem.

zhangir-azerbayev avatar Aug 21 '22 22:08 zhangir-azerbayev

It seems this PR did not converge and is no longer relevant, so I will close it. But do not hesitate to reopen if this project is still moving.

PatrickMassot avatar Jul 18 '24 15:07 PatrickMassot