alphageometry
alphageometry copied to clipboard
Some mistake in AR system coding
In ar.py line 354-361:
new_column = np.zeros([len(self.v2i), 2]) # N, 2
for v, c in vc:
new_column[self.v2i[v], 0] += float(c) # [1]
new_column[self.v2i[v], 1] -= float(c) # [2]
self.A = np.concatenate([self.A, new_column], 1) # add two columns
self.c += [1.0, -1.0] # coeeficients of linprog minimization objective
self.deps += [dep]
it seems that self.c += [1.0, -1.0] shall be replaced by self.c += [1.0, 1.0]
Explaination:
The quoted code above is a part of the "Traceback for algebraic deduction", as mentioned in Nature volume 625, pages 476–482 (2024), i.e., the Google's AlphaGepmetry paper. Speaking more precisely, this "Traceback" is done by doing linear progression to minimize the function $\Sigma_{i}(x_{i}+y_{i})$ (1) with restriction $A\circ(x-y)=b$, where $x_{i}$, $y_{i}$ are positive real numbers (storaged as float); $A$ is an $m \times n matrix$; $x$, $y$, $b$ are $n$-dimensional vectors. $m$ is the cardinality of the set of geometric variables used in linear deduction, $n$ is the cardinality of the set of rules (i.e., linear combination that sums to zero). (1) is apparently the Lasso regression. We may write this in a more understandable way: minimize $\Sigma_{i}|z_{i}|$ (2) with restriction $A\circ z=b$.
In the code, [1] and [2] are adding the positive (i.e., vector x) and negative (i.e., vector y) version of a new rule; self.A is the matrix A, self.c is the goal of minimization. In the code the restriction is written rather as $A\circ x+ (-A)\circ y = b$, where "$A$" is part [1] and "$-A$" is part [2]. Therefore, the minimization shall be both "$1$" (i.e., "positive" minimization, do not require times $-1$) for [1] and [2].
The current code is minimizing $\Sigma_{i}(x_{i}-y_{i})$ (3). Which does not make sense since if one increase both x_{i} and y_{i}, the restriction will still be satisfied, and the minimization goal is also unaffected. To compare with (2), one may translate this to minimize: $\Sigma_{i}z_{i}$ (4), which is absurd.
\emph{Remark.} $\circ$ represents matrix multiplication.