Investigate optimizer numerical correctness vs Python reference implementations
https://github.com/tensorflow/swift-apis/pull/758 adds Python TensorFlow reference implementations for optimizer numerical correctness.
This issue tracks numerical differences between the Swift optimizer implementations and those reference implementations. See the TF-759 references in Tests/TensorFlowTests/OptimizerTests.swift for the specific occurrences.
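For context, the comparison pattern in those tests looks roughly like the following sketch. The tensor values, gradient, and step count here are hypothetical placeholders, not the inputs used in OptimizerTests.swift:

```swift
import TensorFlow

// Hypothetical parameters and gradient (placeholders, not the actual test inputs).
var values = Tensor<Float>([0, 0, 0])
let gradient = Tensor<Float>([0.1, 0.2, 0.3])

// Step a Swift optimizer and compare the result against the values produced by
// the Python reference implementations added in PR #758.
let optimizer = SGD(for: values, learningRate: 1e-3)
for _ in 0..<1000 {
    optimizer.update(&values, along: gradient)
}
print(values)  // compared against the output of Utilities/ReferenceImplementations/optimizers.py
```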
Some differences are larger than others. I think we should strive for exact numerical equality, if possible, given the same optimizer parameters.
Current examples:
- SGD(for: values, learningRate: 1e-3): big difference
  - Swift: [0.49999535, -0.10000112, -3.000017]
  - Python: [0.49999967, -0.00999999, -0.01999998]
- AdaGrad(for: values, learningRate: 1e-3, epsilon: 1e-7): big difference for the third value
  - Swift: [0.061354622, -0.057095252, -0.061786927]
  - Python: [0.06179592, -0.05709525, -0.05987222]
- AdaMax(for: values, learningRate: 1e-3, epsilon: 1e-7): small difference
  - Swift: [0.9999907, -0.99999064, -0.9999907]
  - Python: [0.99999076, -0.99999064, -0.99999064]
- Adam(for: values, learningRate: 1e-3, epsilon: 1e-7): smallest difference
  - Swift: [0.9999906, -0.9999898, -0.99999064]
  - Python: [0.9999907, -0.9999898, -0.9999904]
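For a concrete sense of the magnitudes involved, here is a small sketch (the two arrays are copied from the SGD example above) that computes the elementwise absolute difference between the Swift and Python outputs:

```swift
import TensorFlow

// Outputs reported above for SGD(for: values, learningRate: 1e-3).
let swiftResult = Tensor<Float>([0.49999535, -0.10000112, -3.000017])
let pythonResult = Tensor<Float>([0.49999967, -0.00999999, -0.01999998])

// Elementwise absolute difference between the two implementations.
print(abs(swiftResult - pythonResult))
// Roughly [4e-6, 0.09, 2.98]: the last two differences are far larger than
// Float32 rounding error, which points at an implementation difference rather
// than a precision difference.
```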
Is it possible that the small differences are simply each language's precision differences, since Swift's default Float precision is Float32 but Python's default float precision is Float64?
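To illustrate that hypothesis, here is a small sketch (not from the issue) that runs the same accumulation at Float and Double precision in Swift; the Float result matches the 100.99903 value shown at the end of this issue, while the Double result is much closer to 101:

```swift
// Accumulate 0.1 a thousand times at 32-bit and 64-bit precision.
var x32: Float = 1
var x64: Double = 1
for _ in 0..<1000 {
    x32 += 0.1
    x64 += 0.1
}
print(x32)  // 100.99903 -- Float32 rounding error is visible
print(x64)  // ~101.0 -- Float64 rounding error is orders of magnitude smaller
```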
The Python TensorFlow optimizer reference implementations use tf.float32 precision, which should match Swift: https://github.com/tensorflow/swift-apis/blob/b7a9e56efc08f683733433ba3c7eee4966570213/Utilities/ReferenceImplementations/optimizers.py#L16-L17
These example float32 programs produce exactly the same output, which gives me hope that exact numerical equality is attainable:
```python
import tensorflow as tf

# Accumulate 0.1 one thousand times in float32.
x = tf.constant(1, dtype=tf.float32)
dx = tf.constant(0.1, dtype=tf.float32)
for _ in range(1000):
    x += dx
print(x.numpy())
# 100.99903
```
```swift
// The same accumulation using Swift's 32-bit Float.
var x: Float = 1
let dx: Float = 0.1
for _ in 0..<1000 {
    x += dx
}
print(x)
// 100.99903
```