numphp
numphp copied to clipboard
Convert kmeans from numpy to numphp
Hi, i tried to port a kmeans algorithm from python to php using your numpy like library. Somehow I do not fully understand how to operate matrices correctly using your lib. Are the Parameters axis, keepdims of sum compatible to numpy? Can you help me get this code running with your lib? I put the Python version into comments.
function kmeans(NdArray $X, $K, $maxIter = 20, $beta = 1.0){
list($N, $D) = $X->shape;
$R = np::zeros($N, $K); // Responsibility Matrix
$exponents = np::nulls($N, $K); // Exponents
// initialize M with random values of X
$M = randomize($X, $K, $N);
for($i=0; $i< $maxIter; $i++){
// step 1: determine assignments / resposibilities
for($k=0; $k < $K; $k++){
for($n=0; $n < $N; $n++){
// Python: exponents[n,k] = np.exp(-beta*d(M[k], X[n]))
$exponents[$n][$k] = np::exp(-$beta * distance($M[$k], $X[$n]));
// Python: R = exponents / exponents.sum(axis=1, keepdims=True)
$R = $exponents->divide($exponents->sum());
}
}
// step 2: recalculate means
for($k=0; $k < $K; $k++){
// Python: M = R.T.dot(X) / R.sum(axis=0, keepdims=True).T
$M[$k] = np::transpose($R)->dot($X)->divide(np::transpose($R->sum()));
}
}
echo $M; die;
}
function randomize(NdArray $X, $K, $N)
{
$randomMeans = [];
for($k=0;$k < $K; $k++){
$randomMeans[$k] = $X[rand(0, $N-1)]->data;
}
return np::ar($randomMeans);
}
Hi @bluescreen,
Several modifications have been made to help a k-mean implementation.
Axisandkeepdimsparameters are now supported for thesum()method.
$a = np::arange(9)->reshape(3, 3);
print $a;
// sum all
print $a->sum() . "\n";
// sum axis 0
print $a->sum(0);
// sum axis 1
print $a->sum(1);
// sum all and keepdims
print $a->sum(null, true);
// sum axis 0 and keepdims
print $a->sum(0, true);
// sum axis 1 and keepdims
print $a->sum(1, true);
// Another way is to call sum() method as a np utility and give it a
// matrix as first parameter.
// Below, it's a sum on axis 0 with keepdims = true
print np::sum($a, 0, true);
Would output:
[[0 1 2]
[3 4 5]
[6 7 8]]
36
[9 12 15]
[3 12 21]
[[36]]
[[9 12 15]]
[[3 ]
[12]
[21]]
[[9 12 15]]
- Linalg
norm()method
$a = np::ar([0, 1, 0]);
$b = np::ar([1, 1, 0]);
print $a;
print $b;
// Euclidean distance
print np::linalg()->norm(
$b->subtract($a)
);
Would output:
[0 1 0]
[1 1 0]
1
power()method
$a = np::ar([sqrt(2), 2, 3]);
print $a;
// square a matrix, element-wise
print $a->power(2);
Would output:
[1.4142135623731 2 3 ]
[2 4 9]
- Array and matrix indexes now start at index 0
$a = np::arange(9)->reshape(3, 3);
print $a;
// print first element. both will output 0
print $a['0,0'];
print $a->data[0][0];
// print last element. both will output 8
print $a['2,2'];
print $a->data[2][2];
All these updates are available with composer require sciphp/numphp:dev-master
and will be packaged with the next release (0.3.0).
SciPhp manual is not yet updated but will be soon.
Hope this helps.
Thanks a lot. That will help. Only the np.random functions are missing but I think I can implement them.
And I suggest to keep the phpunit at version 7 in composer.json or update the tests. Otherwise there are lots of deprecated messages.
The @expectedException, @expectedExceptionCode, @expectedExceptionMessage, and @expectedExceptionMessageRegExp annotations are deprecated. They will be removed in PHPUnit 9. Refactor your test to use expectException(), expectExceptionCode(), expectExceptionMessage(), or expectExceptionMessageRegExp() instead.
There also seems to be missing a case for dividing matrices vertically. When I try
$X = np::ar([[1,2,3],
[4,5,6],
[7,8,9]]);
$Y = np::ar([[1],[2],[3]]);
echo $X->divide($Y);
It throws InvalidArgumentException : Matrices are not aligned.
Result should be:
[[1. 2. 3. ]
[2. 2.5 3. ]
[2.33333333 2.66666667 3. ]]
Could be implemented like this (not so efficient)
elseif ($m->ndim === $n->ndim && $shape_n[1] == 1) {
list($cols, $rows) = $m->shape;
$result = self::zeros($cols, $rows);
for ($i = 0; $i < $cols; $i++) {
for ($j = 0; $j < $rows; $j++) {
$result["$i,$j"] = $m["$i,$j"] / $n["$i,0"];
}
}
return $result;
}
NumPhp Manual
sum(), power() and indexing updates have been documented on https://sciphp.org
PHPUnit 8 warnings
- Exceptions are now binded with the recommended functions
- PHP 7.1+ uses
assertEqualsWithDelta()instead ofassertEquals()when appropriated (PHPUnit 7.5 and newest) - PHPUnit 5.7 and 6.5 are still supported, respectively for PHP 5.6 and PHP 7.0
Broadcasting
- A
np::broadcast_to()utility has been pushed and integrated to handle the missing case.
np::ar(
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]
)->divide(
[[1],
[2],
[3]]
);
is equivalent to
np::ar(
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]
)->divide(
[[1, 1, 1],
[2, 2, 2],
[3, 3, 3]]
);
Would return, as expected
[[1 2 3 ]
[2 2.5 3 ]
[2.3333333333333 2.6666666666667 3 ]]
np::broadcast_to() is not yet documented as it has to handle more cases and integrated in some other methods.
Thanks for this nice catch 👍
Great! Now only np.random functions are missing for the library to be used for machine learning stuff! Thanks a lot!
You're right, random features are missing.
I saw that you implemented an interesting RandomTrait into bluescreen/numphp.
If you're interested in contributing to numphp, I would be very appreciative of it.
There are some little adjustments to make:
Random namespace
Random namespace should be SciPhp\Random, the RandomGenerator would become SciPhp\Random class.
Methods
Their implementations could be the same as LinAlg ones with a dedicated SciPhp/Random/ folder.
random()could be put there https://github.com/sciphp/numphp/blob/master/src/SciPhp/NumPhp/ExtensionsTrait.php- A
SciPhp\Randomwrapper like https://github.com/sciphp/numphp/blob/master/src/SciPhp/LinAlg.php - A
SciPhp\Random\Decoratordecorator to load traits https://github.com/sciphp/numphp/blob/master/src/SciPhp/LinAlg/Decorator.php - And then, the traits could reside in
SciPhp\Random
If you want to discuss about other implementations details, feel free to contact me on https://cybre.space/@landrok