PINK
Very large data dimension size when initialising training
Hi,
I'm trying to train an SOM (just to get used to how PINK works, hence the tiny file numbers and parameters below), but when I start the training run, my output log file grows at a rate of ~1 GB every 30 seconds, filling up with information like that shown in the quote below.
Number of data entries = 1096810496 Data dimension = 1097859072 x 1100480512 x 1102053376 x 1101004800 x 1099956224 x 1099431936 x 1101004800 x 1097859072 x 1098907648 x 1101004800 x 1101004800 x 1103101952 x 1104674816 x 1103101952 x 1105199104 x 1102577664 x 1101529088 x 1101004800 x 1098907648 x 1100480512 x 1101529088 x 1102053376 x 1094713344 x 1093664768 x 1103626240 x 1099956224 x 1088421888 x 1101004800 x 1099431936 x 1099956224 x 1104150528 x 1095761920 x 1102577664 x 1101529088 x 1101004800 x 1099956224 x 1099431936 x 1101529088 x 1103626240 x 1101004800 x 1101004800 x 1102577664 x 1097859072 x 1098907648 x 1101529088 x 1101004800 x 1097859072 x 1102053376 x 1099956224 x 1096810496 x 1101004800 x 1098907648 x 1097859072 x 1100480512 x 1100480512 x 1098907648 x 1101529088 x 1096810496 x 1097859072 x 1095761920 x 1099956224 x 1091567616 x 1098907648 x 1101529088 x 1099431936 x 1092616192 x 1096810496 x 1101529088 x 1099431936 x 1096810496 x 1100480512 x 1106247680 x 1101529088 x 1103101952 x 1101529088 x 1102053376 x 1102053376 x 1097859072 x 1101004800 x 1095761920 x 1096810496 x 1104674816 x 1097859072 x 1099956224 x 1102053376 x 1097859072 x 1099956224 x 1104674816 x 1099956224 x 1109917696 x 1094713344 x 1097859072 x 1111490560 x 1100480512 x 1102577664 x 1113325568 x 1099431936 x 1103626240 x 1113325568 x 1101529088 x 1103626240 x 1113849856 x 1105199104 x 1100480512 x 1111752704 x 1098907648 x 1097859072 x 1108606976 x 1096810496 x 1096810496 x 1104674816 x 1102053376 x 1097859072 x 1106771968 x 1101529088 x 1099956224 x 1102577664 x 1101004800 x 1096810496 x 1100480512 x 1097859072 x 1096810496 x 1103626240 x 1096810496 x 1100480512 x 1092616192 x 1099431936 x 1102577664 x 1091567616 x 1097859072 x 1098907648 x 1097859072 x 1098907648 x 1098907648 x 1094713344 x 1096810496 x 1099956224 x 1099956224 x 1098907648 x 1099431936 x 1105723392 x 1101529088 x 1099431936 x 1108344832 x 1101529088 x 1100480512 x 1105723392 x 1098907648 x 1098907648 x 
1104150528 x 1103101952 x 1103101952 x 1107296256 x 1099431936 x 1098907648 x 1104150528 x 1098907648 x 1100480512 x 1104150528 x 1103626240 x 1099956224 x 1103626240 x 1097859072 x 1102053376 x 1099956224 x 1100480512 x
I set the training run going using the following:
$Pink --train _scripts/test.bin _pink_out/som.bin --som-width 10 --som-height 10 --num-iter 1 --numrot 4
So as you can see, I'm keeping this little training run simple, with a small grid size and only 4 rotations per image.
This training run uses just 3 images of size 128x128, in float32 format, so I'm confused as to why the number of data entries in the first line of the quote above is so high.
I'd really appreciate any help you can give on this!
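One thing I noticed while poking at this (I don't know if it's relevant): the integers in the log all fall in a narrow range, as if they were float32 bit patterns being printed as integers. Reinterpreting the first one as a little-endian float32 gives an ordinary pixel-sized value:

```python
import struct

# Reinterpret the first logged integer as the raw bits of a float32
# (little-endian, matching my x86 machine)
bits = 1096810496
value = struct.unpack('<f', struct.pack('<i', bits))[0]
print(value)  # 14.0
```

So it looks like PINK may be reading my raw pixel data as if it were header metadata, which would explain the absurd counts.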
Extra info
My images are stored in test.bin, written using the following NumPy raw-binary snippet:
import numpy as np

filename = "test.bin"
# stacked_images is a float32 array holding the 3 images
with open(filename, mode='wb') as fileobj:
    stacked_images.tofile(fileobj)
... where stacked_images has shape (128, 128, 3) or (3, 128, 128). (I've tried both orderings to see if that was the problem, but the issue above persists.)
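If the fix turns out to be a missing header before the pixel data, this is the kind of thing I'd try. Note I'm guessing at the field layout here (image count, channel count, width, height as int32), and I haven't confirmed it against the PINK docs for my version:

```python
import numpy as np

# Hypothetical header layout -- my guess, not confirmed against the PINK docs:
# four little-endian int32 fields, followed by the raw float32 pixel data.
stacked_images = np.zeros((3, 128, 128), dtype=np.float32)  # placeholder data

with open("test_with_header.bin", "wb") as fileobj:
    header = np.array([3, 1, 128, 128], dtype=np.int32)  # images, channels, width, height
    header.tofile(fileobj)
    stacked_images.astype(np.float32).tofile(fileobj)
```

The resulting file is 16 bytes of header plus 3 * 128 * 128 * 4 bytes of pixel data; if anyone can confirm the header layout PINK actually expects, that would be very helpful.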