fast-tsetlin-machine-with-mnist-demo icon indicating copy to clipboard operation
fast-tsetlin-machine-with-mnist-demo copied to clipboard

Faster tm_initialize_random_streams

Open jcriddle4 opened this issue 5 years ago • 4 comments

On my system this greatly sped up processing by over 2X. I am using large values for S. Hopefully I haven't optimized so far that it is actually wrong. Note: I also changed S to be an integer and it will scale somewhat differently(see comment below).

static inline void tm_initialize_random_streams(struct TsetlinMachine *tm, int s)
{
	int next = 0;
	int r = random() % (s+1);
	for(int k=0; k<LA_CHUNKS; ++k){
		(*tm).feedback_to_la[k] = 0;

		while((next + r) / INT_SIZE <= k){
			if(r == s){
				next += (r - 1);  //Skip setting any bits.
			} else {
				next += r;
				(*tm).feedback_to_la[k] |= (1 << (next % INT_SIZE));
			}

			r = random() % (s+1);
		}
	}
}

jcriddle4 avatar Apr 03 '19 03:04 jcriddle4

If we want an approximately similar distribution to the old code and if my binomial distribution calculations are correct for a value like S=10 the code would look something like this. Is this over thinking it? Do we care if S's values scale differently?

    ...
    r = random() % (s + 4);
    ...
    if(r >= s){ 
        next += (s - 1);  //Skip setting any bits.
    }
....

If we still want to use floating point for S then maybe the code would be

...
 r = ((float)rand()/RAND_MAX) * s * 1.3486784401;  //binomial distribution calculation on no bits being set. S=10.0
 ...
 if(r >= s){ 
     next += (truncate_to_integer(s) - 1);  //Skip setting any bits.
 } else {
   next += truncate_to_integer(r); 
   la_feedback |= truncate_to_int(next % INT_SIZE);
 ....

jcriddle4 avatar Apr 03 '19 04:04 jcriddle4

Hi @jcriddle4! This looks very interesting. Thanks! Look forward to try out these modifications. Did you try it out on the MNIST dataset?

olegranmo avatar Apr 11 '19 05:04 olegranmo

On the MNIST dataset using the integer version with  r = random() % (s + 4);...
TimeBefore    TimeAfter    TestBefore   TestAfter
378.1              164.6          93.9             94.0
232.4              126.6          95.1             95.1
203.3              123.4          96.2             95.8
189.0              122.7          96.6             96.0
178.6              121.3          96.9             96.2
166.3              121.9          97.2             96.3

jcriddle4 avatar Apr 13 '19 05:04 jcriddle4

Edited: The original optimized version seemed to converge slower toward epoch 6 for MNIST. I switched to s=11 and then used this code and it matches or exceeds for the first 6 epochs. I also think it should be next += s instead of next += (s - 1);

static inline void tm_initialize_random_streams(struct TsetlinMachine *tm, int s)
{
	int next = -s;  //We were slightly off for setting the first few bits so we backed up to start
	int r = random() % (s + 4);
	for(int k=0; k<LA_CHUNKS; ++k){
		(*tm).feedback_to_la[k] = 0;

		while((next + r) / INT_SIZE <= k){
			if(r >= s){
				next += s;  //Skip setting any bits.
			} else {
				next += r;
				if(next >= 0) {
					(*tm).feedback_to_la[k] |= (1 << (next % INT_SIZE));
				}
			}

			r = random() % (s + 4);
		}
	}
}

jcriddle4 avatar Apr 13 '19 22:04 jcriddle4