
Will skipping the GPEIperSecChooser chooser break it?

Open Quanticles opened this issue 11 years ago • 1 comments

I'm using 4x threads + GPEIperSecChooser with cuda-convnet, where I'm letting it pick the network layer sizes as some of its variables. It's doing a great job of picking fast jobs, but once the number of completed iterations nears 100, spearmint becomes the bottleneck instead of cuda-convnet. At that point all 4 GPUs sit idle while waiting for spearmint to pick one job.

Ideally I wish I could push the chooser into its own thread and maybe use different types of choosers in parallel, but as a simpler solution I'm trying to keep the GPUs busy by picking jobs randomly when there are multiple open threads.

If I sometimes bypass chooser.next in main.attempt_dispatch, will this break GPEIperSecChooser? It looks like there isn't any internal state being saved when chooser.next is called, but I was hoping to double-check here.

I'm replacing this:

        # Ask the chooser to pick the next candidate
        log("Choosing next candidate... ")
        job_id = chooser.next(grid, values, durations, candidates, pending, complete)

with this (borrowing from RandomChooser):

        # Ask the chooser to pick the next candidate
        log("Choosing next candidate... ")
        job_id = 0
        if complete.shape[0] < 2:
            print "Initialization pick"
            job_id = int(candidates[0])
        elif options.max_concurrent - n_pending > 1:
            print "Too many open threads - choosing randomly"
            job_id = int(candidates[int(np.floor(candidates.shape[0]*npr.rand()))])
        else:
            print "Using chooser to pick"
            job_id = chooser.next(grid, values, durations, candidates, pending, complete)

It wouldn't be very useful to run these random jobs if the gathered information will not be used by the GPEIperSecChooser.

Thanks

Quanticles avatar Jan 17 '14 20:01 Quanticles

Aha, that's interesting! This should actually be fine. I have messed around with strategies like this as well. The chooser will not alter the state of the grid. Each time 'chooser.next' is called it will basically just look at which jobs are finished and which are currently running and propose a new job. Adding randomly spawned jobs will certainly be very informative to the model. If you're already doing exploration with random jobs though, it might make sense to switch over to the standard GPEIOptChooser.
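A minimal sketch of the hybrid dispatch discussed in this thread, written as a standalone Python 3 function rather than as a patch to spearmint. The names `pick_job`, `chooser_next`, and `rng` are hypothetical and not part of spearmint; `chooser_next` stands in for a zero-argument wrapper around `chooser.next(grid, values, durations, candidates, pending, complete)`. Because the decision depends only on the arguments passed in, skipping the chooser on some calls leaves nothing stale for later calls.

```python
import numpy as np


def pick_job(candidates, n_complete, n_pending, max_concurrent,
             chooser_next, rng=None):
    """Return (job_id, source) for the next job to dispatch.

    chooser_next is a hypothetical zero-argument callable that wraps the
    real (slow) chooser call; it is only invoked when one slot is free.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Too few completed jobs to fit a model: take the first candidate.
    if n_complete < 2:
        return int(candidates[0]), "init"

    # More than one idle worker: fill a slot with a cheap random pick.
    if max_concurrent - n_pending > 1:
        return int(rng.choice(candidates)), "random"

    # Exactly one idle worker: pay for the model-based choice.
    return int(chooser_next()), "chooser"


# Usage with a stub chooser that always proposes candidate 8.
candidates = np.array([3, 5, 8, 13])
stub_chooser = lambda: 8
print(pick_job(candidates, n_complete=10, n_pending=3,
               max_concurrent=4, chooser_next=stub_chooser))
```

The randomly dispatched jobs still end up in the completed set, so the next real chooser call sees their results like any other finished job.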

I should add that there is a fantastic collaborative effort that is developing (and has already developed) better ways to search the space of architectures: http://www.cs.toronto.edu/~kswersky/wp-content/uploads/hier-kern-workshop.pdf Hopefully we can release this code relatively soon.


JasperSnoek avatar Jan 17 '14 21:01 JasperSnoek