
Building a repository for 3D behavioral imaging data

Open joehand opened this issue 8 years ago • 4 comments

From @alexbw on November 18, 2014 3:49

Hi all, my name's Alex, I'm a friend of Max's in Boston. I'm a PhD student in neuroscience at Harvard, and I've built a tool which allows you to record the behavior of animals in 3D using the Microsoft Kinect. Most of my research revolves around what to actually DO with that data, but getting nicely processed data in the first place has been a challenge.

Here's an example of what the data looks like once it's processed and ready for analysis: (yes, that's a little blobby lab mouse running around, as detected by the Kinect, and the inset is the extracted and aligned mouse)

https://www.dropbox.com/s/3v7kjwwyfrjp02u/mouse_clip.mp4?dl=0

Turns out, looking at behavior quantitatively in this way is shockingly new and useful to neuroscientists at large, so a lot of folks have been asking to collaborate with us. We've been overwhelmed. So, we started to partner with some labs and companies to build a platform for recording, uploading, storing, sharing and analyzing this data.

As you might guess, our main problem is the size of the data. We need to get it to a central location for processing: right now it takes lots of computers crunching on hours of data to get results useful to researchers. We're working on making that more efficient, but for now we need much more than a desktop. And I just don't know an efficient way to get tens of gigabytes of data per day reliably to EC2 for storage and processing.

I know that Max has been working on this project with lots of brilliant people. I asked him on Twitter if we could talk about this problem, and he said to post an issue here. So, here's the issue!

To be clear, we:

  • Do need a centralized storage location, mostly because processing requires big guns.
  • Do need to be able to handle ~10GB uploads from single users (the data is already compressed; we're working on compressing it further)
  • Do need to be able to allow users to download the raw data, and accompanying processed data, at their convenience
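
For a rough sense of scale, the ~10GB-per-upload figure can be turned into transfer times. This is only a back-of-the-envelope sketch: it assumes decimal gigabytes and a fully sustained uplink, and the link speeds in the loop are hypothetical values chosen for illustration, not numbers from this post.

```python
def upload_hours(size_gb: float, uplink_mbps: float) -> float:
    """Hours needed to send size_gb (decimal GB) over a sustained
    uplink of uplink_mbps megabits per second."""
    bits = size_gb * 1e9 * 8              # gigabytes -> bits
    seconds = bits / (uplink_mbps * 1e6)  # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical uplink speeds, for illustration only.
    for mbps in (5, 20, 100):
        hours = upload_hours(10, mbps)
        print(f"{mbps:>4} Mbit/s: {hours:.2f} h for a 10 GB upload")
```

Real transfers will be slower than this estimate once protocol overhead, retries and shared links are taken into account.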

What do you think? Happy to answer any questions or provide more images/movies to illustrate.

Copied from original issue: maxogden/dat#219

joehand, Jun 17 '16 18:06