
Multidimensional input support? (e.g. CNN channels)

Open kwhitley opened this issue 7 years ago • 9 comments

Not sure if this is supported. Mind clarifying, and giving a quick example of how to structure the input?

Thanks!!

kwhitley avatar Mar 15 '18 16:03 kwhitley

At the moment, the input layer is just an FC layer which does not carry out any summing/activations. Input therefore needs to be flattened into a one-dimensional array.
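As a minimal sketch of that flattening step, assuming the image arrives as an array of rows (the helper name flattenImage is hypothetical and not part of jsNet):

```javascript
// Flatten a 2D image (an array of rows) into one 1D array, row by row.
// flattenImage is a hypothetical helper, not a jsNet function.
function flattenImage(rows) {
    const flat = []
    for (const row of rows) {
        for (const value of row) {
            flat.push(value)
        }
    }
    return flat
}

// A placeholder 28x28 grayscale image filled with zeros.
const blankImage = Array.from({length: 28}, () => new Array(28).fill(0))
const flatInput = flattenImage(blankImage)

console.log(flatInput.length) // 784
```

The resulting 784-element array is the kind of value that would go into the input field of a training example.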

For example (still keeping MNIST in mind), this is a (very unoptimized) conv net, where the 28x28 images are passed in as 784-long arrays:

const net = new Network({
    Module: Module,
    layers: [
        new FCLayer(784), // input: a flattened 28x28 image
        new ConvLayer(1, {filterSize: 5, zeroPadding: 2, stride: 1, activation: "lrelu"}),
        new FCLayer(784, {activation: "sigmoid"}),
        new FCLayer(10) // output: one value per digit class
    ],
    learningRate: 0.01,
    lreluSlope: -0.0005,
    l2: true
})

This is due to jsNet originally only implementing MLPs. Would you prefer a way to give input as a 2d or 3d array?
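The layer sizes in the example are consistent with the standard convolution output-size formula, (input - filter + 2 * padding) / stride + 1. A small sketch of that arithmetic follows; the function name convOutputSpan is hypothetical, and whether jsNet computes the size exactly this way internally is an assumption:

```javascript
// Standard output-size formula for a square convolution.
// convOutputSpan is a hypothetical helper, not a jsNet function.
function convOutputSpan(inputSpan, filterSize, zeroPadding, stride) {
    return (inputSpan - filterSize + 2 * zeroPadding) / stride + 1
}

// Values from the example: 28x28 input, filterSize 5, zeroPadding 2, stride 1.
const outSpan = convOutputSpan(28, 5, 2, 1)

console.log(outSpan, outSpan * outSpan) // 28 784
```

With a zero padding of 2, a 5x5 filter leaves the map at 28x28, which matches the 784 units of the FC layer that follows the ConvLayer.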

DanRuta avatar Mar 15 '18 19:03 DanRuta

I mean, sure, if it could do matrix operations on multidimensional input, that would be great. What about the channels option you have for Network initialization? I assumed that it allowed for at least a 784x3 input shape (if flattening the 28x28 as you suggest).

kwhitley avatar Mar 15 '18 21:03 kwhitley

In the MNIST example, the images happen to be grayscale, so there's only 1 channel. You'd be right in thinking that channels would be set to 3 for RGB images, which have 3 channels.
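To make the notion of channels concrete, here is a sketch that splits interleaved pixel data (r, g, b, r, g, b, ...) into one array per channel. The helper name splitChannels is hypothetical, and the ordering jsNet expects for multi-channel input is not stated in this thread, so treat the layout as an assumption to check against the docs:

```javascript
// Split interleaved pixel values into one array ("plane") per channel.
// splitChannels is a hypothetical helper, not a jsNet function.
function splitChannels(pixels, channels) {
    const planes = Array.from({length: channels}, () => [])
    pixels.forEach((value, i) => planes[i % channels].push(value))
    return planes
}

// Two RGB pixels: pure red, then pure green.
const rgbPixels = [255, 0, 0, 0, 255, 0]
const rgbPlanes = splitChannels(rgbPixels, 3)

console.log(rgbPlanes) // [ [ 255, 0 ], [ 0, 255 ], [ 0, 0 ] ]
```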

And ok, I'll add volume as input to my TODO list. Working on v4.0, but with this, there may be enough for a v3.3.

DanRuta avatar Mar 15 '18 22:03 DanRuta

I've committed an initial implementation for this. You should now be able to pass in input data as a volume. You can try this out on the dev branch, for now, if you'd like. There's also an InputLayer now, which might optionally make the input layer clearer to set up (docs updated).

If you were going to use this for a particular public data set, let me know, and I could whip up some examples/demo files.

DanRuta avatar Mar 18 '18 13:03 DanRuta

Sounds great - I'll download and check it out. Regarding the ConvLayer... convolutional layers are 3D by nature, no? HxW with depth being the channels. Likewise the 3D filter (filter size in HxW with matching depth of the # channels) shifts over the 2D grid to convolve the matrix. So how does your convolution work if in 1D form?

I'm actually interested in using this for multivariate time series analysis (diff variables being in diff channels), so ideally inputting for me would be something like:

// number rows = number "channels"
[
  [a1, a2, a3... aN],
  [b1, b2, b3... bN],
  [c1, c2, c3... cN],
  [d1, d2, d3... dN],
  ...
]
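For illustration, a layout like the one above could be built from per-timestep records as follows. This is only a sketch: the helper name toChannelRows and the variable names temp and humidity are made up for the example.

```javascript
// Turn an array of per-timestep records into one row per variable ("channel").
// toChannelRows is a hypothetical helper; the field names are invented.
function toChannelRows(records, variables) {
    return variables.map(name => records.map(record => record[name]))
}

const timesteps = [
    {temp: 20, humidity: 0.4},
    {temp: 21, humidity: 0.5},
    {temp: 19, humidity: 0.6}
]

const channelRows = toChannelRows(timesteps, ["temp", "humidity"])

console.log(channelRows) // [ [ 20, 21, 19 ], [ 0.4, 0.5, 0.6 ] ]
```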

Also, and slightly unrelated, I'm under the impression that a single value output defaults to logistic or sigmoid or something, and any array defaults to softmax. That would work for most cases, but do you have the ability to output a vector or matrix of, say, logistic activations? This would certainly be handy in multivariate regression, for instance.
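The difference between the two output styles can be sketched in plain JavaScript. This is independent of jsNet; the function names are just for the example:

```javascript
// Logistic (sigmoid) activation: each output is squashed independently into (0, 1).
function sigmoid(x) {
    return 1 / (1 + Math.exp(-x))
}

// Softmax: outputs are coupled and always sum to 1.
// Subtracting the max first avoids overflow in Math.exp.
function softmax(values) {
    const max = Math.max(...values)
    const exps = values.map(v => Math.exp(v - max))
    const total = exps.reduce((a, b) => a + b, 0)
    return exps.map(e => e / total)
}

const weightedSums = [2, 2, -1]
const independentOutputs = weightedSums.map(sigmoid)
const coupledOutputs = softmax(weightedSums)

console.log(independentOutputs) // equal inputs give equal outputs; no fixed total
console.log(coupledOutputs)     // the values sum to 1
```

A vector of logistic outputs suits cases where several targets can be high at once, while softmax forces the outputs to compete.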

Thanks again!

kwhitley avatar Mar 21 '18 22:03 kwhitley

K, so I don't quite follow how your multi-dimension syntax works for defining the InputLayer

you have:

new InputLayer(3, { span: 5 }) // 75 inputs

How does that work exactly?

Other libraries usually define an input "shape" vector or something... like [3,2,10] with each scalar pertaining to a dimension.

Could we parse an array as the multidimensional shape if it's passed as the first param? (rather than the implied 1D input shape if it's an integer)
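A rough sketch of what such parsing could look like, accepting either an integer or a shape array. The function name inputCount is hypothetical and this is not how jsNet's InputLayer is implemented:

```javascript
// Accept either a plain integer (1D input) or a shape array, and return
// the total number of input values. Hypothetical helper, not jsNet API.
function inputCount(shape) {
    const isPositiveInt = d => Number.isInteger(d) && d > 0
    if (isPositiveInt(shape)) {
        return shape
    }
    if (Array.isArray(shape) && shape.length > 0 && shape.every(isPositiveInt)) {
        return shape.reduce((total, d) => total * d, 1)
    }
    throw new TypeError("shape must be a positive integer or a non-empty array of positive integers")
}

console.log(inputCount(784))        // 784
console.log(inputCount([3, 5, 5]))  // 75
console.log(inputCount([3, 2, 10])) // 60
```

Here [3, 5, 5] corresponds to 3 channels with a span of 5, giving the same 75 inputs as the InputLayer line quoted above.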

kwhitley avatar Mar 21 '18 23:03 kwhitley

Ah, I fixed the readme for the InputLayer. I think I mis-copy/pasted from dillinger. Let me know if changing it to a shape vector makes more sense. It would likely have just 2 elements, as filters are square, so the height/width is the same (so depth, and span). I do hope one day to get around to re-implementing the filters to work with different X and Y sizes, but right now it's just squares.

I removed the implicit softmax on the last layer as well. I've been meaning to change this for a while. It's now enabled via a {softmax: true} config in OutputLayer; otherwise, the layer returns the plain activations. It also means that an activation function can be enabled/disabled for the last layer.

Regarding the 1D conv question, just to clarify, do you mean when the input map (across however many channels) is 1 x 1 'pixels'? If so, I haven't tested any such data sets, but keeping zero padding at 0 and filterSize at 1, this in theory should just do a dot product of the input values across channels, against the filter weights (one per channel). Is this your question, and if so, the expected behaviour?
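The 1 x 1 case described here, with filterSize 1 and no zero padding, reduces to a dot product across channels. A sketch in plain JavaScript follows; the function name pointwiseConv and the optional bias term are assumptions for the example and are not taken from jsNet's code:

```javascript
// A 1x1 input map with filterSize 1: one weight per channel,
// so the convolution is a dot product across channels (before any activation).
// pointwiseConv and the bias parameter are hypothetical.
function pointwiseConv(channelValues, weights, bias = 0) {
    if (channelValues.length !== weights.length) {
        throw new Error("expected one weight per channel")
    }
    return channelValues.reduce((sum, value, i) => sum + value * weights[i], bias)
}

// 3 channels, one value each: 1*0.5 + 2*(-1) + 3*2
console.log(pointwiseConv([1, 2, 3], [0.5, -1, 2])) // 4.5
```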

Thanks for the feedback. Do let me know of things that need changing.

DanRuta avatar Mar 22 '18 19:03 DanRuta

Sorry it's taken so long to get to this, but about to dive in on the CNN testing now. With my data, I'm thinking multichannel 1D convolution would be best. Basically feed the time series window in as 1D arrays, but with each variable as separate channels. Would this be the best way to handle this?

Based on your docs, it appears that if I were to define multiple channels, the second "span" param implies the square dimension of a 2D grid. Any way to have multichannel 1D?

Secondly, assuming 1D multichannel support, how would you form the input data? Would it be like this (for a 2 channel, 128 width 1D convolution):

let example = {
  input: [
    [ 0, 1, 0, .. 1 ], // channel 1 = 128 width array
    [ 1, 1, 0, .. 0 ], // channel 2 = 128 width array
  ], 
  expected: [ 0, 1 ] // or whatever output
}

Thanks!!

kwhitley avatar Apr 07 '18 20:04 kwhitley

Having read more about 1D conv (I haven't personally used it before), I don't think that it would work with jsNet. At the moment, all filters and conv input are squares. If my understanding is correct, a 1D conv would need the filters and inputs to be a 1D array.

For this to work, there'd probably need to be a Conv1DLayer (and maybe Filter1D), or something similar, like in some other frameworks I've seen do this.
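To show what such a layer would have to compute, here is a standalone sketch of a multichannel 1D convolution in plain JavaScript. It does not use jsNet; the name conv1d is hypothetical, and like most frameworks it slides the filter without flipping it (strictly, a cross-correlation), with no padding:

```javascript
// Multichannel 1D convolution (no padding).
// input:  one row per channel, all of equal width
// filter: one row of weights per channel, all of equal size
// conv1d is a hypothetical standalone function, not part of jsNet.
function conv1d(input, filter, stride = 1) {
    if (input.length !== filter.length) {
        throw new Error("filter needs one row of weights per input channel")
    }
    const width = input[0].length
    const filterSize = filter[0].length
    const output = []

    // Slide the filter along the time axis.
    for (let start = 0; start + filterSize <= width; start += stride) {
        let sum = 0
        // Sum the weighted values over every channel and filter position.
        for (let c = 0; c < input.length; c++) {
            for (let k = 0; k < filterSize; k++) {
                sum += input[c][start + k] * filter[c][k]
            }
        }
        output.push(sum)
    }
    return output
}

// 2 channels, width 4; filter of size 2 per channel.
const series = [
    [1, 2, 3, 4],
    [0, 1, 0, 1]
]
const kernel = [
    [1, 1],
    [1, -1]
]

console.log(conv1d(series, kernel)) // [ 2, 6, 6 ]
```

The input here has the same shape as the two-channel example proposed above: one array per channel, all of the same width.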

DanRuta avatar Apr 10 '18 09:04 DanRuta