Brian Chen
But wait, there's more! https://github.com/JuliaAI/StatisticalTraits.jl exists as well. We should try to get folks from each org on an ML call at some point to work all this out...
I don't even think Flux uses `normalise` from MLUtils (it still has its own copy, which should itself be moved to NNlib), so there may not be much to fixup...
The Flux function ought to be renamed to something, anything else. The existing name is just confusing. We can probably discuss that over in NNlib though.
I think my worries about a `cond`-like construct are twofold: 1. To avoid the `within_gradient` problem, it seems like you'd need specialized `cond`s for every kind of branch that might...
> 1. Wouldn't it be just `cond(somecondition && othercondition, () -> ..., () -> ...)`? Also, if you worry about compilation time due to multiple specializations, I believe `@nospecialize` should...
> Do you assume some other setup?

My understanding of `cond(flag, true_fn, false_fn)` is that it obeys the following truth table:

| differentiating? | flag | branch taken |
|---|---|---|
| T | T | `true_fn` |
| T | F | `false_fn` |
| F | T | `false_fn` |
| F | F | `false_fn` |

In other...
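To make that truth table concrete, here is a minimal sketch in plain Julia, with the "differentiating?" bit made explicit as an argument. In a real implementation that bit would come from the AD system (e.g. via an `rrule`), not from the caller; the name `branch` and its signature are purely illustrative, not an existing Flux or NNlib API.

```julia
# Illustrative model of the truth table: `true_fn` runs only when we
# are both differentiating AND `flag` is set; otherwise `false_fn`.
branch(differentiating::Bool, flag::Bool, true_fn, false_fn) =
    (differentiating && flag) ? true_fn() : false_fn()

branch(true,  true,  () -> :true_fn, () -> :false_fn)  # => :true_fn
branch(true,  false, () -> :true_fn, () -> :false_fn)  # => :false_fn
branch(false, true,  () -> :true_fn, () -> :false_fn)  # => :false_fn
branch(false, false, () -> :true_fn, () -> :false_fn)  # => :false_fn
```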
I'm not sure I understand how this tracing works then. To get a value for `active`, you ultimately need to call something which returns `true` when not differentiating and `false`...
> That's the point - `active` and differentiation are **independent**.
> _Usually_, people set `active = true` while training and differentiate code while training, but strictly speaking nobody forbids you...
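A minimal sketch of that independence, assuming a toy dropout-like layer (this is not Flux's actual `Dropout`; the type and field names are hypothetical). The layer consults only its own `active` flag, so its behaviour is the same whether or not the call happens to be differentiated.

```julia
# Toy layer: behaviour is controlled by an explicit `active` field,
# never by asking "am I being differentiated?".
mutable struct ToyDropout
    p::Float64
    active::Bool
end

function (d::ToyDropout)(x)
    d.active || return x               # active = false: identity
    mask = rand(length(x)) .>= d.p     # active = true: drop entries
    return x .* mask ./ (1 - d.p)
end

d = ToyDropout(0.5, false)
d(ones(4))  # identity here, whether or not we differentiate the call
```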
As a start, can we completely ignore the `*2onnx` functionality? Ideally there could also be some automation of the generation of the `onnx2*` functions as well. Instead of using Flux...
The lack of clarity on how this layer should behave suggests to me that we might not want it as a built-in. Now, the good news is that Flux doesn't...