Adding a new transform ppmt
Converts the distribution of a table to be more gaussian.
Hi @DianaPat, my name is Elias. I'm one of the manteiners of the TableTranforms.jl package. I just left my comments on your PR. I hope it helps!
Hi @DianaPat, my name is Elias. I'm one of the manteiners of the TableTranforms.jl package. I just left my comments on your PR. I hope it helps!
Hi Elias, I just finished correcting the details you commented. Tell me if you see something lacking!
That is true, you are using the same projection matrix in the structural removal in the lopp and in the stopping criteria with the ideal Gaussian case. Notice that it is not required, but it is a good idea anyways to avoid recomputing scores again in the loop.
I was trying to run the test, but realized that the ProjectionPursuit() can't be properly tested as I pulled out the Quantile() and EigenAnalysis() (so the covariance matrix is not approx. I). I was wondering if I should test it with images (as EigenAnalysis) or if it was unnecessary as the most used fuction would be PPMT().
Given that the paper by Friedman defines the transform with these preprocessing steps, maybe we should reconsider adding them back to the apply function of ProjectionPursuit, and then eliminate the PPMT. I thought initially that it would be useful to split the two, but it is causing more challenges than helping us.
Can you concentrate on the ProjectionPursuit transform and make sure that all tests pass?
I already ran both ProjectPursuit() and PPMT() and they work properly, and PPMT() passes the tests. So I want to confirm what to do with the ProjectPursuit() testing. Meanwhile, I'll commit my changes.
Codecov Report
Merging #105 (2204a25) into master (8716cf6) will decrease coverage by
4.52%. The diff coverage is54.45%.
@@ Coverage Diff @@
## master #105 +/- ##
==========================================
- Coverage 94.70% 90.17% -4.53%
==========================================
Files 27 27
Lines 831 896 +65
==========================================
+ Hits 787 808 +21
- Misses 44 88 +44
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/transforms.jl | 100.00% <ø> (ø) |
|
| src/transforms/projectionpursuit.jl | 54.45% <54.45%> (ø) |
|
| src/transforms/parallel.jl | 95.06% <0.00%> (ø) |
|
| src/transforms/sequential.jl |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
@DianaPat try to update your local version to the version in this PR before you continue submitting changes. Or edit it directly here on GitHub by pressing "."
It is already nicer than the previous version. We don't need the isempty check though right?
The shorter the names of these variables in these small self-contained loops the better because we spend less time reading the names of the variables and more time understanding the big picture of the loop 👍🏻
Em sex., 23 de set. de 2022 08:53, DianaPat @.***> escreveu:
@.**** commented on this pull request.
In src/transforms/projectionpursuit.jl https://github.com/JuliaML/TableTransforms.jl/pull/105#discussion_r978554859 :
@warn "Maximum number of iterations ($count) reached."
breakend
push!(cache,(Q,colcache))
end
𝒯 = (; zip(names, eachcol(Z))...)
newtable = 𝒯 |> Tables.materializer(table)
newtable, (cache,tcache)
+end
+function revertfeat(::ProjectionPursuit, newtable, fcache)
names = Tables.columnnames(newtable)
ppcache, tcache = fcache
coltable = newtable
What about this notation? [image: image] https://user-images.githubusercontent.com/105749646/191954467-65739ab2-9962-49ed-9b62-63dd7dc2d6c2.png I tried a more functional notation
— Reply to this email directly, view it on GitHub https://github.com/JuliaML/TableTransforms.jl/pull/105#discussion_r978554859, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAZQW3OG26XGBAH6GEI5YBLV7WK4RANCNFSM56BGYZPQ . You are receiving this because you commented.Message ID: @.***>
@DianaPat tests are failing, can you please double check?
I have been trying to solve this, but the problem is that in windows the test passes. I'll try in a linux!
I tried and the test fails in the windows, and even the eigenanalysis plot test fails so I think its version of ubuntu is too old to try and solve the problem in it. I'll use wsl.
@DianaPat I think the other issue in the stop criteria is more serious. Can you try to fix it first? The test image can be easily generated after the algorithm is working.
I already solved it.
@DianaPat I have left a few more suggestions. We are almost there.
@DianaPat let's see if we can cleanup the code a bit more by introducing a helper function that creates the basis vectors:
# j-th basis vector in R^d
basis(d, j) = 1:d .== j
This function can be used everywhere to replace the creation of the matrix E in the code.
@DianaPat we will need to rebase the PR again onto the latest master branch to introduce the new StatelessFeatureTransform type introduced this morning.
Closing in favor of #141