framework-reproducibility
                        Changing class name/structure changes program functionality (using TensorFlow)
For example:
In my project (https://github.com/edwardyehuang/CAR/blob/master/carnet.py), line 163 (instantiate SegManaged).
If I create a wrapper class that inherits from SegManaged and use it instead at line 163, e.g.
class gbb(SegManaged):
    pass
The performance (e.g. loss) will differ from the original. Moreover, I found that it depends on the first letter of the wrapper class's name: if it starts with a-g (e.g. cbb, cbx, cxxx), the performance differs, but it stays the same if it starts with h-z. Note that upper/lowercase has no effect.
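For what it's worth, this symptom is consistent with some name-dependent ordering somewhere in the pipeline: if any component iterates layers or variables in an order derived from class names, renaming a class can reorder a floating-point reduction and change the loss. This is only a hypothesis, illustrated with a minimal, framework-free sketch (the function, names, and numbers are invented for illustration):

```python
def aggregate(values_by_name):
    """Sum contributions in sorted-name order, as a framework might
    when it traverses layers/variables by name."""
    total = 0.0
    for name in sorted(values_by_name):
        total += values_by_name[name]
    return total

# Identical contributions; only the wrapper class's name changes.
# "gbb" sorts before "head", "xbb" sorts after it, so the
# floating-point summation order (and hence the result) differs.
a = aggregate({"gbb": -1e100, "head": 1e100, "mid": 1.0})
b = aggregate({"xbb": -1e100, "head": 1e100, "mid": 1.0})
print(a, b)  # -> 1.0 0.0
```

In the first call the two huge terms cancel before the 1.0 is added; in the second, the 1.0 is absorbed into 1e100 before cancellation, giving 0.0. A split around a particular letter (like the a-g vs. h-z boundary reported above) is exactly what a name-sorted traversal would produce.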
Hi Edward, I've done a quick triage on this issue:
- This issue is related to TensorFlow (not PyTorch or another framework).
- This issue is not about run-to-run reproducibility. The reported issue is that changing program code changes the functionality of the program.
- The source of non-reproducibility has not been isolated. It could be in TensorFlow, a bug in Python itself, or somewhere else.
- There is no minimal/simple reproducer program available. To reproduce, it's necessary to follow the relatively complex and time-consuming installation and configuration instructions here.
- I don't know when or if I will get around to reproducing the issue and isolating the source. The debug tool I currently have can only find differences between runs of the same program, not differences between runs of two different programs, although I think it would not be too difficult to make that work.
- If you are able to create a simple and self-contained reproducer program (e.g. a small, single-file program that runs in a Colab and uses synthetic data generated in the program), that would help accelerate a resolution.
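The diffing workflow mentioned above can be approximated generically even without that tool: dump a scalar trace (e.g. per-step loss) from each program variant and find the first step at which they diverge. A minimal sketch, with a hypothetical helper name (not the actual debug tool):

```python
import math

def first_divergence(run_a, run_b, rel_tol=1e-9, abs_tol=0.0):
    """Return the index of the first step where two runs' scalar
    traces (e.g. per-step losses) differ, or None if they match."""
    for i, (a, b) in enumerate(zip(run_a, run_b)):
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            return i
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))  # one trace is longer
    return None

# Traces agree for two steps, then diverge at step 2.
print(first_divergence([0.9, 0.7, 0.5], [0.9, 0.7, 0.6]))  # -> 2
```

Knowing whether the two variants diverge at step 0 (different initialization order) or only later (different op scheduling or reduction order) would narrow down where the name dependence enters.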
 
One question: I'm assuming this issue shows up regardless of the accelerator type you're running on (i.e. on both CPU and GPU). Is that correct?
I will provide a minimal reproducer (in Colab) next week.