datatable
                                
                        Loss of column Name
- 
Did you find a bug in datatable, or maybe the bug found you? Loss of column names during some operations. What determines how a column name is changed? What operations will cause loss of column names? 
- 
How to reproduce the bug? 
# sample data
from datatable import dt, f

data = {"id": [1, 1, 1, 1, 2, 2, 1, 2, 1],
        "code": range(10, 1, -1),
        "valA": range(1, 10),
        "valB": range(10, 19)}
DT = dt.Frame(data)
 id code valA valB
0 1 10 1 10
1 1 9 2 11
2 1 8 3 12
3 1 7 4 13
4 2 6 5 14
5 2 5 6 15
6 1 4 7 16
7 2 3 8 17
8 1 2 9 18
# apply an operation
DT[:, -1 * f[:]]
   C0 C1 C2 C3
0 −1 −10 −1 −10
1 −1 −9 −2 −11
2 −1 −8 −3 −12
3 −1 −7 −4 −13
4 −2 −6 −5 −14
5 −2 −5 −6 −15
6 −1 −4 −7 −16
7 −2 −3 −8 −17
8 −1 −2 −9 −18
9 rows × 4 columns
- 
What was the expected behavior? You should not lose column names, even if it is applied to the whole dataframe. Some clarity on why column names are lost will be helpful, and what conditions cause loss of column names. 
- 
Your environment? Python version :'3.8.5 | packaged by conda-forge | (default, Sep 24 2020, 16:55:52) \n[GCC 7.5.0]'
The logic here is that any unary function/operator retains the name of the column. Thus, sum(f.X) or -f.X will produce a column named "X". On the other hand, binary functions/operators do not carry over any column name. For example, atan2(f.X, f.Y) or -1 * f.X will produce an unnamed column.
Could you explain a bit more @st-pasha ? If I multiply a column by a number, I changed the contents of that column; it doesn't mean I should lose the name. What's the reasoning behind unary vs binary?
Unary function operates on a single column, so it carries through the name of that column. For example cos(f.A) produces column "A" because the argument of function cos() is a column named "A".
A binary function, on the other hand, takes 2 columns as arguments. For example, f.X * f.Y. Since both of those columns can potentially have names, it is unclear what the name of the result should be. It can't be "X" or "Y" because that would be unfair to the other column. It can't be "X * Y", because applying this rule universally quickly produces bad results, like (f.X + f.Y)*f.Z -> "X+Y*Z". And that's not even taking into account columns with long complicated names.
Thus, the only choice is for f.X * f.Y to be unnamed. We could add a special check so that if one of the operands is unnamed, the result bears the name of the other operand, but then f.X * f.Y * f.Z would be named "Z" (because the intermediate f.X * f.Y is unnamed), which is very dubious.
I guess we could make a rule that if one of the arguments to a binary function is a scalar then the result is the name of the other column. This would mean that 2 * (f.X - 1) is still called "X".
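To make the two rules concrete, here is a toy model in plain Python. It is only a sketch of the naming logic discussed above, not datatable's implementation: the `Col` class and the functions `unary`, `binary_current`, and `binary_scalar_rule` are hypothetical names invented for illustration, and a name of `None` stands for an unnamed column.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Col:
    """Toy stand-in for a column expression; only the name is tracked."""
    name: Optional[str]  # None means "unnamed"


Operand = Union[Col, int, float]


def unary(arg: Col) -> Col:
    # One input column, so its name carries through unchanged.
    return Col(arg.name)


def binary_current(lhs: Operand, rhs: Operand) -> Col:
    # Rule described in the thread: a binary operation never keeps a name.
    return Col(None)


def binary_scalar_rule(lhs: Operand, rhs: Operand) -> Col:
    # Proposed rule: if exactly one operand is a scalar,
    # the result reuses the name of the column operand.
    lhs_is_col = isinstance(lhs, Col)
    rhs_is_col = isinstance(rhs, Col)
    if lhs_is_col and not rhs_is_col:
        return Col(lhs.name)
    if rhs_is_col and not lhs_is_col:
        return Col(rhs.name)
    return Col(None)  # two columns (or two scalars): no obvious name


X, Y = Col("X"), Col("Y")
print(unary(X).name)                                         # like -f.X
print(binary_current(-1, X).name)                            # like -1 * f.X today
print(binary_scalar_rule(2, binary_scalar_rule(X, 1)).name)  # like 2 * (f.X - 1)
print(binary_scalar_rule(X, Y).name)                         # like f.X * f.Y
```

Under the proposed rule, a chain of scalar operations on one column keeps that column's name, while any operation combining two columns still comes out unnamed.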
@st-pasha much clearer now.
> I guess we could make a rule that if one of the arguments to a binary function is a scalar then the result is the name of the other column. This would mean that 2 * (f.X - 1) is still called "X".
I think this is a good idea and worth implementing.
Also, I feel these column name changes for operations should be documented somewhere (not sure of the exact location), so users are aware. Although, on second thought, it might just be me, and not really an issue for the library's user base.
Yeah, it should be documented. But where?
I think it should be included in the transformation documentation, which is no. 5 on #2604. Open to suggestions.
I guess this issue must be relabeled to documentation or something. After following the discussion, I understand that it's a feature not a bug :)
Yes @pradkrish ; still thinking of which part of the documentation to mention this.
closing this; the alias function can help with renaming