Incorrect bar chart rendering with DataFrames

Open staticfloat opened this issue 5 years ago • 1 comments

Describe the bug

When I plot two different DataFrames into a bar chart as two separate traces, the resulting chart is not what I would expect: the bars are horizontal instead of vertical, and they do not show the proper numerical values.

Version info

julia> versioninfo()
Julia Version 1.2.0-rc1.0
Commit 7097799cf1 (2019-05-30 02:22 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i7-8559U CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
(test_data) pkg> status
    Status `~/src/SpeedCenter.jl/test_data/Project.toml`
  [a93c6f00] DataFrames v0.18.3
  [41391dba] InfluxDB v0.1.0 [`~/.julia/dev/InfluxDB`]
  [47be7bcc] ORCA v0.2.1
  [f0f68f2c] PlotlyJS v0.12.4

MWE:

using DataFrames, ORCA, PlotlyJS

df_slow = DataFrame(:conv2d => 2.5, :conv3d => 3.8, :gitsha => "1a2b3c4d")
df_fast = DataFrame(:conv2d => 0.2, :conv3d => 0.8, :gitsha => "5a6b7c8d")

slow_trace = bar(;x=[:conv2d, :conv3d], y=df_slow, name="v0.4.3")
fast_trace = bar(;x=[:conv2d, :conv3d], y=df_fast, name="master")

layout = Layout(;barmode="group")
p = Plot([slow_trace, fast_trace], layout)
PlotlyJS.savehtml(p, "comparison.html")
PlotlyJS.savefig(p, "comparison.png")

Output:

[Image: comparison]

I must be doing something silly, but I can't figure out what it is. :(

staticfloat, May 31 '19 09:05

Hey @staticfloat thanks for bringing this up.

When I run your code and inspect the JSON that is generated, we get:

julia> print(JSON.json(p, 2))
{
  "layout": {
    "barmode": "group",
    "margin": {
      "l": 50,
      "b": 50,
      "r": 50,
      "t": 60
    }
  },
  "data": [
    {
      "y": {
        "columns": [
          [
            2.5
          ],
          [
            3.8
          ],
          [
            "1a2b3c4d"
          ]
        ],
        "colindex": {
          "lookup": {
            "gitsha": 3,
            "conv3d": 2,
            "conv2d": 1
          },
          "names": [
            "conv2d",
            "conv3d",
            "gitsha"
          ]
        }
      },
      "type": "bar",
      "name": "v0.4.3",
      "x": [
        "conv2d",
        "conv3d"
      ]
    },
    {
      "y": {
        "columns": [
          [
            0.2
          ],
          [
            0.8
          ],
          [
            "5a6b7c8d"
          ]
        ],
        "colindex": {
          "lookup": {
            "gitsha": 3,
            "conv3d": 2,
            "conv2d": 1
          },
          "names": [
            "conv2d",
            "conv3d",
            "gitsha"
          ]
        }
      },
      "type": "bar",
      "name": "master",
      "x": [
        "conv2d",
        "conv3d"
      ]
    }
  ]
}

Notice that for y it is dumping a JSON version of the entire data frame, rather than an array of numbers 😆
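If you would rather keep the original two-trace approach, a minimal sketch is to pull the numbers out of each data frame before handing them to bar, so that y becomes a plain vector. This assumes each data frame has exactly one row, as in the MWE; row_values is a hypothetical helper written for this example, not part of PlotlyJS or DataFrames.

```julia
using DataFrames, PlotlyJS

df_slow = DataFrame(:conv2d => 2.5, :conv3d => 3.8, :gitsha => "1a2b3c4d")
df_fast = DataFrame(:conv2d => 0.2, :conv3d => 0.8, :gitsha => "5a6b7c8d")

cols = [:conv2d, :conv3d]

# Hypothetical helper: read the first row of the requested columns into a
# plain vector (assumes a one-row data frame, as in the MWE).
row_values(df, cols) = [df[1, c] for c in cols]

slow_y = row_values(df_slow, cols)
fast_y = row_values(df_fast, cols)

# y is now a vector of numbers, so it serializes as a JSON array.
slow_trace = bar(; x=cols, y=slow_y, name="v0.4.3")
fast_trace = bar(; x=cols, y=fast_y, name="master")

p = Plot([slow_trace, fast_trace], Layout(; barmode="group"))
```

The :gitsha column is deliberately left out of cols, since it holds a string rather than a bar height.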

Is the chart below what you are after?

[Image]

Here's what I did to make that:

julia> df = stack(vcat(df_slow, df_fast), [:conv2d, :conv3d])
4×3 DataFrame
│ Row │ variable │ value   │ gitsha   │
│     │ Symbol   │ Float64 │ String   │
├─────┼──────────┼─────────┼──────────┤
│ 1   │ conv2d   │ 2.5     │ 1a2b3c4d │
│ 2   │ conv2d   │ 0.2     │ 5a6b7c8d │
│ 3   │ conv3d   │ 3.8     │ 1a2b3c4d │
│ 4   │ conv3d   │ 0.8     │ 5a6b7c8d │

julia> p = Plot(df, Layout(barmode="group"); x=:variable, y=:value, kind=:bar, group=:gitsha)

Notes:

  • PlotlyJS knows how to plot long-form data frames passed as the first argument
  • You specify a column using a symbol with the column name (e.g. x=:variable and y=:value)
  • The group keyword takes a column name and splits the data frame by that group, making one trace per group.
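To make the group note concrete, here is a rough hand-written equivalent that builds one bar trace per gitsha value from the long-form data frame. It is only a sketch of the idea, not how PlotlyJS implements group internally; the names traces and p_manual are made up for this example.

```julia
using DataFrames, PlotlyJS

df_slow = DataFrame(:conv2d => 2.5, :conv3d => 3.8, :gitsha => "1a2b3c4d")
df_fast = DataFrame(:conv2d => 0.2, :conv3d => 0.8, :gitsha => "5a6b7c8d")

# Long form: one row per (measurement, gitsha) pair.
df = stack(vcat(df_slow, df_fast), [:conv2d, :conv3d])

# One trace per group, with plain vectors for x and y.
traces = [
    bar(; x=string.(sub.variable), y=collect(sub.value), name=first(sub.gitsha))
    for sub in groupby(df, :gitsha)
]

p_manual = Plot(traces, Layout(; barmode="group"))
```

Each sub-frame contributes its own x and y vectors, which is why the grouped plot ends up with two traces here, one per gitsha.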

Does that help?

See docs for more info

sglyon avatar May 31 '19 13:05 sglyon