logstash-filter-csv

Convert doesn't seem to work as well as mutate's convert

Open markwalkom opened this issue 9 years ago • 1 comments

I was playing around with some Twitter analytics data and found that the convert option in this filter doesn't seem to do what is expected.

Data sample, altered;

"698020519266775043","https://twitter.com/BLAH/status/698020519266775043","LOUD NOISES","2016-02-12 05:47 +0000","372.0","8.0","0.021505376344086023","0.0","0.0","0.0","1.0","3.0","0.0","4.0","0.0","0","0","0","0","0","0","0","-","-","-","-","-","-","-","-","-","-","-","-","-","-","-","-","-","-"

Config, made super basic (sort of);

input {
  stdin {}
}
filter {
  csv {
    columns => [ "Tweet id","Tweet permalink","Tweet text","time","impressions","engagements","engagement rate","retweets","replies","likes","user profile clicks","url clicks","hashtag clicks","detail expands","permalink clicks","app opens","app installs","follows","email tweet","dial phone","media views","media engagements","promoted impressions","promoted engagements","promoted engagement rate","promoted retweets","promoted replies","promoted likes","promoted user profile clicks","promoted url clicks","promoted hashtag clicks","promoted detail expands","promoted permalink clicks","promoted app opens","promoted app installs","promoted follows","promoted email tweet","promoted dial phone","promoted media views","promoted media engagements" ]
    convert => {
      "impressions" => float
      "engagements" => float
      "engagement rate" => float
      "retweets" => integer
      "replies" => integer
      "likes" => integer
      "user profile clicks" => integer
      "url clicks" => integer
      "hashtag clicks" => integer
      "detail expands" => integer
      "permalink clicks" => integer
      "app opens" => integer
      "app installs" => integer
      "follows" => integer
      "email tweet" => integer
      "dial phone" => integer
      "media views" => integer
      "media engagements" => integer
    }
  }
  date {
    match => [ "time", "yyyy-MM-dd HH:mm Z" ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

Output;

{
                         "message" => "\"698020519266775043\",\"https://twitter.com/BLAH/status/698020519266775043\",\"LOUD NOISES\",\"2016-02-12 05:47 +0000\",\"372.0\",\"8.0\",\"0.021505376344086023\",\"0.0\",\"0.0\",\"0.0\",\"1.0\",\"3.0\",\"0.0\",\"4.0\",\"0.0\",\"0\",\"0\",\"0\",\"0\",\"0\",\"0\",\"0\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\"",
                        "@version" => "1",
                      "@timestamp" => "2016-02-12T05:47:00.000Z",
                            "host" => "bender.local",
                        "Tweet id" => "698020519266775043",
                 "Tweet permalink" => "https://twitter.com/BLAH/status/698020519266775043",
                      "Tweet text" => "LOUD NOISES",
                            "time" => "2016-02-12 05:47 +0000",
                     "impressions" => 372.0,
                     "engagements" => 8.0,
                 "engagement rate" => 0.021505376344086023,
                        "retweets" => "0.0",
                         "replies" => "0.0",
                           "likes" => "0.0",
             "user profile clicks" => "1.0",
                      "url clicks" => "3.0",
                  "hashtag clicks" => "0.0",
                  "detail expands" => "4.0",
                "permalink clicks" => "0.0",
                       "app opens" => 0,
                    "app installs" => 0,
                         "follows" => 0,
                     "email tweet" => 0,
                      "dial phone" => 0,
                     "media views" => 0,
               "media engagements" => 0,
            "promoted impressions" => "-",
            "promoted engagements" => "-",
        "promoted engagement rate" => "-",
               "promoted retweets" => "-",
                "promoted replies" => "-",
                  "promoted likes" => "-",
    "promoted user profile clicks" => "-",
             "promoted url clicks" => "-",
         "promoted hashtag clicks" => "-",
         "promoted detail expands" => "-",
       "promoted permalink clicks" => "-",
              "promoted app opens" => "-",
           "promoted app installs" => "-",
                "promoted follows" => "-",
            "promoted email tweet" => "-",
             "promoted dial phone" => "-",
            "promoted media views" => "-",
      "promoted media engagements" => "-"
}

Note these three;

"retweets" => "0.0",
"replies" => "0.0",
"likes" => "0.0",

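They are still strings, not integers. The same goes for the other integer fields whose input value has a decimal point ("user profile clicks", "url clicks", "hashtag clicks", "detail expands", "permalink clicks"), while the fields whose input is a plain "0" are converted fine.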

If I do the following;

input {
  stdin {}
}
filter {
  csv {
    columns => [ "Tweet id","Tweet permalink","Tweet text","time","impressions","engagements","engagement rate","retweets","replies","likes","user profile clicks","url clicks","hashtag clicks","detail expands","permalink clicks","app opens","app installs","follows","email tweet","dial phone","media views","media engagements","promoted impressions","promoted engagements","promoted engagement rate","promoted retweets","promoted replies","promoted likes","promoted user profile clicks","promoted url clicks","promoted hashtag clicks","promoted detail expands","promoted permalink clicks","promoted app opens","promoted app installs","promoted follows","promoted email tweet","promoted dial phone","promoted media views","promoted media engagements" ]
  }
  date {
    match => [ "time", "yyyy-MM-dd HH:mm Z" ]
  }
  mutate {
     convert => {
      "impressions" => float
      "engagements" => float
      "engagement rate" => float
      "retweets" => integer
      "replies" => integer
      "likes" => integer
      "user profile clicks" => integer
      "url clicks" => integer
      "hashtag clicks" => integer
      "detail expands" => integer
      "permalink clicks" => integer
      "app opens" => integer
      "app installs" => integer
      "follows" => integer
      "email tweet" => integer
      "dial phone" => integer
      "media views" => integer
      "media engagements" => integer
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

We see;

{
                         "message" => "\"698020519266775043\",\"https://twitter.com/BLAH/status/698020519266775043\",\"LOUD NOISES\",\"2016-02-12 05:47 +0000\",\"372.0\",\"8.0\",\"0.021505376344086023\",\"0.0\",\"0.0\",\"0.0\",\"1.0\",\"3.0\",\"0.0\",\"4.0\",\"0.0\",\"0\",\"0\",\"0\",\"0\",\"0\",\"0\",\"0\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\"",
                        "@version" => "1",
                      "@timestamp" => "2016-02-12T05:47:00.000Z",
                            "host" => "bender.local",
                        "Tweet id" => "698020519266775043",
                 "Tweet permalink" => "https://twitter.com/BLAH/status/698020519266775043",
                      "Tweet text" => "LOUD NOISES",
                            "time" => "2016-02-12 05:47 +0000",
                     "impressions" => 372.0,
                     "engagements" => 8.0,
                 "engagement rate" => 0.021505376344086023,
                        "retweets" => 0,
                         "replies" => 0,
                           "likes" => 0,
             "user profile clicks" => 1,
                      "url clicks" => 3,
                  "hashtag clicks" => 0,
                  "detail expands" => 4,
                "permalink clicks" => 0,
                       "app opens" => 0,
                    "app installs" => 0,
                         "follows" => 0,
                     "email tweet" => 0,
                      "dial phone" => 0,
                     "media views" => 0,
               "media engagements" => 0,
            "promoted impressions" => "-",
            "promoted engagements" => "-",
        "promoted engagement rate" => "-",
               "promoted retweets" => "-",
                "promoted replies" => "-",
                  "promoted likes" => "-",
    "promoted user profile clicks" => "-",
             "promoted url clicks" => "-",
         "promoted hashtag clicks" => "-",
         "promoted detail expands" => "-",
       "promoted permalink clicks" => "-",
              "promoted app opens" => "-",
           "promoted app installs" => "-",
                "promoted follows" => "-",
            "promoted email tweet" => "-",
             "promoted dial phone" => "-",
            "promoted media views" => "-",
      "promoted media engagements" => "-"
}

Which returns;

"retweets" => 0,
"replies" => 0,
"likes" => 0,
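
Comparing the two outputs, the csv filter only leaves a field as a string when its value is written with a decimal point ("0.0", "4.0"); plain "0" values convert in both cases. That pattern is what you would see if one code path parsed integers strictly and the other leniently. The plain-Ruby sketch below only illustrates that difference. The helper names `strict_integer` and `lenient_integer` are hypothetical, and whether either plugin actually converts this way is an assumption, not something confirmed here.

```ruby
# Hypothetical helpers, NOT the plugins' actual code: they only contrast
# strict and lenient string-to-integer conversion in plain Ruby.

# Strict: Kernel#Integer raises on "0.0", so fall back to the original string.
def strict_integer(value)
  Integer(value)
rescue ArgumentError, TypeError
  value
end

# Lenient: String#to_i reads the leading digits and stops at the first
# character that cannot be part of an integer, so "4.0" becomes 4.
def lenient_integer(value)
  value.to_i
end

samples = ["0", "0.0", "4.0", "-"]
strict  = samples.map { |s| strict_integer(s) }
lenient = samples.map { |s| lenient_integer(s) }

p strict   # prints [0, "0.0", "4.0", "-"]
p lenient  # prints [0, 0, 4, 0]
```

Note that the lenient path also turns "-" into 0, so it would hide the "no data" marker if it were applied to the promoted fields.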

markwalkom, Jul 21 '16 10:07

As a side note: the documentation uses commas as delimiters between the convert entries, whereas only newlines work. Using commas gives a configuration error.

Rishi321, Sep 25 '16 07:09