logstash-filter-geoip

Output of geoip filter changed in logstash 5.4.2 (no more GeoJSON)

Open matejzero opened this issue 8 years ago • 12 comments

Hello,

I'm testing Logstash 5.4.2 in my staging environment and I've noticed that the output of the geoip plugin has changed. Instead of the GeoJSON-style array I got in 5.4.1:

    "location": [
      -122.3042,
      47.913
    ],

I now get a hash of lat / lon.

    "location": {
      "lat": 47.913,
      "lon": -122.3042
    },

My geoip configuration looks like this:

    geoip {
      source => [ "src_ip" ]
      fields => [ "country_code2", "country_name", "latitude", "longitude", "location" ]
    }

Looking at the latest commits, I saw that this changed when some of the code was rewritten in Java. In the old Ruby plugin, location was saved as an array, but in the new Java code it's saved as a hash.

Java code: https://github.com/logstash-plugins/logstash-filter-geoip/commit/ec18789d42302eebb3b775ad5c49fb9aba394d40#diff-0db10ce49673bb86356e5398c93ad8b3R257

Ruby code: https://github.com/logstash-plugins/logstash-filter-geoip/commit/ec18789d42302eebb3b775ad5c49fb9aba394d40#diff-0ceaeedc6497c6d61fa5ae1d2db1dc58L212

Is this expected? Will ES still take a hash as a geo_point?

For all general issues, please provide the following details for fast resolution:

  • Version: 5.4.1 & 5.4.2
  • Operating System: Linux
  • Steps to Reproduce: Just use geoip filter and look at location field.

matejzero avatar Jun 30 '17 09:06 matejzero

I looked at the ES documentation and saw that the geo_point type accepts a hash as well: https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html
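
For anyone who reads these documents back out of ES and ends up with both shapes in the same index, here is a minimal sketch of normalising them on the client side. It assumes the array form is in [lon, lat] order (as in the 5.4.1 output above) and the hash form uses the lat / lon keys (as in the 5.4.2 output). The function name normalize_location is made up for this example and is not part of Logstash or any ES client.

```python
from typing import Dict, Sequence, Union

# Either shape the geoip filter has produced for "location" (see above).
GeoValue = Union[Sequence[float], Dict[str, float]]


def normalize_location(value: GeoValue) -> Dict[str, float]:
    """Return {"lat": ..., "lon": ...} for either location shape.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, dict):
        # 5.4.2 style: {"lat": 47.913, "lon": -122.3042}
        return {"lat": float(value["lat"]), "lon": float(value["lon"])}
    if isinstance(value, (list, tuple)) and len(value) == 2:
        # 5.4.1 style: [lon, lat], i.e. longitude comes first
        lon, lat = value
        return {"lat": float(lat), "lon": float(lon)}
    raise ValueError(f"unrecognised location value: {value!r}")


if __name__ == "__main__":
    print(normalize_location([-122.3042, 47.913]))
    print(normalize_location({"lat": 47.913, "lon": -122.3042}))
```

Both calls at the bottom should yield the same dictionary, which makes it easy to compare old and new documents without caring which Logstash version wrote them.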

matejzero avatar Jun 30 '17 10:06 matejzero

Hi @matejzero. Thanks for posting, I'm in the same situation (see above).

I understand that ES works on both types anyway, but I don't think this issue should be closed so easily. As a result of the update, we've now got an index with two mixed mappings. IMHO, there should have been at least a warning about the type change.

maurom avatar Jun 30 '17 14:06 maurom

Do you have a static field type defined for the 'location' field, or do you leave it to ES to guess?

All my indexes have custom templates where I set the 'location' field as geo_point, so this change probably won't hit me the same way it hit you. Nonetheless, this seems like a breaking change according to your post, so input from the developers would be welcome.

matejzero avatar Jun 30 '17 14:06 matejzero

Agreed this is a breaking change. /cc @suyograo

jordansissel avatar Jun 30 '17 16:06 jordansissel

I'm currently using the default template provided by logstash ("version" : 50001). (Edited, no pun intended)

maurom avatar Jun 30 '17 17:06 maurom

Interesting... location field is set to geo_point in that template.

    "geoip" : {
      "dynamic": true,
      "properties" : {
        "ip": { "type": "ip" },
        "location" : { "type" : "geo_point" },
        "latitude" : { "type" : "half_float" },
        "longitude" : { "type" : "half_float" }
      }
    }

How is the location field mapped now that you've upgraded? If you look at the latest index, what type is the location field? As far as I know, it should be geo_point if you are using the default template.

I might have to downgrade my logstash instances as well, since my indexes rotate tomorrow.

matejzero avatar Jun 30 '17 17:06 matejzero

I just tried this on my cluster and even with the latest logstash, location field still gets mapped as geo_point and I don't get mixed templates.

Are you sure your indexes use latest logstash template?

Can you paste your logstash template from elasticsearch: curl -XGET es_host:9200/_template/logstash

You might have an old template there that doesn't set geoip fields.

matejzero avatar Jun 30 '17 20:06 matejzero

I think so. I've attached the output of curl -XGET es_host:9200/_template/logstash?pretty

issue-123-logstash-template.txt

maurom avatar Jul 03 '17 11:07 maurom

Weird, this template should not produce mixed mappings for location field, since mapping is set in template.

Can you look at your mappings in indexes and check what type is location?
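
If it helps, here is a rough sketch of how the check could be scripted against the JSON returned by the index's _mapping endpoint. It assumes the response is keyed by index name with a "mappings" object underneath, and that on 5.x the properties sit one level deeper under a document type name. The helper name field_type and the type name "logs" in the sample are placeholders, not something taken from your cluster.

```python
from typing import Any, Dict, Optional


def field_type(mapping_response: Dict[str, Any], index: str, path: str) -> Optional[str]:
    """Return the mapped type of a dotted field path such as "geoip.location".

    Hypothetical helper: returns None if the field is not mapped.
    """
    mappings = mapping_response.get(index, {}).get("mappings", {})
    # Assumption: 5.x nests "properties" under a document type name,
    # while newer responses put "properties" directly under "mappings".
    candidates = [mappings] if "properties" in mappings else list(mappings.values())
    for node in candidates:
        current = node
        for part in path.split("."):
            if not isinstance(current, dict):
                current = None
            else:
                current = current.get("properties", {}).get(part)
            if current is None:
                break
        else:
            # Every path segment was found under this candidate.
            return current.get("type") if isinstance(current, dict) else None
    return None


if __name__ == "__main__":
    # Trimmed sample response; "logs" is a placeholder type name.
    sample = {
        "logstash-2017.06.21": {
            "mappings": {
                "logs": {
                    "properties": {
                        "geoip": {
                            "dynamic": True,
                            "properties": {"location": {"type": "geo_point"}},
                        }
                    }
                }
            }
        }
    }
    print(field_type(sample, "logstash-2017.06.21", "geoip.location"))
```

If this prints geo_point for the index created on the upgrade day, the mapping itself is fine and the difference is only in how the values are stored.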

matejzero avatar Jul 03 '17 12:07 matejzero

Sure, here it is: issue-123-logstash-2017.06.21-mapping.txt

The offending index here is logstash-2017.06.21, the one from the day I upgraded to 5.4.2. geoip.location is set to geo_point. Now that I look at it, the mapping seems to be fine, but the values are stored in different formats (as discussed on the other ticket).

By the way, thanks for your help. I'm new to ES and still a bit lost, but I'm slowly starting to grasp it.

maurom avatar Jul 03 '17 12:07 maurom

I went and read your issue again and now I get it. Your field type stayed the same, but values saved are different (hash instead of array). That's the reason reindexing fails.

I replied to your ticket. I think it's best to continue the debate there, since it's a different problem than the one I have.

matejzero avatar Jul 03 '17 17:07 matejzero

The following config works with Logstash 5.4.1 but not with 5.4.2 (which I think is geoip 4.1.1)

filter {
    if [Source][IP] {
        geoip {
            source => "[Source][IP]"
            target => "Source"
        }
    }
    if [Destination][IP] {
        geoip {
            source => "[Destination][IP]"
            target => "Destination"
        }
    }
}

I get the following error with Logstash 5.4.2. Note that there is no elasticsearch output going on, so I know that it's not a template issue.

[
{"exception"=>"Missing Ruby class handling for full class name=java.util.HashMap, simple name=HashMap",
"backtrace"=>["org.logstash.Javafier.deep(org/logstash/Javafier.java:29)",
"org.logstash.Event.getField(org/logstash/Event.java:151)",
"org.logstash.filters.GeoIPFilter.applyGeoData(org/logstash/filters/GeoIPFilter.java:151)",
"org.logstash.filters.GeoIPFilter.handleEvent(org/logstash/filters/GeoIPFilter.java:143)",
"java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:498)", 
"RUBY.filter(/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-geoip-4.1.1-java/lib/logstash/filters/geoip.rb:122)",
"LogStash::Filters::Base.do_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145)",
"LogStash::Filters::Base.do_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145)", 
"LogStash::Filters::Base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164)", 
"LogStash::Filters::Base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164)", 
"org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)",
"LogStash::Filters::Base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161)",
"LogStash::Filters::Base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161)", 
"LogStash::FilterDelegator.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:43)", 
"LogStash::FilterDelegator.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:43)",
"RUBY.initialize((eval):26185)",
"org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)", 
"RUBY.initialize((eval):26181)",
"org.jruby.RubyProc.call(org/jruby/RubyProc.java:281)", 
"RUBY.filter_func((eval):7655)", 
"LogStash::Pipeline.filter_batch(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:370)",
"LogStash::Pipeline.filter_batch(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:370)", 
"org.jruby.RubyProc.call(org/jruby/RubyProc.java:281)", 
"LogStash::Util::WrappedSynchronousQueue::ReadBatch.each(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:224)", 
"LogStash::Util::WrappedSynchronousQueue::ReadBatch.each(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:224)",
"org.jruby.RubyHash.each(org/jruby/RubyHash.java:1342)", 
"LogStash::Util::WrappedSynchronousQueue::ReadBatch.each(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:223)", 
"LogStash::Util::WrappedSynchronousQueue::ReadBatch.each(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:223)", 
"LogStash::Pipeline.filter_batch(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:369)", 
"LogStash::Pipeline.filter_batch(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:369)", 
"RUBY.worker_loop(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:350)", 
"RUBY.start_workers(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:317)",
"java.lang.Thread.run(java/lang/Thread.java:748)"]}
]

I can only suspect that it may be due to https://github.com/logstash-plugins/logstash-filter-geoip/commit/45ac7316632282ab26e3a93b584e1e011be7cba8#commitcomment-22920532

zaakiy avatar Jul 05 '17 05:07 zaakiy