Could Eland help facilitate conversion between popular Python data visualization libraries and Vega (Kibana)?

Open jt0dd opened this issue 3 years ago • 2 comments

Below is a question I've asked elsewhere (I don't have an answer yet). Could Eland be a candidate for making this easier? Eland focuses specifically on Pandas, but I don't see why it couldn't also be a good project for facilitating Python <-> ELK integration more generally (for example, conversion between libraries such as Matplotlib and Vega).


So we want to leverage the data-science and visualization strengths of Python on our ELK data and then use the Elastic API to send the resulting visualization to update a dashboard.

Plenty of resources discuss this, but only one provided a working example, and it relied on a now-deprecated security bypass, which is more of a hack. Specifically, it sets the index name to .kibana, which now results in the error:

    AuthorizationException: AuthorizationException(403, 'security_exception', 'action [indices:data/write/bulk[s]] is unauthorized for user [elastic] with roles [superuser] on indices [.kibana_8.1.2_001,.kibana], this action is granted by the index privileges [create_doc,create,delete,index,write,all]')

We think this must be possible through normal API usage without disabling any security settings. We tried creating a user with every available permission, and it still could not perform this action in our testing.

Here's the example referred to as well as the overarching project which inspired us to try this.

jupyter-kibana-workflow

Note that Vega is now a default feature of Kibana rather than a plugin, so this workflow should be even more viable now.

So our code goes like this:

    import datetime
    import json

    import altair as alt
    import eland as ed
    import numpy as np
    import matplotlib.pyplot as plt
    import vega_datasets
    from elasticsearch import Elasticsearch
    
    cloud_id = "secret"
    http_auth = ("username", "password")
    es = Elasticsearch(cloud_id=cloud_id, http_auth=http_auth)
    
    data = vega_datasets.data
    pd_df = data.cars()
    chart = alt.Chart(pd_df).mark_point().encode(
        x='Miles_per_Gallon',
        y='Horsepower'
    ).interactive()
    
    def saveVegaVis(client, index, visName, altairChart, resultSize=100, timeField=True):
        chart_json = json.loads(altairChart.to_json())
        visState = {
          "type": "vega",
          "aggs": [],
          "params": {
            "spec": json.dumps(chart_json, sort_keys=True, indent=4, separators=(',', ': ')),
          },
          "title": visName
        }
        visSavedObject={
            "visualization" : {
              "title" : visName,
              "visState" : json.dumps(visState, sort_keys=True, indent=4, separators=(',', ': ')),
              "uiStateJSON" : "{}",
              "description" : "",
              "version" : 1,
              "kibanaSavedObjectMeta" : {
                "searchSourceJSON" : json.dumps({
                  "query": {
                    "language": "kuery",
                    "query": ""
                  },
                  "filter": []
                }),
              }
            },
            "type" : "visualization",
            "references" : [ ],
            "migrationVersion" : {
              "visualization" : "8.0.0"
            },
            # Use UTC, since the trailing "Z" marks the timestamp as UTC.
            "updated_at" : datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
        }
    
    
        return client.index(index=index,id='visualization:'+visName,body=visSavedObject)
    
    saveVegaVis(es, 'test_visuals', 'def-vega-cars-1', chart, resultSize=1000)

After executing this code, we get a success message:

    ObjectApiResponse({'_index': 'test_visuals', '_id': 'visualization:def-vega-cars-1', '_version': 8, 'result': 'updated', '_shards': {'total': 2, 'successful': 1, 'failed': 0}, '_seq_no': 7, '_primary_term': 1})

But within ELK, the resulting object is not treated like a visualization; it is treated like a normal index entry.

We want it to show up like this: [image]

Instead, we can only see it as a normal index entry, like so: [image]

It seems to us that all of the traits of a visualization are there. To validate this, we exported a Vega visualization to inspect its data structure (please excuse any oddities; the export contains many escape characters, which we tried to clean up):

    {
        "attributes": {
            "description": "",
            "kibanaSavedObjectMeta": {
                "searchSourceJSON": {
                    "query": {
                        "query": "",
                        "language": "kuery"
                    },
                    "filter": []
                }
            },
            "title": "TEST_VISUAL_PLZ_WORK",
            "uiStateJSON": "{}",
            "version": 1,
            "visState": {
                "title": "TEST_VISUAL_PLZ_WORK",
                "type": "vega",
                "aggs": [],
                "params": {
                    "spec": {
                        "$schema": "https://vega.github.io/schema/vega/v3.json",
                        "width": 300,
                        "height": 100,
                        "data": [{
                            "name": "vals",
                            "values": [
                                {"category": 50, "count": 30},
                                {"category": 100, "count": 80},
                                {"category": 150, "count": 10},
                                {"category": 200, "count": 50}
                            ]
                        }],
                        "marks": [{
                            "type": "rect",
                            "from": {"data": "vals"},
                            "encode": {
                                "update": {
                                    "x": {"field": "category"},
                                    "width": {"value": 30},
                                    "y": {"field": "count"},
                                    "y2": {"value": 0}
                                }
                            }
                        }]
                    }
                }
            }
        },
        "coreMigrationVersion": "8.1.2",
        "id": "6e130cc0-b694-11ec-8df1-41f60ea92d87",
        "migrationVersion": {
            "visualization": "8.0.0"
        },
        "references": [],
        "type": "visualization",
        "updated_at": "2022-04-07T17:04:32.085Z",
        "version": "WzYxOTQsMl0="
    }
    {
        "excludedObjects": [],
        "excludedObjectsCount": 0,
        "exportedCount": 1,
        "missingRefCount": 0,
        "missingReferences": []
    }

And it seems our data structure matches this.
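One visible difference between the two structures is where the fields are nested: the exported object keeps them under `attributes`, while the document built in `saveVegaVis` keeps them under a key named after the type (`visualization`) and prefixes the document ID with `visualization:`. The sketch below maps one shape to the other so they can be compared field by field. The helper name `raw_doc_to_export_shape` is hypothetical and exists only for illustration.

```python
def raw_doc_to_export_shape(doc_id, raw_doc):
    """Rearrange a raw-index-style saved object into the exported layout.

    raw_doc is shaped like the dict built in saveVegaVis:
    {"type": "visualization", "visualization": {...}, "references": [...], ...}
    """
    obj_type = raw_doc["type"]
    prefix = obj_type + ":"
    # Strip the "<type>:" prefix that the raw document ID carries, if present.
    object_id = doc_id[len(prefix):] if doc_id.startswith(prefix) else doc_id
    return {
        "id": object_id,
        "type": obj_type,
        # The fields stored under the type-named key become "attributes".
        "attributes": raw_doc[obj_type],
        "references": raw_doc.get("references", []),
        "migrationVersion": raw_doc.get("migrationVersion", {}),
        "updated_at": raw_doc.get("updated_at"),
    }


example = raw_doc_to_export_shape(
    "visualization:def-vega-cars-1",
    {"type": "visualization", "visualization": {"title": "def-vega-cars-1"}, "references": []},
)
```

Comparing `example["attributes"]` with the `attributes` block of the export makes it easier to spot fields that differ in name or encoding.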

So are we going about this the wrong way? Is there some minor mistake? Is it possible to do (without disabling security protocols / 'hacking' it)?
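One avenue that may avoid writing to `.kibana` directly is Kibana's own saved objects HTTP API, which creates the object through Kibana rather than through Elasticsearch. The sketch below only builds the request and does not send it. The Kibana URL, the credentials, and the helper names `build_vega_saved_object` and `build_create_request` are hypothetical, and the endpoint's availability should be checked against the documentation for your Kibana version. Treat it as a sketch under those assumptions, not a confirmed fix.

```python
import base64
import json
import urllib.request


def build_vega_saved_object(vis_name, vega_spec):
    """Build a request body for a Vega visualization saved object.

    visState and searchSourceJSON are JSON-encoded strings, as in saveVegaVis above.
    """
    vis_state = {
        "title": vis_name,
        "type": "vega",
        "aggs": [],
        "params": {"spec": json.dumps(vega_spec, sort_keys=True, indent=2)},
    }
    search_source = {"query": {"language": "kuery", "query": ""}, "filter": []}
    return {
        "attributes": {
            "title": vis_name,
            "visState": json.dumps(vis_state),
            "uiStateJSON": "{}",
            "description": "",
            "version": 1,
            "kibanaSavedObjectMeta": {"searchSourceJSON": json.dumps(search_source)},
        },
        "references": [],
    }


def build_create_request(kibana_url, vis_id, payload, username, password):
    """Prepare (but do not send) a POST to Kibana's saved objects endpoint."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{kibana_url.rstrip('/')}/api/saved_objects/visualization/{vis_id}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "kbn-xsrf": "true",  # Kibana rejects API writes without this header
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder spec; with Altair this would be json.loads(chart.to_json()).
payload = build_vega_saved_object("def-vega-cars-1", {"mark": "point"})
request = build_create_request(
    "https://kibana.example:5601",  # hypothetical Kibana address
    "def-vega-cars-1",
    payload,
    "username",
    "password",
)
# To send it: urllib.request.urlopen(request)
```

Because the request goes to Kibana, the user only needs Kibana privileges for saved objects, not write access to the `.kibana` system indices.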

jt0dd, Apr 07 '22 19:04

Someone in the forums solved this for us: https://discuss.elastic.co/t/jupyter-vega-kibana-visualization-without-disabling-security-how/301899

jt0dd, Apr 10 '22 22:04

I, personally, love this idea. It would be a killer feature for tighter integration with Eland & Elasticsearch.

I am not sure where it stands on priorities currently.

benwtrent, Apr 11 '22 11:04