VizAlerts
Capture VizAlerts operational data for analysis / reporting / alerting
Currently VizAlerts does not store structured information about what it is doing. It writes text logs, but they are not structured in a way that supports analysis.
VizAlerts should be able to record the actions it has taken, how long they took, and whether each outcome was a success or failure, in a relational database such as PostgreSQL. This would let the administrator monitor the health of alerts, and let alert authors see the history of their own alerts.
This would allow for many possible use cases:
- Build custom alerts for alert failures (currently VizAlerts can send an email to the admin and subscriber when an alert fails, but the notification isn't customizable)
- Monitor alert trends:
  - How many unique alerts are there?
  - How many emails were sent, to whom, and when? How large are they?
  - How many data rows are being exported to CSV? How large are they?
  - Which are Simple Alerts, and which are Advanced Alerts?
  - Which alerts trigger all the time? Which never trigger and may need to be retired?
- Answer delivery questions: a user complains that they did not receive an email. Did VizAlerts send it successfully?
Some of this data can be obtained from the Tableau Server repository database, but it is not retained for long enough and does not contain enough information to answer all of these questions.
Once this item is complete, ideally a Tableau Datasource file (.tds) would be added to the release which allows admins and users to connect to the database this info is stored in. User filtering would provide row-level security so that users could not see others' information.
As a first step, rather than requiring a full database (with the extra overhead for installation and maintenance), I suggest that VizAlerts write a structured CSV file (or files); a .tds could then point to those files.
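To make the CSV idea concrete, here's a minimal sketch of what a structured action-log writer could look like. The field names (`alert_id`, `outcome`, etc.) are illustrative assumptions, not VizAlerts' actual schema:

```python
# Sketch of a structured CSV log for VizAlerts actions.
# Field names here are hypothetical, not the project's real schema.
import csv
import io
from datetime import datetime, timezone

FIELDS = ["timestamp", "alert_id", "action", "recipient",
          "duration_ms", "outcome", "error"]

def write_action_rows(stream, rows):
    """Append action records to a CSV stream, writing a header if it's empty."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, extrasaction="ignore")
    if stream.tell() == 0:
        writer.writeheader()
    for row in rows:
        row.setdefault("timestamp", datetime.now(timezone.utc).isoformat())
        writer.writerow(row)

# Example: log one successful email send to an in-memory stream.
buf = io.StringIO()
write_action_rows(buf, [
    {"alert_id": "42", "action": "send_email", "recipient": "[email protected]",
     "duration_ms": "183", "outcome": "success", "error": ""},
])
```

A file opened in append mode could replace the `StringIO` here, and a .tds pointing at that file would give Tableau a clean row-per-action source to connect to.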
Yep, I agree--that sounds like the best plan in the near term.
Not that this isn't a good idea, but one could just push the logs into something like Logstash (which is what we're planning on doing, simply because it's where all our operational logs get sent), and easily knock off half of your list. With slight modifications to make the underlying logger a bit more descriptive, a simple solution like this could go a long way with little to no change to the current codebase.
You could then take it a step further and use the Web Data Connector to connect to the underlying Elasticsearch instance, and instead of Kibana charts and graphs, you get the graphical goodness of Tableau.
This is the approach I've been thinking about using anyway.
Interesting idea... I've got two questions:
- What's the effort/cost required for installation, configuration, and operation of Logstash? I ask because so far we've tried to be sensitive about required components; for example, some users don't have internet access, so they can't do a pip install to get the required Python libraries (never mind the optional one).
- What's the security model? The challenge here is that in some organizations, details like who gets emails and what those emails contain are as proprietary as their actual data.
Jonathan
Our organization is built entirely on top of AWS, including Tableau Server, Elasticsearch/Logstash, etc.
We use Logstash for all of our logs, as it helps us consolidate them across all of the EC2 instances we run. AWS has standalone AMIs with the whole "ELK" stack pre-configured, which you can install on any EC2 instance that fits your budget and performance needs, and it also offers an Elasticsearch Service that scales and has its own pricing model ($0.018 per hour at its lowest tier).
Security is handled at the infrastructure (AWS/EC2) level, but can also be controlled on an app-by-app basis, so long as it's supported. In this case there are several options when it comes to Logstash/Elasticsearch.
My thinking is along these lines: if you coupled in a database layer, it could become a required component. As it stands, the only enhancement VizAlerts would need is to make its logs a bit more robust and informative. That would then allow the log files to be queried.
I guess my suggestion was more of an idea on how to accomplish this now with what you've got. I'd wholeheartedly support adding a backing data layer, as it could probably go beyond something queried for reporting/administrative purposes and even as far as automating or setting up advanced alerts, etc.
Yes, that's another appealing part of the whole operational logging / near-real-time querying feature: easier prevention of duplicate alerts being sent. Right now there is no way for your alerts to know whether they've already been sent to their recipients. If you had access to data describing who had been sent which alert, and when, you could structure your alerts to be more robust and less reliant on relative dates. It would also make them more fault-tolerant: if your alert fails after the N retries it was afforded, it won't be attempted again unless the data says it should be. With that data, you could loosen the primary trigger requirements, then prevent dupes across multiple "tests" by checking whether you'd already sent that alert to the recipient.
That all gets more complicated when you consider the dynamic nature of recipient lists and consolidated emails, but still, pretty cool!
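As a rough illustration of the dedupe idea, here's a sketch assuming a hypothetical sent-log keyed on alert, recipient, and a hash of the content. None of this is current VizAlerts code:

```python
# Hedged sketch: prevent duplicate alert sends by consulting a log of
# (alert_id, recipient, content_hash) keys already delivered.
# The names and structure are assumptions, not VizAlerts internals.
import hashlib

def content_key(alert_id, recipient, body):
    """Build a stable key identifying one alert delivery."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return (alert_id, recipient, digest)

def filter_unsent(sent_log, alert_id, candidates):
    """Return only (recipient, body) pairs not already recorded as sent."""
    to_send = []
    for recipient, body in candidates:
        key = content_key(alert_id, recipient, body)
        if key not in sent_log:
            to_send.append((recipient, body))
            sent_log.add(key)  # record immediately so a retry skips it
    return to_send

# A second pass with the same content sends nothing.
sent = set()
first = filter_unsent(sent, "alert-7", [("[email protected]", "CPU high")])
second = filter_unsent(sent, "alert-7", [("[email protected]", "CPU high")])
```

In practice the `sent` set would be backed by the structured log rather than in-memory state, but the lookup-before-send pattern is the same.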
All the above considered, we can fairly easily output a separate, structured log that rolls over less frequently than the standard log, intended for reporting purposes. At the end of a VizAlerts cycle, all output from all threads (this all must be thread-safe) can be appended to the master ops log. Expose the data with row-level security with a published datasource on Server, and you'd be good to go on nearly all fronts.
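A minimal sketch of the thread-safe append, assuming each alert runs in a worker thread and a queue collects rows for a single writer at the end of the cycle (names are illustrative):

```python
# Sketch: per-thread alert output collected via a thread-safe queue,
# then drained by a single writer into the master ops log.
# process_alert and the row fields are hypothetical.
import queue
import threading

ops_queue = queue.Queue()  # Queue.put/get are thread-safe

def process_alert(alert_id):
    # ... do the alert work, then enqueue its structured output ...
    ops_queue.put({"alert_id": alert_id, "outcome": "success"})

threads = [threading.Thread(target=process_alert, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# End of cycle: single-threaded drain appends everything to the master log.
master_log = []
while not ops_queue.empty():
    master_log.append(ops_queue.get())
```

Funneling all writes through one drain step avoids interleaved partial lines in the log file without needing per-write locking in the workers.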
Restructured the code for more modularity (pushed to the twbconfig branch), and while I was at it, improved our ability to implement more structured output requested in this Issue. Each VizAlert is now an instance of a VizAlert class, which can store any validation and errors it encountered at various stages. This feature will still take more work to implement, but it's one step closer.
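For illustration only, a class along those lines might look something like this; the attribute and method names here are assumptions, not the actual code in the twbconfig branch:

```python
# Hypothetical sketch of a VizAlert instance that accumulates errors
# per processing stage, in the spirit of the restructuring described above.
class VizAlert:
    def __init__(self, view_name):
        self.view_name = view_name
        self.errors = []  # list of (stage, message) tuples

    def record_error(self, stage, message):
        """Store an error encountered at a given stage (e.g. 'validation')."""
        self.errors.append((stage, message))

    @property
    def succeeded(self):
        return not self.errors

alert = VizAlert("Sales Threshold")
alert.record_error("validation", "missing Email ~ To field")
```

Having each alert carry its own error history makes it straightforward to serialize that state into the structured log at the end of a run.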
Both Twilio and Mandrill allow for some reporting based on what SMS / emails you've asked them to send, so if either of those services are used, there is at least some operational data available on what VizAlerts is doing (when an alert actually fires, anyway).
Attaching a nearly-complete suggestion on what the structured output could look like.
Looks good! There are three-ish other things I can think of:
- The URL being called in the case of content references like VIZ_IMAGE().
- In the case of advanced alerts, knowing which row is failing is important, so having enough fields (like from/to/subject for email) is key. Would the Rows field hold a string of that info?
- (3-ish) Should we include any of the config info in this more structured log? I'm thinking of things like the email server name, etc., and I can go back and forth on including it. The particular case I have in mind: when first getting the Twilio integration going, I had fat-fingered one of the required Twilio IDs (swapped an 8 for a 9 or something like that) and got error after error that could only be diagnosed once I'd seen that value and compared it to the online one. If it's not in the structured log, users would need to look at the regular log; that's the "for" argument. The "against" argument is that the Twilio IDs include the private API key, which shouldn't go into a log that could conceivably be shared.
Jonathan
Attachment: VizAlertsStructuredOutput.xlsx https://github.com/tableau/VizAlerts/files/1102695/VizAlertsStructuredOutput.xlsx
Good calls!
- Added clarity on the input value column for content refs. We'll just include the raw text of it, I think, unless you see an issue with doing so.
- I hadn't considered that, but it's a great idea. Consolidated emails don't have a specific line, however, or even a clear sequence from the original trigger data, because of the unique-ifying and re-sorting we do before processing them. So it might need to be NULL in those instances.
For Subject, how about a field called "Output Name"? This can be the Subject of an email, or the Filename of a content reference (as defined by the |filename param, or if not specified, the raw filename). This would be empty for SMS.
Since we're also trying to tackle multithreading within each alert, I was considering pre-logging each instance of an email to be sent as its own logged line, even if we know it won't be sent because its recipients or content reference(s) failed to process. That way, no matter what happens, we know the full set of emails that were supposed to be sent, and which were or weren't sent successfully.
- I have planned to add in the ScheduledTriggerViews config info, but not the global config info from vizalerts.yaml. I was thinking that it would simply be a lot of duplicative logging if we added the global config stuff in there, because it rarely changes. Though interestingly, it would be useful for performance monitoring if you switched to a different Tableau Server or SMTP server. As an IT person, that stuff is super helpful. I'll commit to a firm "maybe" here. :)
New version, with all the ScheduledTriggerViews fields (though I didn't mock up example values for those...too much work).