VizAlerts
Export to file
A set of code already exists for this feature at https://github.com/jdrummey/VizAlerts/tree/file-export.
There are several cases for exporting a viz to a file:
- Using a Tableau viz as a data preparation tool where a CSV of the viz is used for some other process.
- Downloading snapshots of a Tableau viz as it exists at a point in time to create an archive. Some data sources do not preserve the necessary information needed to roll back to a point in time so having the archive is useful.
- Downloading PNGs of a viz that can be placed in a folder where a PowerPoint has links to those files. This enables the PowerPoint to always have the latest versions of the PNGs.
- Once Issue #20 is implemented then it's foreseeable that some users would want to export a TWBX to a share accessible by Tableau Reader users.
The code I'd created does have a |noattach option to allow file export without attaching the file to the email. I'd thought about adding an option to skip email delivery entirely and just do file exports, but didn't implement it. Now I think that would be good to have. For example, we've got a PNG snapshot of a viz dropped in a file location every 15 minutes to keep it up to date, and sending out 96 emails per day just isn't efficient.
I propose adding a third Email Action * option to deliver the files. For ease of implementation it would be set up just like the existing email action (so there's an email address to potentially alert if things go wrong) with the four required fields, and would go through everything needed to generate the email except actually sending it. I implemented this in our code at SMHC and it took about 10 minutes.
Jonathan
My head is spinning trying to figure out what combinations of Action types, content references, and content reference parameters are going to do when combined! I feel like some kind of visual reference would be helpful, but so far I haven't been able to think of a way to represent it in a coherent way. :(
Initially when I created "Email Action *", I had intended it to be used only with a static value--"1"--and only if the author wanted the alert to be an Advanced (email) Alert. The thought was, once we support other actions that aren't email, we'd use different fields for them, for example "SMS Action *" or "File Export Action *", or some such. Each of these would live in its own folder along with the required and optional fields in the VizAlerts "helper" datasource. That way, a person could tell what fields they'd need in their viz to support the kind of action that they wanted VizAlerts to perform. And only one kind of Action would be allowed per View.
While I think that does make it a bit less confusing, the downside to that approach, I suppose, is that you can't change the behavior dynamically based on your data, since you can't add/remove fields based on conditions. They're either there, or not. Overloading a generic "Action" field would allow the data to determine what takes place, so long as all the fields required by that action are present.
Emailing users if a file export action is successful brings up something I'd thought about a long time ago when my thought process around VizAlerts was as more an IFTTT engine than anything else: Control-of-flow logic. So the idea would be a set of instructions in sequential order, where you could make them conditional upon each other. A great example of this is the ol' "archive and delete stale content" need. It might go something like:
Dataset = all old and unused workbooks, datasources on a Tableau Server instance (export action to copy .twbx/.tdsx to archive share) -> (if export successful, email action to the owner informing them of the change) -> (if export successful, delete workbook/datasource action)
How could that be done? I don't know. It'll be a fun thought exercise though! Maybe a couple new fields called ON_FAILURE() and ON_SUCCESS() that let you pass in other VizAlerts with their own Action types. As an admin responsible for stability of an enterprise server, the thought of that feature existing is super scary to me, but as a very hacky developer it sounds like a freakin' blast. :8ball:
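As a purely hypothetical sketch of that chaining idea (the `run_chain` helper and the `on_success`/`on_failure` keys are invented names for illustration, not anything VizAlerts has today):

```python
# Hypothetical control-of-flow sketch: run actions in order, with follow-ups
# conditional on whether each action succeeded or failed.
def run_chain(actions):
    """Run each action dict in sequence; stop the chain on the first failure."""
    for action in actions:
        try:
            action["run"]()
        except Exception:
            for follow_up in action.get("on_failure", []):
                follow_up()
            break  # a failure halts the rest of the chain
        else:
            for follow_up in action.get("on_success", []):
                follow_up()

# Example: export -> (if successful, email owner) -> (if successful, delete).
results = []
chain = [
    {
        "run": lambda: results.append("exported .twbx to archive share"),
        "on_success": [lambda: results.append("emailed owner")],
        "on_failure": [lambda: results.append("alerted admin")],
    },
    {
        "run": lambda: results.append("deleted stale workbook"),
    },
]
run_chain(chain)
```

The scary-admin concern above applies here too: anything like this would need hard limits on chain length and on which action types can trigger which others.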
"Once Issue #20 is implemented then it's foreseeable that some users would want to export a TWBX to a share accessible by Tableau Reader users." Will this violate the EULA for Reader? https://community.tableau.com/message/521962#521962
Good point, Toby--it'd seem that using VizAlerts and Reader for the purpose of sharing TWBX's would in fact violate the Reader EULA.
Agreed, we shouldn't add the export to TWB/TWBX because it makes it too easy to violate the Reader EULA. One question: I'd actually put in some stubs for TWB/TWBX export when I wrote the code for file export--should we get rid of those?
Jonathan
There's still the "archive / history" use case for exporting a TWBX / TDSX, I guess. But I think I agree...it doesn't seem compelling enough to justify the effort, given the risk of misleading users into unintentionally violating the Reader licensing agreement by putting the feature in. Yeah, we should probably remove the stubs and add comments explaining why TWBX exports aren't supported.
With the Document API in existence I think there's more of a use for TWB/TWBX export when we consider the notion of IFTTT-like functionality in VizAlerts. For example, how about a thresholded alert that triggers a move from dev to test, or test to production? This might want to wait for the IFTTT-type functionality, though.
Here are some thoughts on File Actions I emailed a while ago. Let me know what you think of them:
- File Copy Action Fields
- File Source *
- File Destination *
- File Delete Action Fields
- File Path *
I propose the following new content references to start:
- FILE_LIST(\\unc-path|recurse|filename-pattern|olderthan=N|newerthan=N|largerthan=N|smallerthan=N|name=newname)
- Outputs a CSV of all files and their properties in the unc-path
- Can be used as input to the File Delete Action
- FILE_ZIP(\\unc-path|recurse|filename-pattern|olderthan=N|newerthan=N|largerthan=N|smallerthan=N|name=newname)
- Outputs a Zip file of all files meeting criteria in the parameters passed
- FILE(\\unc-path|recurse|filename-pattern|olderthan=N|newerthan=N|largerthan=N|smallerthan=N|name=newname|append)
- Outputs the first file it finds (sorted by…something? Not sure yet) meeting criteria in the parameters passed
The parameters would work like this:
- unc-path is a static string, and must be matched by the pattern the admin sets for each user (\\tableaufileshare\mcoles and subdirs would be all I could access, for example)
- recurse opens the search up to subfolders of the unc-path
- filename-pattern is a regex that allows for filename match filtering
- olderthan=N filters to files older than N seconds. Probably the most important to have, over the next three.
- newerthan=N filters to files newer than N seconds
- largerthan=N filters to files larger than N bytes
- smallerthan=N filters to files smaller than N bytes
- name=newname sets the filename to what you like (much like "name" does for the VIZ_ content refs)
- append, used only for write operations in the File Copy To * field. This one might be tricky...
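To make these semantics concrete, here's a rough sketch of how VizAlerts might parse and apply the parameters; `parse_file_ref`, `file_matches`, and `path_allowed` are hypothetical names invented for illustration, not existing VizAlerts code:

```python
import os
import re
import time

def path_allowed(unc_path, admin_pattern):
    """True if the requested UNC path matches the regex the admin set for this user."""
    return re.match(admin_pattern, unc_path) is not None

def parse_file_ref(args):
    """Parse the |-delimited parameters of a FILE_LIST/FILE_ZIP/FILE reference."""
    parts = args.split("|")
    opts = {"path": parts[0], "recurse": False, "pattern": None}
    for part in parts[1:]:
        if part in ("recurse", "append"):
            opts[part] = True
        elif "=" in part:
            key, value = part.split("=", 1)
            opts[key] = value          # olderthan/newerthan/largerthan/smallerthan/name
        else:
            opts["pattern"] = part     # bare token = filename-pattern regex
    return opts

def file_matches(path, opts, now=None):
    """Apply the filename, age, and size filters to a single file."""
    now = now or time.time()
    if opts.get("pattern") and not re.search(opts["pattern"], os.path.basename(path)):
        return False
    st = os.stat(path)
    age = now - st.st_mtime
    if "olderthan" in opts and age <= float(opts["olderthan"]):
        return False
    if "newerthan" in opts and age >= float(opts["newerthan"]):
        return False
    if "largerthan" in opts and st.st_size <= int(opts["largerthan"]):
        return False
    if "smallerthan" in opts and st.st_size >= int(opts["smallerthan"]):
        return False
    return True
```

FILE_LIST would then walk the unc-path (recursing only if `recurse` was set), run every file through `file_matches`, and emit the survivors as CSV rows.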
That's a lot of stuff to dump out there, I know. But oh, the unprecedented POWER if we could do all that... :)
Here's a question about file actions spawned by discussions on issue #75. There are file-based actions that are on visible-to-Tableau-Server folders and then there are potentially delivery actions for things like sftp/ftp, Office365, AWS, etc. Do we want to logically separate certain parts of working with files from the "where" those files go (or come from)? I'm not sure I'm being completely clear here, I'm just wondering how we might specify that set of files X ends up going to \\myshare\myfolder and set of files Y gets put onto an sFTP site.
One question I have is that for things like sftp/ftp, Office365, Zip, etc. where VizAlerts would need to have access to various credentials (e.g. the password for encrypting a zip file) what are the security concerns and requirements? For example does VizAlerts need to blank out the password when writing into the log file?
Some quick comments on Matt's File Actions post.
- love the integration of filename-pattern, olderthan, etc. to enable multiple files to meet criteria
- It looks like the File Source and File Path fields for copy & delete actions would take either the FILE_LIST or FILE content references, correct?
- how does FILE_ZIP get fit into the order of operations?
- FILE_ZIP should support encrypting zips right away
- Is the destination just a copy to a folder or does it include the idea of other location such as sftp/ftp, etc.?
Jonathan
I guess the questions for me on whether to combine the File Copy action in such a way that it could handle multiple destinations (FTP, SFTP, O365, AWS, etc.) are:

1. A lot of new configuration information would be needed to support them all. What info is needed for VizAlerts to move files to an FTP server, for example? With a UNC path, we just make sure our domain account has admin rights on all the shares, and based simply on the path, we can copy it there. For an FTP server, we need to know the server name, probably the port, and potentially pass in a username and password to authenticate. And we'd need additional config settings for SFTP, possibly cert paths, and all that.
2. Would an admin want to enable file copies to a share, but not an FTP site? It would be easier to generate explicit messages to users if we used a separate Action for FTP stuff, for example, that the admin hadn't set up: "Trying to FTP? Sorry, we don't support that right now, talk to your Admin." But meanwhile the File Copy action (to a UNC path) would work fine.
3. I'm not an FTP / O365 / etc. guru...but I know there are ways to simulate standard filepaths with these services; mapping a network drive and what have you. Could we get most of the functionality we need simply by relying on those services to work with what we implement in as standard a fashion as possible?
4. Do other services such as O365 support "dir"-listing-type commands that we could use to filter to a certain set of files, such as in FILE_LIST? How hard would they be to implement? Again, if they can simply act like a file system in the first place, as in (3), it'd be convenient!
I like the ease of use behind the idea of a file copy to different services that all interact with files. It'd be convenient to use the file action to copy to FTP or some other service that works with files.
Really, a lot to think about, and I'm glad you brought it up. I think the best way to get answers is to think about real-life use cases and think through what makes sense. I know what my use cases are for standard file copy actions to a windows-based UNC path, but not much more.
On your other bulleted questions:
> It looks like the File Source and File Path fields for copy & delete actions would take either the FILE_LIST or FILE content references, correct?
Yep, definitely FILE_LIST--and though I originally hadn't considered the FILE reference as input to File Delete, I don't see why it couldn't be used that way.
> how does FILE_ZIP get fit into the order of operations?
High level, I think the general order of operations should be: file operations, then SMS, then email. File processing should happen first so that you can do the file copying work before the notification work (e.g., attach the final ZIP to an email).
I think FILE_ZIP operations should always take place against temporary files in the temp folder. So a FILE_ZIP reference entails a copy down of one or more files, then a zip operation on them. I think this ref could work in either the source or destination field for File Copy actions. You could reference a PNG, a CSV, and a merged PDF in the File Source field, then use FILE_ZIP in the File Destination field, which would take whatever ended up in the temp folder for the source stuff, zip it up, and copy it to the place you wanted it to go.
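Sketched in code, that temp-folder flow might look like this; `zip_via_temp` is an invented helper name, and a real implementation would stage into the VizAlerts temp folder rather than a throwaway directory:

```python
import os
import shutil
import tempfile
import zipfile

def zip_via_temp(source_files, dest_zip):
    """Copy the source files to a scratch folder, zip them there, and leave
    the archive at dest_zip -- mirroring the idea that FILE_ZIP always
    operates on temporary copies, never on the originals in place."""
    temp_dir = tempfile.mkdtemp(prefix="vizalerts_")
    try:
        staged = [shutil.copy(f, temp_dir) for f in source_files]
        with zipfile.ZipFile(dest_zip, "w", zipfile.ZIP_DEFLATED) as zf:
            for path in staged:
                zf.write(path, arcname=os.path.basename(path))
    finally:
        shutil.rmtree(temp_dir)  # temp copies never outlive the action
    return dest_zip
```

The same staging step would serve both uses of the reference: in a File Source field the archive becomes the thing to copy, and in a File Destination field it becomes the final landing spot for whatever was staged.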
Which raises another question: Should we allow multiple references in the File Action fields? If we reference a FILE_LIST in the File Source field, can we still reference other files in the same field? I think it makes sense to...but how about in the File Destination field? That makes less sense to me. If we allow multiple references in the File Source field, and just specify a UNC path in the File Destination field (or even a single file!) then what happens? If there are existing files, do they all just get overwritten? If a single file is specified, should we just assume they want a zip file, and zip them up and copy them there?
> FILE_ZIP should support encrypting zips right away
I like that idea a lot, but as with the IFTTT and possible REST API calls, security is somewhat lacking when I think about how we'd do this. The author would need to specify the password they wanted to use, then use that in an action field somehow, which would dump it all over the VizAlerts temp folder and logs, not to mention the data would be on Tableau Server as well. The admin would generally already have access to all that stuff anyway, but it's really not a best practice because it exposes something sensitive to a lot more attack vectors. We could probably mitigate that by beating it into people with the docs, comments, videos, etc. that anything sensitive they put into VizAlerts is really not all that secure, so take appropriate precautions.
> Is the destination just a copy to a folder or does it include the idea of other location such as sftp/ftp, etc.?
I was thinking just folder, but see the comment above. It's a good question, I don't think I have the answer at the moment.
Matt,
I've been procrastinating on another project and thinking about the security issues and doing some research. First of all I'd really like to get some help on this from someone who knows more.
One idea would be to use something like Passlib https://pythonhosted.org/passlib/index.html where, instead of having usernames & passwords in content references, there are hashes. That raises the question of how users would generate the hashes in the first place; we could use TabPy to do this, since it's really a general-purpose Python server.
However, if we're using TabPy then it might be even better to use it plus some custom code to generate an encrypted file of hashed username/password combinations, and provide some sort of developer key so there's another layer of security--in content references the key would be passed instead of the username/password combination. So an Advanced Alert creator would run a command (maybe by entering info into a Tableau viz that calls TabPy??) that would update the hash file and give the user their key. If we're using the Tableau viz idea then its data source could be based on the VizAlerts ScheduledTriggerView and/or the .yaml, so admins would be able to control what was allowed for export: limit access to Office 365, specifically whitelist certain sftp sites, etc.
Even with this latter setup the VizAlerts admin would still have enough access to do some damage (I'm not sure how much, or how to further mitigate that risk--that's where I'd like some help), but at least we'd get out of having plaintext usernames & passwords in the trigger views.
This concept could ultimately be extended to the email & SMS info that is currently in the .yaml file to lock down those as well.
Jonathan
I think the difference between this and other password storage mechanisms is that VizAlerts has to be able to derive the original, valid, plaintext password somehow, since it will need to pass it to other services. If you're using a password for authentication to generate a session, like in any old webapp, you don't need to actually store the user's password once you've hashed it and stored the hash. You just take their plaintext input from the (encrypted) webform, hash it, and compare the hash to the one you have in your database. If they match, they authenticate successfully. If not, they don't.
In our scenario, we do need to store the users' passwords, because the services we connect to will expect a plaintext password / token to be passed in. So even if we store the passwords in some kind of obfuscated format, we'd need a way to calculate the original passwords so they can be passed into things like FILE_ZIP. That's better than storing the raw passwords in plaintext, because it protects against Tableau Server being compromised, or logs / config files etc being shared accidentally or for troubleshooting purposes, but if the VizAlerts host was compromised, it wouldn't do a whole lot.
For that reason, I don't think we can use a one-way cryptographic hash on the passwords--we'd have no way to recompute the originals. We need to encrypt them with a private key that's dynamically generated for each VizAlerts installation (and probably also tied to the user's username), then store the encrypted version of the password. When it's time to send the password to a service, obtain the encrypted version from the viz data, decrypt it using the private key, and pass the raw password to the service over some encrypted network connection (or just locally to 7zip or what have you).
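As a sketch of that approach (assuming the third-party `cryptography` package; nothing here is existing VizAlerts code), the Fernet recipe gives exactly this kind of reversible, key-based encryption:

```python
# Sketch of reversible (two-way) credential storage using the `cryptography`
# package. The Fernet key stands in for the per-installation private key.
from cryptography.fernet import Fernet

# Generated once at VizAlerts install time and kept out of source control.
install_key = Fernet.generate_key()
cipher = Fernet(install_key)

# What gets stored in viz data / config: the encrypted token, not the password.
stored_token = cipher.encrypt(b"ftp-service-password")

# At action time, recover the plaintext to hand to the external service.
plaintext = cipher.decrypt(stored_token)
```

The key would live only on the VizAlerts host, outside Tableau Server content, so compromising the viz data alone wouldn't reveal credentials.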
VizAlerts really should be converted into an always-on service that runs and can listen to this kind of stuff. I'm not sure if TabPy is the right answer for that, but it does need to happen, regardless. That'll be a decent amount of work, I believe!
One more thing to note--if the raw passwords were ever entered anywhere into Tableau Server, not only the admins would have access, but also Tableau Support workers if a backup was ever sent to them. And, in a more likely scenario, their passwords would be written to the server logs as well, since query information for vizzes gets written there.