LibPQ icon indicating copy to clipboard operation
LibPQ copied to clipboard

Compatibility with PowerBI service

Open sio opened this issue 5 years ago • 12 comments

Looks like LibPQ can not be used in PowerBI service because of providing dynamic arguments to Web.Content(). This page is the place to discuss, develop and test possible workarounds.

Important

I have no account with PowerBI service, so I will need outside help to do the tests

Roadmap

  • [x] Describe the testing sequence (@sio)
  • [x] Test LibPQ 1.2.0 with PowerBI service (@spythan?)
  • [x] Modify loader to use static base URL with Web.Contents (@sio)
  • [x] Test modified version of LibPQ with PowerBI service (not assigned)
  • [ ] Write documentation on using LibPQ with Power BI service

sio avatar Apr 27 '19 06:04 sio

Testing LibPQ with PowerBI service

  1. Add LibPQ to the report
  2. Add the following LibPQPath:
[
    Local = {},
    Web = {"https://github.com/sio/LibPQ/raw/master/Modules/"}
]
  1. Add the following test query:
LibPQ("Date.Parse")("27.04.2019")
  1. Check that the query above produces a valid date

sio avatar Apr 27 '19 06:04 sio

I have made some changes in the main LibPQ module. Now it uses static base URLs for web modules. If PBI service performs its checks at runtime, that should be enough. If it runs checks only once at import, we will need to define some hacky no-ops in the LibPQPath.

@spythan, please test both old and new LibPQ with the PowerBI service. Thank you very much!

sio avatar Apr 27 '19 08:04 sio

Mate I just saw these updates, I will review and answer in a couple of days!

spythan avatar May 01 '19 03:05 spythan

I tested both master and pbi-service versions as requested, both are working!

I also published a test project report online, connecting to external sources, and it works just fine.

When I first asked I wasn't even thinking about that Web.Content() detail to be honest, even though i happened to have read those articles. Having little experience with all these tools I was trying to find a way to access my files on the cloud service, since all the queries I use in the report contain just LibPQ("QueryName") import calls and all the code is managed in external files, and hadn't realized that using a private online repository to access the code would probably be a good solution, since credentials seem to be managed safely. At the time I was thinking of somehow uploading the files in scope of Power BI Service, like OneDrive or something. Edit: I seem to be able to use the local path in Power BI Desktop (in LibPQPath) instead of online repo to my code, publish the report to Power BI Service and everything still works - which confuses me and makes me understand that I don't grasp yet exactly how the queries are handled once the dataset/report is published to the cloud, it appears they don't require a connection to their source code, so I am again not sure if what I am trying to achieve is related with LibPQ. I thought if I changed the source code (in online repo) and refreshed the cloud dataset, the changes would be reflected, but it appears to not be so.

Regarding Scheduled Refresh though, this message appears: (both versions since it's unrelated)

image

Currently I have some ground to cover in my project before needing to solve this problem, but will probably come back to it when I need it. Any of your suggestions are very welcome.

For future reference, this resource will probably come in handy.

spythan avatar May 02 '19 00:05 spythan

Thank you very much!

The fact that local paths still work surprises me, but I'm not familiar with caching mechanisms PBI service uses. Could it be that by default it just publishes the results you've calculated locally unless you "force" the refresh in some way?

I'm pretty busy this week but I'll try to look more into this next week.

sio avatar May 05 '19 15:05 sio

I have much more information on PBI Service.

Because Refresh button isn't deactivated , nor any visual feedback if no refresh is actually ran, in PBI Service, I didn't realize that in my case the queries weren't running again at all, I only had the original data. That happened because refresh wasn't really enabled at all.

I set up both personal and enterprise gateways - a gateway is required if you want to refresh a on-premises source, like database, in PBI Service. Using either of those, I was able to refresh data and set-up automatic refresh for a cloud dataset connected to my database through the gateway. Only specific sources are supported for refresh though, as stated in the previous link, and I don't know what a query containing a LibPQ module import gets classified as, but it doesn't know how to refresh it, so you get this message: image

I found more info in the Refresh History pane after failed manual refresh attempts: image So that's where we stand :smiley:

spythan avatar May 07 '19 00:05 spythan

Well... I'm stuck.

I've signed up for the free tier of Power BI service and have gotten the same results as you. Looks like PBI service needs to be able to statically analyze the destination of each Web.Contents call in each query. The changes in the pbi-service branch are a step in the right direction, but it's not enough. Selecting which base url to fetch at runtime is not allowed at all and PBI service does not fall for the tricks that I have thought of before.

Here is the smallest demo of failing query:

let
    Choices = {"https://google.com", "https://github.com"},
    Random = Number.Round(Number.RandomBetween(0,1), 0),
    Address = Choices{Random},
    Lines = Lines.FromBinary(Web.Contents(Address))
in
    Lines

Same as with LibPQ it is not possible to detect which address will be fetched without executing the code. So PBI service plain out forbids executing it. I hoped that there would be a way to mark all possible destinations as allowed but PBI service does not seem to execute any of its checks at runtime.

Currently I see the following options:

  • Find/create user voice suggestion and hope that Microsoft will make some changes to PBI service
  • Somehow modify LibPQ to (optionally) use only a single module source and to pass its address statically everywhere
    • Note to self: the place to add short circuit is Module.FromPath -> SourceCode

I will think about this issue some more and will come back here next week.

sio avatar May 07 '19 13:05 sio

I have made another attempt at this and have failed again.

Turns out PBI service does not allow some functions/expressions in refreshable queries period, no matter the clever hacks and conditional no-ops.

Smallest demo of non-working query:

if false then Record.FieldNames(#shared) else "hello world"

This line will make the whole dataset non-refreshable in PBI service. Same as with #shared in the example above, "unsafe" invocations of Web.Contents, File.Contents are not allowed at all, even if they are statically verifiable never to be executed.

At this point I'm out of ideas on how to reconcile the service's restrictions with current LibPQ loader. I believe it is still technically possible to load LibPQ modules into PBI service, but the loader for the service may not be the same as for standalone workbooks and will have less functionality than the main loader.

As I'm not seeing much demand for this feature, I'm putting this issue on hold. Everyone is welcome to contribute the alternative loader though. If there are volunteers I can offer my expertise and try to answer any encountered questions.

sio avatar Jun 18 '19 15:06 sio

I believe that #shared execution is blocked on the Power BI Service. This issue was discussed somewhere on the Power BI/Power Query TechNet forums (cannot find an exact link), and Matt Masson mentioned that somebody in the earlier ages of PQ decided to do not allow #shared use in the Service by some security concerns. So, any library/function which requires the use of #shared via Expression.Evaluate or by other way won't work in the Service (but still can work in Desktop or Excel)

hohlick avatar Jun 18 '19 19:06 hohlick

I'm not sure if how many things in this library prevent it from running in the Power BI Service, but at least I have a solution for Web.Contents. To make sure functions pass static parameters to Web.Contents' you can use decorator pattern and accept contents` function as the first parameter.

WiqlRunQueryById = (
    contents as function, 
    url as text, 
    scope as record, 
    id as text, 
    optional options as record) as table =>
        let

Then, when clients wants to use this function they have to define contents function with static site root.

    
let
    url = "https://stansw.visualstudio.com", 
    collection = "",
    project = "vsts-open-in-powerbi",
    team = "vsts-open-in-powerbi Team",
    id = "d5349265-9c9d-4808-933a-c3d27b731657",

    // Create wrapper for VSTS.AccountContents function as a workaround for the static code analysis in Power BI Service.
    contents = (o) => VSTS.AccountContents(
        url, 
        [ 
            Version              = Record.FieldOrDefault(Record.FieldOrDefault(o, "Headers", []), "Referer", ""),
            IsRetry              = Record.FieldOrDefault(o, "IsRetry", false),
            ManualStatusHandling = Record.FieldOrDefault(o, "ManualStatusHandling", {}),
            Query                = Record.FieldOrDefault(o, "Query", []),
            RelativePath         = Record.FieldOrDefault(o, "RelativePath", null),
            Timeout              = Record.FieldOrDefault(o, "Timeout", null)
       ]),

    Source = Functions[WiqlRunQueryById](contents, url, [Collection = collection, Project = project, Team = team], id)
in
    Source

The obvious drawback is that this is more complicated than a simple call to a function, but the unquestionable advantage is that it passes all the checks and works everywhere.

I used this approach successfully in my PQ library for interacting with Azure DevOps. https://github.com/stansw/vsts-open-in-powerbi/blob/master/pq/Functions.VSTS.txt

StanislawSwierc avatar Aug 01 '19 11:08 StanislawSwierc

Oh, that's clever! Thank you for sharing with us!

Unfortunately, as @hohlick has pointed out, Web.Contents is not the only problem with using LibPQ in PBI service - #shared is also essential. So the overall issue remains unsolved.

You've got us one step closer though! That's cool! :-)

sio avatar Aug 01 '19 12:08 sio

I don't think that #shared will be enabled in PowerBI service. It would allow you to enumerate over all the functions including functions from non-public modules. You will need to find a different way :(

StanislawSwierc avatar Aug 01 '19 18:08 StanislawSwierc