
Redundant Configuration / Polling for Data Fetchers

Open phi1010 opened this issue 2 years ago • 16 comments

Hi,

I'm currently implementing a data fetcher for LDAP and testing it in my docker environment: https://github.com/phi1010/opal-fetcher-postgres

I'd like to trigger updates, and have two ideas -- could you tell me whether they can be done with the current OPAL framework?

  1. Polling in Python

Is there an intended way to schedule a later update from a FetcherProvider from within the process() step?

  2. Lightweight Web Hooks

I could also imagine triggering updates from Cron&Curl, or a Webhook URL that I can trigger with some other existing service.

However, I am not a fan of encoding the source's configuration multiple times -- as far as I can see, I'd have to include the configuration, including the URLs and secret credentials, both in the docker config for the initial update and in every other webservice triggering further updates from the same source. This would be quite a hassle to keep up to date.

Also, trusted SQL- and LDAP-like databases are good at enforcing restrictions, but not ideal for triggering REST API calls; the data update will likely be triggered by some other tool modifying SQL/LDAP, one which is not necessarily trusted as well and should not be able to change the configuration to load data from some other server. Moreover, not everyone who is allowed to change parts of the database is necessarily allowed to read the complete database -- which is compromised if every service triggering updates requires access to the credentials in the configuration.

For these reasons, I'd like to trigger an update by specifying only the key (URL) or topic of an existing data source -- can the remaining configuration be taken from the actual configuration, as in https://github.com/authorizon/opal-fetcher-postgres/blob/master/docker-compose.yml#L54 ?

In my attempts, I have only been able to trigger updates successfully when copying and re-specifying the complete configuration, as in https://docs.opal.ac/tutorials/trigger_data_updates

phi1010 avatar Aug 07 '21 13:08 phi1010

Hi @phi1010,

There are two separate questions here:

Q1: How can a publisher service monitor a data source in order to know when to trigger updates via OPAL?

At the moment, this is intentionally left unsolved by OPAL in order to keep the solution flexible, i.e: your LDAP publisher service can do either polling or webhooks; in the end you will trigger OPAL the same way, using the REST API.
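
Either way, the trigger itself boils down to one authenticated POST to the OPAL server's /data/config endpoint (as described in the trigger-data-updates tutorial linked above); a minimal sketch, where the server URL, token and entry values are placeholders for your own deployment:

```python
# Sketch: triggering an OPAL data update via the REST API.
# Server URL, token and entry values are placeholders.
import requests

OPAL_SERVER = "http://localhost:7002"
DATASOURCE_TOKEN = "<a JWT of type 'datasource', signed by the OPAL server>"

update = {
    "reason": "LDAP group membership changed",
    "entries": [
        {
            "url": "ldap://ldap.example.com",            # where to fetch from
            "config": {"fetcher": "LdapFetchProvider"},   # provider-specific config
            "topics": ["policy_data"],                    # which clients to notify
            "dst_path": "/ldap_users",                    # where OPA stores the data
        }
    ],
}

response = requests.post(
    f"{OPAL_SERVER}/data/config",
    json=update,
    headers={"Authorization": f"Bearer {DATASOURCE_TOKEN}"},
)
response.raise_for_status()
```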

Q2: Can I somehow cache the credentials required by the fetch provider, and not provide them on every update?

  • At the moment, the current design requires you to re-specify credentials in every update message (i.e: each update message is a directive to re-fetch new data).
  • However, we could implement some mechanism to cache credentials based on the provider key and the url, without drastically changing how OPAL works.
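
A hypothetical sketch of such a cache, to make the idea concrete (this is not current OPAL code; names are illustrative):

```python
# Hypothetical sketch: caching fetcher configs (incl. credentials) keyed by
# (provider_key, url), so later update messages may omit the credentials.
from typing import Dict, Optional, Tuple

_config_cache: Dict[Tuple[str, str], dict] = {}

def resolve_config(provider_key: str, url: str, config: Optional[dict]) -> dict:
    if config is not None:
        # a full config was provided -- remember it for later updates
        _config_cache[(provider_key, url)] = config
        return config
    # credentials omitted -- fall back to the last known config for this source
    return _config_cache[(provider_key, url)]
```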

@orweis WDYT? There are definitely valid points raised here by @phi1010. Caching the credentials based on (provider_key, url) shouldn't be too hard.

asafc avatar Aug 07 '21 14:08 asafc

I agree with @asafc 's answer here.

I think for the updates part you should prefer to keep the content of the update itself as simple as possible, and move as much of the complexity to the DataFetcher. This should provide the most decoupling and make the system more adaptive and flexible for change.

Within that I'd have a curl process or something similar trigger the event to OPAL.

By implementing a Data-Fetcher tailored to your needs, you can work with a simple indicator event to query the data.

This can also help with the second part around secrets/credentials for data access -- these can be deployed with a tailored fetcher, as part of its configuration or additional capabilities. For example, you can configure it with a basic key and have it retrieve more secrets from a solution like Vault as part of the fetch call.
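
To illustrate, a sketch of a fetcher that is bootstrapped with only a Vault address and token, and pulls the actual data-source credentials at fetch time via the hvac client (the secret path is illustrative):

```python
# Sketch: a fetcher configured with only a Vault address + token retrieves
# the actual data-source credentials at fetch time. The path is illustrative.
import hvac

def load_ldap_credentials(vault_addr: str, vault_token: str) -> dict:
    client = hvac.Client(url=vault_addr, token=vault_token)
    secret = client.secrets.kv.v2.read_secret_version(path="opal/ldap")
    return secret["data"]["data"]  # e.g. {"bind_dn": ..., "bind_password": ...}
```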

@asafc I think we can add more secrets support in OPAL, perhaps a dedicated place as part of the client or fetcher configuration. But we should avoid OPAL becoming a full-blown secrets manager.

orweis avatar Aug 07 '21 14:08 orweis

Storing credentials alone probably would not solve most security issues related to this; if an untrusted trigger can still change the configuration, it could simply let the fetcher send the credentials to some other server, or have wrong data loaded from another server.

phi1010 avatar Aug 07 '21 14:08 phi1010

@phi1010 i just want to make sure we are on the same page regarding security.

"if an untrusted trigger can still change the configuration" - any service using the trigger api must have a valid JWT token of type "datasource" that is cryptographically signed by the OPAL server - so no untrusted service can actually trigger updates.

Also - since you are the one deciding which fetchers to load in your OPAL configuration - you should only load fetchers that you trust. Please note that our example postgres fetcher is not allowed to write to the postgres db, only to read. In essence, fetchers should only be able to read data, not modify it, and you should scope their credentials in such a way.

As I see it, there are two valid options:

  1. implementing credential storage in your fetcher, in the way I suggested (would not require a change to OPAL)
  2. implementing some form of simple credential storage by OPAL itself.

WDYT?

asafc avatar Aug 07 '21 14:08 asafc

@phi1010 also note that the triggering component (e.g. a microservice that tracks your LDAP), the one sending the update event, is authenticated to OPAL (via a valid OPAL-server token, see obtain-token). So only authenticated triggers can pass URLs and event configurations to OPAL-clients and their data fetchers.

In addition you can use URLs with signed SSL certificates to make sure your data-fetchers are talking to the real URL targets.

So I think the "it could simply let the fetcher send the credentials to some other server; or have wrong data loaded from another server." shouldn't be a real concern - please share more information on your network layout if we are missing something here.

That said - if you wanted extra protection - you can add a whitelist of URLs to your DataFetcher - so it won't query unsolicited URLs.
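
Such a whitelist can be a few lines inside the fetcher itself; a sketch (the pattern is just an example):

```python
# Sketch: a fetcher-side URL whitelist, checked before any fetch is attempted.
import re

ALLOWED_URL_PATTERNS = [
    r"ldaps?://ldap\.mynetwork\.com(:\d+)?(/.*)?",  # example pattern
]

def assert_url_allowed(url: str) -> None:
    if not any(re.fullmatch(p, url) for p in ALLOWED_URL_PATTERNS):
        raise ValueError(f"refusing to fetch from unsolicited url: {url}")
```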

orweis avatar Aug 07 '21 14:08 orweis

implementing credential storage in your fetcher, in the way I suggested (would not require a change to OPAL)

I am currently trying to figure out how I can store the complete configuration provided from my Dockerfile in my fetcher -- I'm still a bit unsure about the lifetimes of the objects.

"if an untrusted trigger can still change the configuration" - any service using the trigger api must have a valid JWT token of type "datasource" that is cryptographically signed by the OPAL server - so no untrusted service can actually trigger updates.

Considering the trustworthiness of triggers, I think that in the long term, a less privileged token would probably be useful. Maybe the following scenario explains a bit where I would expect a separation of privileges:

please share more information on your network layout if we are missing something here.

For me personally, it's currently a single website requiring authentication, where I can try out OPA and now have to get my data from LDAP / Keycloak. My vision is that further projects of our makerspace may also get policies from this OPAL instance. The fetchers would have to be reviewed by an administrator, as well as the configuration of a trustworthy docker configuration with secret credentials. Those community projects may want to cause data updates -- e.g. inform OPAL to reload data about who has been trained and may use which machinery -- but the community maintaining such a service should not be able to modify data that the policy uses to decide whether someone has admin permissions to the website or whether they are allowed to open a certain door.

However, this probably also applies to commercial customers, especially when different external data sources have different levels of trust. The fetchers themselves always require absolute trust -- they are Python code and can do everything via code execution -- but the different databases may be secured differently and administered by people with differing privileges:

A scenario one could aim to avoid is when a payment service can cause OPAL to trigger the trustworthy LDAP fetcher implementation and provide a configuration to load data from paymentservice.com instead of ldap.mynetwork.com, and that way compromise administrator accounts. Rego policies may depend on LDAP data documents always being supplied by the correct LDAP only, and on payment services only being allowed to change the payment data they are responsible for.

If one data source can trigger correct data to be reloaded unnecessarily from another data source, this typically can be tolerated. Causing the data from another data source to be reloaded with another configuration might be critical in these scenarios.

That said - If you wanted extra protection - you can add a whitelist of URLS to your DataFecther - so it won't query unsolicited urls.

This might be a simple solution, but would probably result in poor reusability and maintainability as open source.

implementing some form of simple credential storage by OPAL itself.

My initial expectation until I started experimenting probably would have looked like this:

  1. In my docker configuration, I install the fetchers I trust
  2. In my docker configuration (or my git), I configure some data sources with an ID, e.g. makerspace_opening_hours, a fetcher class, a document path where OPAL shall store the data, and some trustworthy, reviewed configuration of hostname/port/protocol/query/credentials according to what the data source operator requires. EDIT: Storing credentials in a git that people can access to send pull requests for policies would be naive as well; forget that idea please. ;)
  3. There is a URL https://my-opal-sever/update-datasource/makerspace_opening_hours, maybe with an individual secret token allowed to call only this URL, which the data source operator can trigger without further knowledge about OPAL etc.
  4. People responsible for opening hours can modify opening hours and trigger updates; they cannot influence who is allowed to open doors that, by rego policy, are unrelated to opening hours.

phi1010 avatar Aug 07 '21 15:08 phi1010

  3. There is a URL https://my-opal-sever/update-datasource/makerspace_opening_hours, maybe with an individual secret token allowed to call only this URL, which the data source operator can trigger without further knowledge about OPAL etc.

This would probably also be relevant to your examples -- commercial payment services, service desks (e.g. Jira), GitHub and other services may offer some possibility to configure callback URLs, but what is sent to those URLs is typically determined by the service triggering the callback. When writing fetchers as adapters to get data from such services via their REST API, there might be an interest in passing the additional GET/POST parameters sent to the URL on to the fetcher, so that the fetcher can decide whether reloading data is actually useful.

I currently cannot think of a way to get most of these external services -- potential sources for custom data providers -- to call a webhook with information presented in a way that matches the format expected by the POST /data/config endpoint. It would probably help flexibility if there were a template for such webhooks, parsed by the correspondingly configured fetcher -- which could then extract information and provide the data that OPAL expects (e.g. parse a GitHub webhook, https://docs.github.com/en/developers/webhooks-and-events/webhooks/webhook-events-and-payloads, and set "user xyz created an issue in project abc" as the reason for the update).
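
To make this concrete, a sketch of what such fetcher-side parsing could look like for GitHub's issues webhook (payload fields per the GitHub docs linked above; how the result would be handed to OPAL remains the open question):

```python
# Sketch: turning a GitHub "issues" webhook payload into an OPAL update reason.
# Payload fields follow GitHub's webhook documentation.
def github_webhook_to_reason(payload: dict) -> str:
    user = payload["sender"]["login"]          # e.g. "xyz"
    repo = payload["repository"]["full_name"]  # e.g. "org/abc"
    action = payload.get("action", "changed")  # e.g. "opened"
    return f"user {user} {action} an issue in project {repo}"
```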

With this, OPAL+OPA could be useful in many cases where the community provides fetchers (for LDAP, GitHub, etc.) -- I think it would be worth avoiding the necessity of developing and operating a fetcher-specific wrapper microservice responsible for parsing webhooks and passing them to OPAL in a format that OPAL can understand and forward to the fetcher. This functionality would be coupled so strongly to the fetcher itself that the logic for understanding the webhooks of the corresponding data source could ideally be part of the fetcher itself -- provided there is a way to pass such data to the fetcher.

phi1010 avatar Aug 07 '21 16:08 phi1010

Hey @phi1010, you raise a lot of interesting points. Let me take a closer look tomorrow and come back to you :)

asafc avatar Aug 07 '21 16:08 asafc

Hi @phi1010, sorry for the delay - Friday and Saturday are weekend days for us, we try to be available within reason :)

You raise some interesting points, some of them completely separate from one another. Let me try to break it down and see how we can approach this together.

1) Allow data-source services (who trigger updates) to only affect some parts of the policy

Those community projects may want to cause data updates -- e.g. inform OPAL to reload data about who has been trained and may use which machinery -- but the community maintaining such a service should not be able to modify data that the policy uses to decide whether someone has admin permissions to the website or whether they are allowed to open a certain door.

Sounds like a valid use case; however, there is no way to achieve it with the current version of OPAL.

Current version behavior:

OPAL can currently check that a datasource token is valid (token forgery is not possible, i.e: only someone possessing a valid token can publish updates) - but you are right that all the services that trigger updates, once verified, can send any data update they wish (thus manipulating the OPA cache as they wish).

Proposed solution (for next version):

We can create a new feature in the next version, and there are two approaches we can take:

1) First Approach: allow scoping in the OPAL JWT tokens signed by OPAL server

OPAL server signs tokens using the POST /token API endpoint:

We can make OPAL server check more claims in the JWT token as part of the verification of a publish update request:

  • allowed_fetchers (list of strings): limiting the types of data fetchers that can be triggered by a token, i.e: only a certain service can bring new data from LDAP.
  • allowed_urls (list of regex): limiting the urls we can fetch data from in data source entries sent with a token.
  • allowed_topics (list of regex): limiting which data topics can be addressed with a token (i.e: which opal clients can be affected).
  • allowed_dst_paths (list of paths): limiting what paths of OPA can be modified by updates fetched with this token.
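
A minimal sketch of how verifying these claims on a publish request might look (claim names as proposed above; everything else is illustrative, not existing OPAL code):

```python
# Sketch: enforcing the proposed scoping claims on a publish-update request.
import re
import jwt  # PyJWT

def verify_scoped_update(token: str, public_key: str, entries: list) -> bool:
    claims = jwt.decode(token, public_key, algorithms=["RS256"])
    for entry in entries:  # each entry: fetcher, url, topics, dst_path
        if entry["config"]["fetcher"] not in claims.get("allowed_fetchers", []):
            return False
        if not any(re.fullmatch(p, entry["url"])
                   for p in claims.get("allowed_urls", [])):
            return False
        if not all(any(re.fullmatch(p, topic)
                       for p in claims.get("allowed_topics", []))
                   for topic in entry["topics"]):
            return False
        if entry["dst_path"] not in claims.get("allowed_dst_paths", []):
            return False
    return True
```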

Please be aware - in OPAL's security model - you alone, as the OPAL service owner, should know the "master token", which is required to sign new "normal tokens" that are used to verify requests to OPAL. To summarize:

  1. You alone as OPAL service owner know the "master token".
  2. For each service that you allow to send updates to OPAL, you use your secret "master token" to generate a scoped "datasource token".
  3. The service which triggers updates only knows the scoped "datasource token".

This is not difficult to implement, and I think this is the correct approach. Please let me know if you want us to implement this, or maybe implement it yourself.

2) Second Approach: using separate OPA tokens for each updater, and relying on OPA's authorization policy

OPAL can run OPA with an authorization policy deciding which API calls to OPA (even from OPAL itself) are allowed. This is a feature built into OPA, here is an example policy that you can write.

You can pass such configuration into OPA like so:

OPAL_INLINE_OPA_CONFIG='{"authorization":"basic","authentication":"token","files":["basic-authz.rego"]}'

A full explanation of how to set up such a "meta" policy can be found here.

However, all API calls from OPAL to OPA use the same bearer token (provided in OPAL_POLICY_STORE_AUTH_TOKEN). If we pass the OPA token in the DataSourceEntry and pipe it into the API calls made by this fetcher, it can work.

I don't like this approach as much, in short because it requires either:

  • vast changes to the policy updater (which handles static policies).
  • knowing upfront all the OPA tokens and have no flexibility in runtime.

2) Adding some kind of "update listener" functionality in OPAL server

With this, OPAL+OPA could be useful in many cases where the community provides fetchers for (ldap, github, etc.) -- I think it would be worth to avoid the necessity of developing and operating a fetcher-specific wrapper microservice responsible for parsing webhooks and passing them to OPAL in a format that OPAL can understand and forward to the fetcher. This functionality would be coupled so strongly to the fetcher itself, that functionality of understanding the webhooks of the according data source ideally could be part of the fetcher itself -- when there is a way to pass such data to the fetcher.

Even if we add some kind of webhook functionality to OPAL server, it will be different for each kind of service: payments, LDAP or something else. Because each service has its own format of webhooks, the only way to implement this is by allowing extensibility to the OPAL server itself (i.e: similar to the data fetch providers in the client, "update listener" providers in the server).

The code of such providers will be no different than the code you are required to include in your own microservice that listens to webhooks. The only difference will be in the way the code is loaded:

  • either as a plugin of OPAL
  • or as a separate service (as is required now)

I think the only advantage of such plugins is the community aspect, for example: somebody already published a webhook listener for LDAP, so I don't have to write one.

But we can achieve the same thing by publishing such code under the authorizon org, maybe in some "contrib" project? So no inherent advantage for a "provider" model.

WDYT?

asafc avatar Aug 08 '21 14:08 asafc

  2) A separate service would require additional administration effort at no advantage. The extensibility of the server is probably not necessary if the server just parses the relevant data (topics, JWTs) from the URL and passes the remaining parameters of a POST unchanged to the subscribed clients, where the fetchers could parse these triggers.

1) This will take me some weekends to form an opinion on; intuitively, I would have aimed at something similar to 1.1), since it avoids manual interaction across different layers of the system; but I can't currently compare it to 1.2), since I haven't yet worked with OPA authentication policies (and don't know whether they can be managed dynamically, and if this would have any benefit).

phi1010 avatar Aug 09 '21 00:08 phi1010

Hey @phi1010,

Regarding (1), it is very easy to implement JWT scoping, and it is also the correct approach in my opinion. Let us know when it becomes a blocker for you.

Regarding (2), can you elaborate more how you think it would work?

if the server just parses the relevant data (topics, jwts) from the url

  • Each 3rd party service will have a different payload inside the webhook request, it will be impossible to parse the topics in a generic way.
  • Not all 3rd party services use JWTs to authenticate webhooks; some (like GitHub) use opaque tokens.
  • Validating that a 3rd party has the permissions to push an update message is not generic. Each such validation will be specific to a 3rd party service (different token structure, different JWT claims).

Also, I am not sure why the administration effort bothers you that much - it's not even necessarily a new service. One or more of your existing backend services probably already manages the authorization state (i.e: manages a permission table in a db, manages the connection with LDAP, etc). All you need is to add a webhook route (or a db notification or callback, etc) and trigger an update with OPAL. You don't necessarily have to deploy a new service just for that.

asafc avatar Aug 09 '21 07:08 asafc

2 / Topics and JWT Authentication

All necessary data for webhook calls could be encoded in the URL for GET and POST requests -- e.g. https://my-opal-sever/update-datasource/an-datasource-id/an-opal-jwt/topic1,topic2,topic3 -- comma delimiting is already used in the client-side environment variable, so a topic name cannot contain a comma anyway. This would likely be compatible with most other services and is probably the easiest way to integrate this into other applications, often not even requiring any coding.
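
A hypothetical sketch of such an endpoint (the OPAL server is built on FastAPI; the route, the config registry and the stubbed helpers are all illustrative):

```python
# Hypothetical sketch: trigger a re-fetch of a pre-configured data source,
# with topics taken from the URL path. Not part of OPAL today; the token
# check and the publish step are stubbed out.
from fastapi import FastAPI, HTTPException

app = FastAPI()

# assumed: admin-reviewed source configs, loaded once at startup
DATA_SOURCES = {
    "makerspace_opening_hours": {
        "url": "ldap://ldap.example.com",
        "config": {"fetcher": "LdapFetchProvider"},
        "dst_path": "/opening_hours",
    },
}

@app.post("/update-datasource/{source_id}/{token}/{topics}")
async def update_datasource(source_id: str, token: str, topics: str):
    if not is_valid_scoped_token(token, source_id):
        raise HTTPException(status_code=401, detail="invalid token")
    entry = DATA_SOURCES.get(source_id)
    if entry is None:
        raise HTTPException(status_code=404, detail="unknown data source")
    topic_list = topics.split(",")  # comma-delimited, as in the env variable
    await publish_update({**entry, "topics": topic_list})
    return {"status": "update queued"}

def is_valid_scoped_token(token: str, source_id: str) -> bool:
    return True  # stub: check a per-source secret or a scoped JWT here

async def publish_update(entry: dict) -> None:
    pass  # stub: publish the entry to the subscribed OPAL clients
```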

2 / Proprietary Webhook Authentication

If proprietary webhook authentication is to be supported, the validation of identity could be done by the fetcher -- a GitHub fetcher could include code for GitHub-specific authentication, for example. This, however, is just an idea; my current LDAP fetcher probably would not make use of this feature at all, since there is no generic LDAP-specific webhook.
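
For GitHub specifically, webhook deliveries are signed with an HMAC over the request body; a verification sketch that such a fetcher could include (header name per GitHub's documentation):

```python
# Sketch: verifying GitHub's webhook signature (X-Hub-Signature-256 header).
import hashlib
import hmac

def verify_github_signature(secret: bytes, body: bytes, header: str) -> bool:
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header)
```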

Considering administration effort:

Currently, my web service should not have any authorization layer of its own -- in the next step, it will likely receive authentication via OIDC, and to keep it as lightweight as possible, my intent was not to add any application-specific LDAP code and instead query OPA for every authorization decision to be made.

By now, the project is public, see https://github.com/Makerspace-Door/door_commander, and the LDAP fetcher is too, see https://github.com/phi1010/opal-fetcher-ldap -- currently, some django-permissions and hardcoded permission checks in Python are included, which, however, do not yet allow object-specific permission management. Ideally, I'd like to get rid of all that hardcoded and badly maintainable code and replace it with a simple invocation of OPA.

As of now, the service has no knowledge of any LDAP data; and ideally, the publicly reachable part will not require any data or access to further authorization services apart from OPA; all LDAP lookups are currently implemented in OPA itself, to allow querying this data from within policies ( https://github.com/Makerspace-Door/door_commander/blob/main/opa/policy/ldap.rego )

I liked the idea of the concept diagram with only few connections: https://i.ibb.co/CvmX8rR/simplified-diagram-highlight.png

I could now implement some code to receive webhooks and let them trigger OPAL, but this would require me to re-implement it in every future project, and it would add a network dependency, raising the probability and complexity of potential bugs -- in every future project, not just once.

phi1010 avatar Aug 10 '21 19:08 phi1010

Regarding the scoping of JWT tokens (1.1), an opinion from a security standpoint, even though it will likely be irrelevant to the current project for the next few years:

I don't think encoding authorization in a JWT token is ideal, because any restriction of authorization of those external services would require revoking the whole JWT token. If revocation (i.e. stateful management of tokens) is not implemented, I'd have to change the master token, and thus all other OPAL client and REST API client tokens too, which would risk breaking larger-scale setups, and might be a dealbreaker for some use cases.

I'd probably rather separate authorization from JWT authentication, and statefully change a configuration in my dockerfile or via the OPAL API, where I assign different tokens (based on their sub or jti claim) different permissions, or remove them completely.

(In principle, this stateful configuration could be part of an OPA policy itself -- although I'm not sure if I would want to be the one to fix any bootstrapping issues if this should ever break -- maybe such a circular dependency is best to be avoided.)

phi1010 avatar Aug 10 '21 19:08 phi1010

Regarding both 1.1 and 1.2: As I understand it, both of these try to decide:

Which REST API Client is allowed to change which data?

This might solve many cases, but I still think it falls short for some scenarios.

I have some assumptions -- if one of these do not match your ideas, please tell me:

A1) Fetchers fetch data from data sources that are eventually consistent, fetching data is idempotent, and there is no disadvantage in fetching data / updating the client one time too often ... from the correct source (EDIT) (apart from potential performance issues and denial-of-service attacks caused by requesting too many updates).

This means that, in principle, notifications about data having been changed could almost work without any authentication at all. For security, some rate limiting might even suffice -- if a tradeoff between performance and up-to-dateness is acceptable. Revocable authentication/authorization would be better, but is not necessary to fulfil the security requirements of authorization itself. (Authentication would be necessary if the "reason" string has to come from an accountable third party due to audit logging requirements; otherwise, anyone could just specify fake reasons.)
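
A trivial per-source rate limit would already bound the damage of spammed triggers in this model; a sketch (the interval is illustrative):

```python
# Sketch: a per-source rate limit for (nearly) unauthenticated
# "please re-fetch" notifications. The interval is illustrative.
import time
from typing import Dict

class RefetchLimiter:
    def __init__(self, min_interval_seconds: float = 10.0):
        self.min_interval = min_interval_seconds
        self._last: Dict[str, float] = {}

    def allow(self, source_id: str) -> bool:
        now = time.monotonic()
        if now - self._last.get(source_id, float("-inf")) < self.min_interval:
            return False  # too soon -- drop or coalesce this trigger
        self._last[source_id] = now
        return True
```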

A2) When I restart OPAL/OPA, I want data to be available as soon as possible, as consistent as possible, as up-to-date as possible. I don't want data to differ depending on whether OPAL just has been restarted.

Thus, on startup, the same data source should always be queried in the same way as on further updates. I can't imagine a case where I would want the startup configuration of my fetcher (e.g. from the Docker variables) and the runtime fetcher configuration (from a REST API call) to differ. Any difference would probably be unintended and might cause inconsistent data, unexpected failures and sporadic security vulnerabilities.

Due to this, I think the more interesting question is:

Which configuration (Fetcher, Data Source URL, Credentials, further configuration) is allowed/configured to write to which OPA document? (regardless of who triggers the update)

This question -- from my point of view -- is already answered by the configuration provided by the admin within the environment variable, and probably does not need any option to be changed. It would just require a way to trigger OPAL to reuse an already known configuration, which is probably easier than implementing authentication passthrough to OPA or JWT token checking that is complex to configure.


Technical feedback on 1.1, if still of interest:

This should rather be a list of 4-tuples, not a tuple of four lists. If some service is allowed to trigger updates for different SQL, REST and LDAP data sources, it should still not be allowed to invoke an LDAP fetcher with a REST URL and write the result to the SQL data's target location. (And if anyone requires a list, they can just specify alternatives in their regex.)

Within those four lists, the enforcement of the further configuration (e.g. to prevent triggering services from changing the LDAP search query) would still be missing.
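
To illustrate, a token's scope might then look like a list of scope objects rather than four independent lists (all names and values hypothetical):

```python
# Sketch: scoping as a list of (fetcher, url, topics, dst_path) tuples
# instead of four independent lists. All names and values are hypothetical.
ALLOWED_UPDATES = [
    {
        "fetcher": "LdapFetchProvider",
        "url": r"ldaps://ldap\.mynetwork\.com(/.*)?",
        "topics": [r"ldap_.*"],
        "dst_path": "/ldap",
    },
    {
        "fetcher": "PostgresFetchProvider",
        "url": r"postgresql://db\.mynetwork\.com/payments",
        "topics": [r"payment_.*"],
        "dst_path": "/payments",
    },
]
```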

phi1010 avatar Aug 10 '21 20:08 phi1010

Hey @phi1010, if I understand you correctly, you propose a wide change to the way OPAL works.

Currently - data updates are completely driven by the OPAL server (url, credentials, etc). You suggest a static configuration in the OPAL client, specifying all data sources (url, credentials, etc) upfront, and then simply sending update notifications to re-fetch a pre-known source.

You say you don't see the use-case where update sources become different, but what if the policy can be changed on the fly through a control-plane in the cloud, and the new policy needs new data sources? This is exactly what we do for our SaaS service (authorizon.com), and I can tell you that this architecture is very beneficial for us.

Since your use case is a bit complicated, I suggest we jump on a Zoom call and discuss your needs so we can understand them better :)

feel free to book here

asafc avatar Aug 12 '21 10:08 asafc

Hi @phi1010 :)

I am feeling like there are no concrete action items until we have a proper discussion :) Please feel free to book one :)

asafc avatar Aug 22 '21 10:08 asafc