
Unable to manipulate entries defined in other zomes local to the DNA

Open pospi opened this issue 3 years ago • 15 comments

This is a limitation of cross-zome entry management in HC-RSM that prevents a modular architecture which was achievable in HC-redux. Since this conversation has been going in circles for some years now, it may be best to point to an older thread. Briefly, assertions made in this forum post turned out to be false.

Typical objections:

  • Wouldn't you want to call into the foreign zome?
    • No. API calls through the zome API boundary are also callable directly by the host (human) user. There are some possibilities for semaphore-like locks between cells and higher-order cross-cell transactionality that may be best secured by coordinated actions between pluggable zomes which the user cannot intervene in. In other words, there are advanced use cases where the user being able to trigger this call themselves could lead to inconsistent database states or stalled replication.
    • See also Scenario A, below.
  • Would namespacing entry type IDs fix this?
    • Partially, but not entirely. There is still a need to share storage between zomes when aiming for modular and pluggable systems, which I hope the two scenarios below make a case for.

Scenario A

Composable polymorphism for zome functionality. As a benchmark, the architectural model of mixin zomes for entry timestamping should be achievable without coding know-how (at the DNA bundling level).

Basically, we should want to allow developers to define common data structures which can behave in different ways depending on the behaviour plugged in.

Scenario B

In hREA I have implemented EconomicEvent and EconomicResource as separate zomes within the same DNA, in order that they can be pluggable (not everybody will need the inventorying functionality of EconomicResource, and there are some future plans for alternative storage logic for it as well). The catch is that the EconomicEvent zome is responsible for creating EconomicResource records too.

This used to work when entrydefs were non-namespaced and DNA-wide. Now they only seem to match defs in the current zome. The entrydef errors in my tests go away if I duplicate the EconomicResource entrydef into the EconomicEvent zome, but no records are returned from those endpoints, because they're being created as economic_event.EconomicResource entries, not economic_resource.EconomicResource.

Suggested solution

  • Implement some tuple structure to globally-uniquely identify entrydef IDs within a cell (not zome); call it struct CellEntryDefId(pub String, pub EntryDefId).
    • (an alternative here may be to change EntryDefId::App(String) to EntryDefId::App(String, String))
    • In either case, the important detail is that we add a new identifier string which will be the zome name within the DNA manifest.
  • Implement From<String> or whatever conversion traits are needed for this struct to fill the zome name string into the first part of the identifier; so that existing EntryDefIds can continue to be used without significant code changes.
  • Allow passing a manually constructed CellEntryDefId through to the low-level HDK methods such that the fully qualified entrydef can be specified by application code. (Or if taking the 'alternative' approach this will just happen naturally in changes to EntryDefId::App.)
    • Note that the standard pattern would be to place the zome name of the foreign zome in the DNA configuration as a zome property and then read this value in order to assign the zome-addressing portion of the identifier.
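
As a rough illustration of the tuple identifier suggested above, here is a minimal sketch. Everything in it is hypothetical: `EntryDefId` is a local stand-in rather than the real HDK type, and the `local` / `foreign` constructors are names invented for this example. Because a bare `From<String>` has no way of knowing which zome is executing, the sketch takes the current zome name as an explicit argument.

```rust
// Hypothetical sketch: none of these types are the real HDK definitions.

/// Stand-in for the HDK's entry def identifier (only the `App` variant is modelled).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum EntryDefId {
    App(String),
}

/// Entry def identifier that is unique within a cell: (zome name, entry def id).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct CellEntryDefId(pub String, pub EntryDefId);

impl CellEntryDefId {
    /// Qualify a plain id with the name of the zome that is currently executing,
    /// so code that only knows an `EntryDefId` keeps working.
    pub fn local(current_zome: &str, id: EntryDefId) -> Self {
        CellEntryDefId(current_zome.to_string(), id)
    }

    /// Address an entry def owned by another zome in the same DNA.
    /// `foreign_zome` would typically be read from a zome property.
    pub fn foreign(foreign_zome: &str, id: &str) -> Self {
        CellEntryDefId(foreign_zome.to_string(), EntryDefId::App(id.to_string()))
    }
}

impl std::fmt::Display for CellEntryDefId {
    /// Renders as `zome_name.EntryDefName`, the notation used in Scenario B.
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let EntryDefId::App(id) = &self.1;
        write!(f, "{}.{}", self.0, id)
    }
}
```

With this shape, the EconomicEvent zome could build `CellEntryDefId::foreign("economic_resource", "EconomicResource")` and get an identifier distinct from its own `economic_event.EconomicResource`.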

Addendum: debugging entrydefs

There is also a minor debugging issue that it would be good to improve upon in the course of resolving this limitation. The error message for entrydef mismatches is a bit too generic. All it tells you is "An error with entry defs" followed by (my guess) those defs registered in the current zome. If the error message could indicate which specific entrydef is the issue, and why (eg. "entry def missing: X") that would really help with troubleshooting.
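
To make the request concrete, a more specific error could look something like the sketch below. The `EntryDefMissing` type and `find_entry_def` function are invented for illustration and are not taken from Holochain's codebase; the message format follows the "entry def missing: X" wording suggested above.

```rust
use std::fmt;

/// Hypothetical error that names the failing entry def and the zome searched.
#[derive(Debug, PartialEq, Eq)]
pub struct EntryDefMissing {
    pub requested: String,
    pub zome: String,
    pub registered: Vec<String>,
}

impl fmt::Display for EntryDefMissing {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "entry def missing: {} (zome {} registers: {})",
            self.requested,
            self.zome,
            self.registered.join(", ")
        )
    }
}

/// Look up `requested` among the defs registered by `zome`, returning its
/// position or an error that says exactly which def was not found.
pub fn find_entry_def(
    zome: &str,
    registered: &[&str],
    requested: &str,
) -> Result<usize, EntryDefMissing> {
    registered
        .iter()
        .position(|id| *id == requested)
        .ok_or_else(|| EntryDefMissing {
            requested: requested.to_string(),
            zome: zome.to_string(),
            registered: registered.iter().map(|s| (*s).to_string()).collect(),
        })
}
```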

pospi avatar Apr 19 '21 05:04 pospi

Related issue which turns out to also be blocking this https://github.com/holochain/holochain/issues/563#issuecomment-843137738

Basically, we need to remove all barriers preventing zomes from accessing each other's data in order to have robustly composable application architectures.

pospi avatar May 18 '21 21:05 pospi

for composability i'd expect at least a minimum of:

  • the interfaces/boundaries must be well defined and stable/predictable AND
  • the internals must be "black box" free to achieve the "contract" of the interface however makes sense

reading/writing data freely at the level of direct access to HDK functions in other zomes doesn't meet that because then:

  • zomes cannot expect their own abstractions to be respected downstream (unable to define an interface)
  • downstream zomes need to be aware of and respect abstractions (not black box)

one thing that comes to mind is that a zome (currently) cannot express how to walk a CRUD tree outside of logic in externs, so any zome "dialing in" without going through a defined extern will need to implement consistent tree walking logic, which seems to me to make composability more difficult not easier

perhaps we could plan to make such a thing expressible outside zome call externs, e.g. a "crud_walk" callback or similar, and we could step through every HDK function and carefully define how to express abstractions in the case of direct calls - but this work/thinking hasn't been done afaik

if the main objection to call is the visibility of externs then my preference would be to see if that concern can be addressed directly - e.g. we could probably implement some way to hide functions from the external facing interface to prevent GUI developers from accidentally breaking the local state of honest nodes
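
a rough sketch of what "hiding functions from the external facing interface" could mean, purely as an illustration: the `Visibility`, `Caller` and `Externs` names below are made up and nothing like this exists in the conductor as far as this thread establishes. it only guards against accidental calls from a client, it is not a security boundary.

```rust
use std::collections::HashMap;

/// Hypothetical: who is allowed to invoke an extern.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Visibility {
    /// Callable by clients (e.g. a GUI) and by sibling zomes.
    External,
    /// Callable only by other zomes in the same DNA.
    SameDnaOnly,
}

/// Hypothetical: where a call originates.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Caller {
    Client,
    SiblingZome,
}

/// Registry of extern names and their visibility.
pub struct Externs {
    visibility: HashMap<String, Visibility>,
}

impl Externs {
    pub fn new() -> Self {
        Externs { visibility: HashMap::new() }
    }

    pub fn register(&mut self, name: &str, visibility: Visibility) {
        self.visibility.insert(name.to_string(), visibility);
    }

    /// Unknown externs are never callable; hidden ones reject client calls.
    pub fn may_call(&self, name: &str, caller: Caller) -> bool {
        match (self.visibility.get(name), caller) {
            (None, _) => false,
            (Some(Visibility::External), _) => true,
            (Some(Visibility::SameDnaOnly), Caller::SiblingZome) => true,
            (Some(Visibility::SameDnaOnly), Caller::Client) => false,
        }
    }
}
```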

the way this issue is phrased reads to me as though the use of call is a security concern but it cannot be as wasm is driven by the conductor host, which is open source software running on the users' own machine, so the user can always intervene in any inter-wasm communications - anyone who can jailbreak an iphone could work around any call level security in a weekend (for their local node that is, it wouldn't impact remote honest nodes)

thedavidmeister avatar May 20 '21 12:05 thedavidmeister

"the way this issue is phrased reads to me as though the use of call is a security concern [..] anyone who can jailbreak an iphone could work around any call level security in a weekend (for their local node that is, it wouldn't impact remote honest nodes)"

Yeah, I should be more clear about the failure states I'm describing. This is really more of an "honest, but untrained" user error condition where I wonder about the repercussions of just having the possibility of accidentally triggering an internal action from the outside. We've no guidance or expectations on the kinds of tooling that will be made available for interacting with Holochain nodes, nor do I think we should have. But I can certainly see worlds where people are given a list of API calls they can make into a node and buttons to poke at them. I would rather have an ability to keep private interfaces private. If hiding things from the GUI interface can accomplish that then it's better but it's still pushing a lot of plumbing code onto application developers.

You raise good points on composability though. There is a tension there between purity and polymorphism; and me implementing the latter in the way I had done in hc-redux certainly leads to an unpredictable API contract based on side-effects. So, maybe having a way to expose method calls to other zomes but only from inside the WASM boundary is indeed the best way; it pushes the logic for dealing with the foreign zome state change into the foreign zome where it should be. More plumbing, but plumbing that creates functional purity.

pospi avatar May 21 '21 01:05 pospi

I think there might also be some advantages here I'm not able to quite articulate yet, in terms of how that enforced zome purity ripples outward into tools like The Compository and the kinds of guarantees it gives to zome and DNA developers...

pospi avatar May 21 '21 01:05 pospi

@pospi sure, from the conversation i had with @guillemcordoba it seems like both of your use cases would be addressed by being able to define something like custom callbacks

core callbacks:

  • should not be visible externally (if they are that's probably a bug)
  • sit at a known location (string function name) with known signature
  • support the ability to iterate over all zomes and reduce many results to a single unified result
  • do not expose any internal logic to core or require core to know anything about their internals beyond the I/O agreement
  • have a permissions system that strips out access to certain functions within wasm when called
  • are inspected by core to check for whether they exist without creating and invoking a full wasm instance (just the compiled module)

if you could do all that and create conventions for your own callbacks there'd be no danger about exposing things to the GUI and you'd be able to compose zomes by their interface
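
to illustrate the "iterate over all zomes and reduce many results to a single unified result" point from the list above, here is a minimal sketch. `CallbackResult` and `reduce_callbacks` are hypothetical names, and "first failure wins" is just one possible reduction rule chosen for the example.

```rust
/// Hypothetical result of running one callback in one zome.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum CallbackResult {
    Pass,
    Fail(String),
}

/// Run the same callback for every zome and reduce to one result:
/// the first failure wins, otherwise the whole set passes.
pub fn reduce_callbacks<F>(zomes: &[&str], mut callback: F) -> CallbackResult
where
    F: FnMut(&str) -> CallbackResult,
{
    for &zome in zomes {
        if let CallbackResult::Fail(reason) = callback(zome) {
            // Prefix the reason with the zome name so the caller can tell
            // which zome rejected the operation.
            return CallbackResult::Fail(format!("{}: {}", zome, reason));
        }
    }
    CallbackResult::Pass
}
```

the caller only sees the unified result and the I/O agreement, never the internals of any individual zome's callback.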

we'd want to make information about custom callbacks available to tooling i think - to facilitate the "permissionless zero code end-user lego" experience

thedavidmeister avatar May 24 '21 19:05 thedavidmeister

Sounds great. I think what @guillemcordoba and I are both seeking is a robust API contract for DNA-level application authors (ie. people editing *.yaml files, not people editing *.rs files). Something that is compile-time enforceable and surfaces internal bindings between zomes in the same DNA; in a way that is tightly bound by the rules zome developers wish to impose on DNA developers.

The rest of what you're saying above (known location, iteration/reduction, encapsulation, permissions, introspection) sounds like implementation details, which are fine by me so long as I get the above.


Perhaps one thing to tease apart:

"support the ability to iterate over all zomes and reduce many results to a single unified result"

If what you're implying here is a binding that is 1:1 between the "core callback" name and the DNA, I don't think that is necessary. (@guillemcordoba? You?) From my perspective, "core callbacks" need only be unique to the zome; and related zomes being configured by the DNA author should have some configuration properties for targeting the other zome they're related to. The deeper inconsistency is between the use of zome labels in *.yaml files and numeric zome indexes inside the DNA runtime. If we provided a utility method hdk::get_zome_index(zome_label: String) -> u8 that simply parsed a metadata struct that gets statically filled by hc dna pack, that could be an asset in doing other kinds of cross-zome data access and lookup.

To ground this a bit: in this case, one could configure the EconomicEvent zome with properties.economic_resource_zome = "name_of_resource_zome". I can also see a case for standardising this configuration data outside of zome properties, and wrapping up the hdk::get_zome_index() resolution so that the hApp developer can more naturally address zomes by their name.
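
A minimal sketch of what that lookup might do, assuming the packed DNA carried an ordered list of zome labels. `DnaZomeMeta` is a hypothetical struct, and `get_zome_index` here is a free function taking that struct explicitly, not an existing HDK call; it returns `Option<u8>` rather than a bare `u8` so that an unknown label is not silently mapped to some index.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for metadata that `hc dna pack` could embed:
/// zome labels in the order they appear in the DNA manifest.
pub struct DnaZomeMeta {
    pub zome_labels: Vec<String>,
}

/// Resolve a manifest label to the numeric index used at runtime.
/// Returns `None` for an unknown label or an index that does not fit in a `u8`.
pub fn get_zome_index(meta: &DnaZomeMeta, zome_label: &str) -> Option<u8> {
    meta.zome_labels
        .iter()
        .position(|label| label.as_str() == zome_label)
        .and_then(|i| u8::try_from(i).ok())
}
```

The EconomicEvent zome would then pass the value of its `economic_resource_zome` property as `zome_label`.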

pospi avatar May 25 '21 04:05 pospi

@pospi i see the things i'm saying as implementation constraints rather than details

saying "an abstraction must not be leaky" is not an implementation detail at all, it's a high level requirement that we can measure "good" and "bad" solutions against

i can see how a set of callbacks could make its way to a yaml file if we had some idea of interfaces at that level

you'd need a way to represent the serialized inputs/outputs, i'm thinking something along the lines of the ABI that accompanies solidity code on the EVM

"support the ability to iterate over all zomes and reduce many results to a single unified result"

currently core only has the option to call "this callback in all zomes" or "this callback in this zome" but it would be pretty easy to have a Specific(Vec<Zome>) variant on that enum to support calling just a few specified zomes
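
roughly, and with made-up names (`Zome` and `ZomesToCall` below are local stand-ins, not the actual core types), the extra variant could look like this:

```rust
/// Hypothetical stand-in for a zome handle.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Zome(pub String);

/// Hypothetical stand-in for the enum that selects callback targets.
pub enum ZomesToCall {
    /// "this callback in all zomes"
    All,
    /// "this callback in this zome"
    One(Zome),
    /// The proposed addition: just a few specified zomes.
    Specific(Vec<Zome>),
}

/// Resolve a selection against the zomes in the DNA, preserving DNA order
/// and ignoring any requested zome that the DNA does not contain.
pub fn resolve_targets(all: &[Zome], which: &ZomesToCall) -> Vec<Zome> {
    match which {
        ZomesToCall::All => all.to_vec(),
        ZomesToCall::One(z) => all.iter().filter(|c| *c == z).cloned().collect(),
        ZomesToCall::Specific(zs) => all.iter().filter(|c| zs.contains(*c)).cloned().collect(),
    }
}
```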

thedavidmeister avatar Jul 20 '21 07:07 thedavidmeister

Yeah, and these constraints may all change if the "accessors" primitive finally gets implemented. If that's available to us, maybe all we need to do is attach new accessors as needed to the particular zome we want to query in that DNA. There's a world of possibilities that opens up then :)

guillemcordoba avatar Jul 20 '21 08:07 guillemcordoba

for reference here is the ABI for the famous erc20 standard on ethereum - https://gist.github.com/veox/8800debbf56e24718f9f483e1e40c35c

an example of one function looks like:

    {
        "constant": false,
        "inputs": [
            {
                "name": "_spender",
                "type": "address"
            },
            {
                "name": "_value",
                "type": "uint256"
            }
        ],
        "name": "approve",
        "outputs": [
            {
                "name": "",
                "type": "bool"
            }
        ],
        "payable": false,
        "stateMutability": "nonpayable",
        "type": "function"
    },

Similarly, a loooong time ago i investigated FaaSlang, now called FunctionScript, which might be useful here - https://github.com/acode/FunctionScript

It defines a way to spit out an ABI-like json definition based on comments in the code itself
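
for a sense of scale, an ABI-like description of a zome function could be as small as the sketch below. the `Param` and `FnAbi` types are hypothetical and the JSON shape is only loosely modelled on the erc20 example above; the hand-rolled serialisation assumes names and types contain no characters that need JSON escaping.

```rust
/// Hypothetical description of one named, typed input or output.
pub struct Param {
    pub name: String,
    pub ty: String,
}

/// Hypothetical ABI entry for a single zome function.
pub struct FnAbi {
    pub name: String,
    pub inputs: Vec<Param>,
    pub outputs: Vec<Param>,
}

/// Render a parameter list as a JSON array of {"name", "type"} objects.
fn params_json(params: &[Param]) -> String {
    let items: Vec<String> = params
        .iter()
        .map(|p| format!(r#"{{"name":"{}","type":"{}"}}"#, p.name, p.ty))
        .collect();
    format!("[{}]", items.join(","))
}

impl FnAbi {
    /// Render the entry as compact JSON (no escaping is performed).
    pub fn to_json(&self) -> String {
        format!(
            r#"{{"name":"{}","inputs":{},"outputs":{}}}"#,
            self.name,
            params_json(&self.inputs),
            params_json(&self.outputs)
        )
    }
}
```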

thedavidmeister avatar Jul 20 '21 12:07 thedavidmeister

Oh, I was also investigating things like rust-reflection for low-code things, and it looked promising after some initial tests... We could use it as a first step towards an ABI maybe?

guillemcordoba avatar Jul 20 '21 12:07 guillemcordoba

@guillemcordoba hmm, i'm a bit concerned about how proc macros stack in rust, they tend to swallow debug info completely :(

have you tested integrating it with the existing HDK macros to make sure that's not an issue?

thedavidmeister avatar Jul 20 '21 12:07 thedavidmeister

Nope I haven't, they were early explorations... And I have little experience with Rust macros.

guillemcordoba avatar Jul 20 '21 13:07 guillemcordoba

@guillemcordoba if we could do it without too much macrology i think that would be ideal, no idea how easy/difficult it is to do the comment parsing strategy either...

thedavidmeister avatar Jul 20 '21 13:07 thedavidmeister

Where did we end up with this? Are there plans for a macro, or perhaps a parameter to #[hdk_extern], that can be used to indicate that a callback is only callable by other zomes in the same DNA? Something servicing https://github.com/holochain/holochain/issues/743#issuecomment-847288059 sounds ideal, provided the external function name is still scoped to the zome's ID.

pospi avatar Aug 02 '21 04:08 pospi

@pospi mmm not sure if there are concrete plans but i haven't seen anyone come out and clearly say we shouldn't do it...

seems like it's just waiting for someone to pick it up and work on it

thedavidmeister avatar Aug 11 '21 02:08 thedavidmeister

I wonder whether what I was reaching for here is now facilitated by way of integrity zomes and standard Rust crate dependencies.

pospi avatar Feb 22 '23 01:02 pospi

@pospi let me know how you go with it, there's also the ability to call other cells that i don't think we had before

thedavidmeister avatar Feb 28 '23 15:02 thedavidmeister