Topic confusion with PSS and Feeds APIs

Open agazso opened this issue 3 years ago • 6 comments

Summary

Currently two APIs (PSS and Feeds) use the concept of topics, but they define it differently, which may be confusing. The current Feed implementation also differs slightly from the definition in the Book of Swarm. It would be good to create some clarity around this and maybe unify the types, or at least make the naming consistent.

The PSS API expects the topic to be specified as a string. The implementation then hashes it using keccak256 and stores it as 32 bytes in a trojan message (see BoS*, page 123, figure 51). Internally this hash is called the topic, has its own Topic type, and is used in the pss module. The BoS refers to it as the obfuscated topic id.

The Feed API expects the topic to be specified as a hex string, i.e. hex-encoded binary data. The implementation stores the topic as an arbitrary-length byte slice and eventually hashes it (using keccak256) together with the index to get the feed id.

In the book, however, the feed topic is defined as an arbitrary 32-byte array (page 109, section 4.3.1), which differs from the implementation. In addition, figure 44 on page 110 defines the topic as 20 bytes, which adds to the confusion.

I think it would be beneficial for users of these APIs if the two topics had the same type. In both cases the topic parameter is used as input to a keccak256 hash function, so the question arises: why is one a string and the other a hex-encoded string? Could both be arbitrary-length strings, or both hex strings?

What do you think? @zelig @acud @janos

*Book of Swarm: https://docs.ethswarm.org/the-book-of-swarm-viktor-tron-v1.0-pre-release7.pdf

agazso avatar Jun 09 '21 16:06 agazso

I agree, let's unify and standardise on 32 bytes, hex-encoded? @agazso? I prefer passing hashed topics directly via the API, as they might be hashes of private info, or the preimage might not even be known at the time of publishing/search.

zelig avatar Jun 10 '21 07:06 zelig

> I agree, let's unify and standardise on 32 bytes, hex-encoded? @agazso? I prefer passing hashed topics directly via the API, as they might be hashes of private info, or the preimage might not even be known at the time of publishing/search.

I like 32 bytes hex-encoded for the exact reason you mention. The only disadvantage I see is that it makes using the raw API (e.g. from curl) slightly less comfortable, but we can mitigate this with better tooling anyhow. In fact, swarm-cli already supports optionally hashing the topic when it is provided as a string.
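A rough sketch of what that tooling-side normalisation could look like, in Python. This is not the swarm-cli implementation: `to_topic`, its `is_text` flag, and `placeholder_hash` are hypothetical names. The hash function is injected, and the demo uses `hashlib.sha3_256` purely as a stand-in; it is NOT keccak256 and would have to be swapped for a real keccak256 to produce Swarm-compatible topics.

```python
import hashlib
from typing import Callable

TOPIC_SIZE = 32  # the unified topic size proposed in this thread


def to_topic(value: str, is_text: bool, hash_fn: Callable[[bytes], bytes]) -> bytes:
    """Return a 32-byte topic.

    is_text=True:  hash the UTF-8 string (the convenience path for tooling).
    is_text=False: treat value as hex and require exactly 32 bytes (the raw API path).
    """
    if is_text:
        topic = hash_fn(value.encode("utf-8"))
    else:
        hex_value = value[2:] if value.startswith("0x") else value
        topic = bytes.fromhex(hex_value)  # raises ValueError on invalid hex
    if len(topic) != TOPIC_SIZE:
        raise ValueError(f"topic must be {TOPIC_SIZE} bytes, got {len(topic)}")
    return topic


def placeholder_hash(data: bytes) -> bytes:
    # STAND-IN ONLY: SHA3-256 is not keccak256 (the padding differs).
    return hashlib.sha3_256(data).digest()


# Raw path: a pre-hashed topic passes through unchanged, so the preimage
# never has to be revealed to (or even known by) the caller.
raw = to_topic("ab" * 32, is_text=False, hash_fn=placeholder_hash)

# Convenience path: the tool hashes a readable name on the user's behalf.
named = to_topic("my-chat-room", is_text=True, hash_fn=placeholder_hash)

print(raw.hex(), named.hex())
```

Using an explicit flag instead of guessing from the input avoids the ambiguous case of a text topic that happens to consist of 64 hex characters.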

agazso avatar Jun 10 '21 09:06 agazso

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Nov 15 '21 01:11 github-actions[bot]
