openssl icon indicating copy to clipboard operation
openssl copied to clipboard

QUIC: convert implementations to use the list data type

Open paulidale opened this issue 1 year ago • 39 comments

This is instead of re-implementing a linked list in four or five different places.

  • [ ] documentation is added or updated
  • [ ] tests are added or updated

paulidale avatar Oct 11 '22 07:10 paulidale

There are two single linked lists the ACKM which should be replaced once a single linked list is implemented generically.

paulidale avatar Oct 11 '22 10:10 paulidale

Ping for reviews?

paulidale avatar Oct 13 '22 03:10 paulidale

Feedback addressed.

paulidale avatar Oct 13 '22 23:10 paulidale

The insurmountable contention here is over making the list's internal structures open so that code can directly access the fields versus having wrapper functions that keeps them opaque. The structures are defined openly so they can be included in other structures avoiding memory allocations, however I've named them obtusely (at least for the moment) to discourage this.

The two use cases look like:

    TYPE *p, *pnext;

    for (p = ossl_list_type_head(l); p != NULL; p = pnext) {
        pnext = ossl_list_type_next(p);
        ...
    }

versus:

    TYPE *p, *pnext;

    for (p = l->head; p != NULL; p = pnext) {
        pnext = p->type.next;
        ...
    }

This boils down to:

  • slight saving in characters typed with direct access (I don't consider this relevant, programmers don't write many lines of code per day and extra typing is never an issue plus IDEs autocomplete nicely these days);
  • no performance difference (inline functions vs direct access);
  • no abstraction with direct access, so future changes or updates aren't possible (which I do consider to be relevant).

There is no talk about allowing direct access except for the head, tail, prev and next fields (although the num_elems field would also make sense). All the remaining access/helper functions will remain unmodified.

Maintaining the current level of abstraction allows additional features to be added if we decide we want them:

  • ~in debug builds, list elements could know which list they belong to and throw errors or asserts if there is a mismatch (I think this would be quite helpful)~ EDIT: added this in a separate commit.
  • more generally, if list elements know what list they belong to they could be automatically removed when inserted into another list (I am not so keen on this)
  • likely a more we've I've thought of

I also want to avoid direct access because I've got a vision where we have different kinds of lists and access to any of them is essentially identical from the programmer's viewpoint -- the only change is the use of DEFINE_LIST and OSSL_LIST_MEMBER macros, the generated functions could even keep the same names if we wanted:

  • Double linked list as here (DEFINE_LIST_OF & OSSL_LIST_MEMBER)
  • Thread safe linked list (DEFINE_ATOMIC_LIST_OF & OSSL_ATOMIC_LIST_MEMBER)
  • Skip list (DEFINE_SKIP_LIST_OF & OSSL_SKIP_LIST_MEMBER)
  • Single linked list (DEFINE_SLINK_LIST_OF & OSSL_SLINK_LIST_MEMBER)
  • Lists without the number of element counter (DEFINE_UNCOUNTABLE_LIST_OF & OSSL_UNCOUNTABLE_LIST_MEMBER) couldn't resist this name :grin:
  • & more

Currently, some of the alternative versions could be beneficial:

  • A pair of singly linked lists already appear in the QUIC code, so it might be worthwhile adding this.
  • Some of the lists converted here never have their element count accessed and could be uncountable lists (slight space and time saving).

Looking beyond our current code base, I can see uses for more:

  • I've a vague feeling that our sorted lists might result in DoS CVEs eventually which could be addressed by a skip list.
  • I suspect that a thread safe list would be useful in various places in our code, but haven't investigated enough to be certain.

The closest analogy to our implementation that I can find is BSD's sys/queue.h. This isn't producing inline functions and has a richer set of operations (which we should look at and adopt where sensible).

We also use linked lists in all manner of other places in the code which ought to be converted in due course.

Anyway, thoughts @mattcaswell @levitte?

paulidale avatar Oct 19 '22 02:10 paulidale

This is a nice writeup.

I also want to avoid direct access because I've got a vision where we have different kinds of lists and access to any of them is essentially identical from the programmer's viewpoint

I'm skeptical that this really adds anything. For example, in sys/queue.h, you have different macros for the different types of list, LIST_INSERT_HEAD, SLIST_INSERT_HEAD, etc. In fact I would even argue it makes the code easier to follow if it's explicit what kind of list is being used. I feel like trying to create genericity here is a case of overusing the preprocessor.

Some perspective: there are innumerable smart pointer classes in C++ but you don't see people trying to abstract over pointers in C because it's beyond what's really sensible in the language. By comparison if we were talking about a hash table a strong abstraction boundary would be a no-brainer. But the reason for this is because the design space for hash tables is large and admits many possible points within it without changes to the semantics. By comparison I see a doubly linked list as essentially being a 'natural' data structure with a design space comprising a single point. A naively designed abstraction often also ends up limiting possible evolution anyway more than was foreseen. (An interesting example of this is C++'s std::unordered_map, the API of which accidentially precludes use of open addressing, making contemporary implementations suboptimal in terms of what most people actually want.)

I don't expect 'drop in' data structure representation changes in calling code to be common and if it does occur the changes from list_foo_bar to skiplist_foo_bar or so on would be uniform and trivial. It's a non-issue for much the same reason that 'extra typing is a nonissue'.

For what it's worth I find the sys/queue.h makes a lot more sense to me, aside from some oddities like LIST_PREV. What we have here is a lot more unusual because it's embedding the field name into the function names and "type" of the list.

Regarding threading: How would you propose to implement an atomic list API? We do not have atomics in OpenSSL in the general case, but always fall back on having a mutex available. Which means we either add a mutex argument to these list APIs to support threads, defeating the point of having a unified API for all underlying representations, or allocate a mutex inside each list structure, which turns linked lists from being zero-allocation to having an allocation inside (not to mention the fact that such a mutex would almost always be at the wrong level of granularity, as the caller would have a better idea of how and when to lock and at what unit of scope).

I'd also note you're currently relying on initialising list structures via OPENSSL_zalloc, so you're relying on an all-zero-bits value being a valid value, so the abstraction layer isn't quite complete as things stand.

Conciseness is not really about saving typing as about improving code readability. Some of this could be addressed by adding FOREACH macros, including DELSAFE variants (for example).

hlandau avatar Oct 21 '22 12:10 hlandau

I am with Hugo on this. I do not see us ever replacing the internal list implementation. Yes, we might want to replace the underlying data structure used to implement some of the QUIC modules from linked lists (double or single) to something else, but in that case it will not be a list anymore and replacing an ossl_list_ call in some module or direct dereference of prev or next is practically the same work.

t8m avatar Oct 21 '22 13:10 t8m

I do not see us ever replacing the internal list implementation.

The debug version I've added does exactly this???

paulidale avatar Oct 21 '22 23:10 paulidale

The implementation of threading isn't critical -- it's an implementation detail. Locks is one option. Lock free lists can also be done (lock free data structures are pretty interesting but hard to get right).

I've added an element initialisation function to complete the abstraction. I've not used it yet because it's Saturday and I've other things I should be doing :)

paulidale avatar Oct 21 '22 23:10 paulidale

What are the compelling arguments in favour of the new suggestion? The accessor functions are the incumbent, changes from that generally require a compelling reason. So far the reasons seem to boil down to:

  • I don't like it (which is a logical fallacy and a rather underwhelming reason)
  • we'll never change it

I find the latter far less than compelling. I've outlined possible changes above (which have, frankly, been floccinaucinihilipilified). I've already added debug sanity checks which adequately refute the second point IMO -- these cannot be done without accessors.

Please, produce some meaningful reasons for why we should shift away from what is already?

paulidale avatar Oct 22 '22 01:10 paulidale

OTC: which path should be chosen?

paulidale avatar Oct 22 '22 01:10 paulidale

Reading through the code, this is an improvement and accessor functions make sense (to me) - given that there is no real disadvantage and the abstraction has a clear advantage. We shouldn't have code doing their own implementation of the same thing repeatedly. If one other person agrees then you can remove the OTC hold and let it through unless those that are objecting want to add one.

t-j-h avatar Oct 22 '22 02:10 t-j-h

and squashed down....

paulidale avatar Oct 22 '22 03:10 paulidale

My take away is that reducing code block copies is a good thing. (I've ranted quite strongly about the bad habit of code block copying before)

That alone has me agree with this sort of refactoring.

The debacle over certain accessors seems like fighting over fairly minute details. Me, I'm seeing a gain in consistency, where you'd otherwise have to constantly choose between accessors and direct field access.

levitte avatar Oct 22 '22 03:10 levitte

Regarding the OTC hold, do you want to keep it, @hlandau, or has this PR been argued enough?

(yes, I know that it wasn't @hlandau that raised it...)

levitte avatar Oct 22 '22 03:10 levitte

I've dropped the hold. Even if approved I won't merge until there is a chance to add it anew.

paulidale avatar Oct 22 '22 03:10 paulidale

@levitte, my understanding was that @hlandau was going to revisit all of the seperate implementations and change them to use list.h in due course. The debacle is the point of contention.

paulidale avatar Oct 22 '22 05:10 paulidale

Still approved

t-j-h avatar Oct 22 '22 07:10 t-j-h

Still approved (making me have to work harder at reviewing the changes ...)

t-j-h avatar Oct 24 '22 06:10 t-j-h

Oops. I switched the memsets to call the new init_elem function instead.

paulidale avatar Oct 24 '22 06:10 paulidale

ping for second review

paulidale avatar Oct 25 '22 20:10 paulidale

To be clear, I never had any intention of raising an OTC hold on this.

The implementation of threading isn't critical -- it's an implementation detail. Locks is one option. Lock free lists can also be done (lock free data structures are pretty interesting but hard to get right).

I've added an element initialisation function to complete the abstraction. I've not used it yet because it's Saturday and I've other things I should be doing :)

This doesn't seem like it completes the abstraction as things stand because it returns void and thus cannot fail. Thus you can never allocate a lock inside the representation. Implementation of threading is not an implementation detail as this demonstrates. This API cannot sensibly support it. The API design implications are significant. Claims that an API will allow threading in the future without changes don't hold water unless the feasibility of that API design for threading has been confirmed.

If you want a linked list API which supports threading you're going to need some kind of per-linked-list lock (which is insane, the wrong level of granularity, and means linked list initialisation can fail), or API changes to make use of a contextual lock. I don't see what the latter gains over the caller managing their own locks. I don't see anything desirable about the proposal that some day, existing use of linked lists may magically become thread-safe.

I would rather rule out this API ever being used for threading and define the zero-initialised value as a valid initial state.

I think debug validation demonstrates a valid and potentially useful use case for accessors. That's not the same as saying we should allow the representation to change in future, or forbid direct access where accessors cannot be used, such as where a pointer to a link pointer is needed.

I'd be OK approving this PR if

  • we say the zero-initialised value is a valid initial state; we keep initialisation as an infallible operation
  • we add FOREACH[_DELSAFE] macros to make iteration more ergonomic, something like:
LIST_FOREACH(node, list)
LIST_FOREACH_DELSAFE(node, nnext, list)
LIST_FOREACH_REVERSE(node, list)
LIST_FOREACH_REVERSE_DELSAFE(node, nprev, list)

/* example */
TXE *node, *nnext;

LIST_FOREACH_DELSAFE(node, nnext, qtx->pending) {
  /* ... */
}

I have concerns for the intentions behind this design, as have been expressed above, but those don't directly relate to this PR and can be resolved at any later time.

hlandau avatar Oct 26 '22 09:10 hlandau

The documentation talks about zeroing the structures in the init function descriptions. If a future flavour ever doesn't require this, it would need to be documented. Good point about failures: the list init ought to return an int.

I agree about element's not getting locks -- that would be crazy. This doesn't prevent a lockless implementation.

I'll look at adding loop macros at some point. I'd lean towards making them DELSAFE always rather than having two macros for each.

paulidale avatar Oct 26 '22 09:10 paulidale

Good point about failures: the list init ought to return an int.

Please don't. If we're accepting zero bytes as a valid representation of a list/node structure, which the documentation states, there's no reason to do this, and use of OPENSSL_zalloc is also valid.

I agree about element's not getting locks -- that would be crazy. This doesn't prevent a lockless implementation.

It does? Any lockless implementation would require atomics. We do not require support for atomics in OpenSSL and always have the ability to fall back to a lock.

hlandau avatar Oct 26 '22 09:10 hlandau

It's future stuff. Since this is all internal, we can change the return type later if required.

paulidale avatar Oct 26 '22 09:10 paulidale

Sure, we can keep the list init void for now. If we for some reason need a linked list init to be able to fail we can address that in the future.

hlandau avatar Oct 26 '22 09:10 hlandau

I'm still with Hugo on this and would even raise the hold if the dispute is unresolved but the PR is approved.

t8m avatar Oct 26 '22 10:10 t8m

@paulidale I think I prefer the iterator approach.. See linux kernel list.h for example.. Have you seen the xor variant (would not make things easy to debug :)).

slontis avatar Oct 27 '22 00:10 slontis

The delsafe iterators will be quite tricky & involve some ugly comma expressions. The normal ones are straightforward. Happy to add these in a separate PR to avoid confounding the question here.

There are a number of additional instances of lists in the QUIC PRs that will need to be converted & the iterators can go in with them.

paulidale avatar Oct 27 '22 00:10 paulidale

Yes, I'm aware of the XOR variant. It's not compatible with modern versions of C because pointers are not allowed with the ^ operator anymore.

paulidale avatar Oct 27 '22 01:10 paulidale

~~OTC: Do we want the simple single and double linked lists to require explicit initialization?~~ OTC: Do we want accessor functions for simple next and prev list member access?

And if OTC decides that we want accessor functions, do we want list iterators before merging this?

t8m avatar Oct 27 '22 11:10 t8m