`IbkrWsKey` should have `conid` for market data subscriptions
Problem description
Market data subscriptions require a complex key smd+conid+{...}. The IbkrWsKey enum does not account for the possibility of having a conid as a part of the channel.
The inconsistent usage of keys in the QueueController and SubscriptionController makes this issue a bit harder to mitigate. The QueueController is using the IbkrWsKey enum as keys. While the SubscriptionController is using the key.channel string as keys.
Trying to be smart and getting the conid from the data dictionary:
class MyIbkrSubscriptionProcessor(SubscriptionProcessor):
def make_payload(self, channel: str, data: dict) -> str:
conid = data['conid']
json_data = data.copy()
del json_data['conid']
payload = f"{channel}+{conid}+{json.dumps(json_data)}"
return payload
def make_subscribe_payload(self, channel: str, data: dict = None) -> str:
return f"s{self.make_payload(channel, data)}"
def make_unsubscribe_payload(self, channel: str, data: dict = None) -> str:
return f"u{self.make_payload(channel, data)}"
subscription_data = {
'conid': gc_futures[0]['conid'],
'fields': [
FIELD_BID_PRICE,
FIELD_ASK_PRICE,
FIELD_BID_SIZE,
FIELD_ASK_SIZE,
FIELD_LAST_PRICE,
FIELD_VOLUME,
]
}
processor = MyIbkrSubscriptionProcessor()
key = IbkrWsKey.MARKET_DATA
ws_client.subscribe(channel=key.channel, data=subscription_data, subscription_processor=processor)
accessor = ws_client.new_queue_accessor(key)
results in unexpected behavior that generates additional logging:
IbkrWsClient: Handled a channel "md+712565966" message that is missing a subscription. Message: {'server_id': 'q1', 'conidEx': '712565966', 'conid': 712565966, '_updated': 1721113980282, '6119': 'q1', '85': '5', '88': '11', '86': '2452.50', '84': '2452.20', 'topic': 'smd+712565966'}
actual value from get(...): {712565966: {'conid': 712565966, '_updated': 1721113980282, 'topic': 'smd+712565966', 'market_data_marker': 'q1', 'ask_size': '5', 'bid_size': '11', 'ask_price': '2452.50', 'bid_price': '2452.20'}}
which forces the silencing:
logger = logging.getLogger('ibind.ibkr_ws_client')
logger.level = logging.ERROR
Solution
QueueControllerandSubscriptionControllershould use the same keys.IbkrWsKeyshould provide a way to incorporate aconidfor market data subscriptions
Recommendation
Add the option to skip key remapping and just return the raw message as is. If I supply field '84' (bid price) I would like to get a dictionary with key '84' and not 'bid_price'. Having the fields as constants is more than enough to write readable code. No need for the processing overhead.
Hey @teodorkostov many thanks for describing your issue in detail 👍
Market data subscriptions require a complex key smd+conid+{...}. The IbkrWsKey enum does not account for the possibility of having a conid as a part of the channel.
Indeed it does require it. The solution I've been running with is to include the conid with the channel string, such as:
ws_client.subscribe(channel='mh+265598', ...)
# or
key = IbkrWsKey.MARKET_DATA
ws_client.subscribe(channel=f'{key.channel}+265598', ...)
The reason being - from standpoint of both our client and IBKR servers (as far as I could empirically see) if a conid is present, it is married into the channel name:
- We store the channel+conid pair internally and manage subscriptions, unsubscriptions and resubsriptions based on it
- We unsubscribe using the same channel+conid pair
- IBKR gives us a server (or an equivalent of) for each channel+conid pair
In essence, the channel+conid pair becomes a de-facto channel name that we subscribe to. It was a choice of simplicity to treat it as just the channel argument.
You can see it being used in this form in these two examples:
- https://github.com/Voyz/ibind/blob/master/examples/ws_02_intermediate.py
- https://github.com/Voyz/ibind/blob/master/examples/ws_03_market_history.py
Can you see anywhere in the code that would benefit from having these two stored separately?
The inconsistent usage of keys in the
QueueControllerandSubscriptionControllermakes this issue a bit harder to mitigate.
This is a fair observation, and I'm glad you've pointed it out. I've gone over the code once again to see why was this the decision. Indeed, the answer is related to channel+conid pairs.
Why SubscriptionController uses strings?
- Subscriptions make sense if we treat them as strings, containing either channel or channel+conid. A subscription to the same channel but to a different coind should be treated as a separate entry and handled individually. The unique key to a subscription is indeed the channel and its parameters.
- Hence, the
SubscriptionControlleruses strings - to represent channels with its parameters (if present), such astrormh+265598.
Why QueueController uses enums?
- However, when we receive the data from IBKR we don't get it nicely segmented based on conid - it is just fed to us through the WebSocket. We do a little bit of parsing to figure out which channel it came from, but other than that it is just data. If I recall correctly some channels' data doesn't even contain the conid in its messages despite being needed for subscribing.
- Hence, the part of the interface that exposes data consumption -
QueueController- usesIbkrWsKeyenum with a 1-1 mapping to channel names. It doesn't support channel parameters, such as conids (or channel+conid strings), since we cannot assume that all data sent to us will have it. If it does you'll need to do the parsing in your application, which usually is quite trivial. Instead of enums we could use strings that only contain channel names, but I find enums safer and more readable.
Would you have any suggestions on how to improve this?
Sorry, I only just noticed your recommendation is related to a slightly different issue:
Add the option to skip key remapping and just return the raw message as is. If I supply field '84' (bid price) I would like to get a dictionary with key '84' and not 'bid_price'. Having the fields as constants is more than enough to write readable code. No need for the processing overhead.
Yes, absolutely, we can do this. Got it implemented, I'll publish it in the next release.
I do agree that it is possible to make it work. However, think about the ibind API from the perspective of a new user. The person does not know how the internals work. To make it work some advanced examples have to be read. Some of the framework code has to be explored.
It would be better for the design to guide the user.
What if the methods accept a dataclass and not the enum/string?
@dataclass(frozen=True)
class Subscription:
key: IbkrWsKey
conid: str = None # or List[str]
data: ... = None
def __post_init__(self):
val(self)
subscription = Subscription(...) # should throw an error if the optional fields are not defined for the special keys
ws_client.subscribe(subscription)
accessor = ws_client.new_queue_accessor(subscription)
Then the classes could decide what key to use internally. The user of the API does not have to know.
The sooner the user gets an error that something is wrong the better.
What do you think?
Better use a library for that. Something like Pydantic. Data validation is a common problem to solve. It is a certainty that the users of an API will mess up or even exploit it. Minimizing that possibility is good choice.
Sorry, I'm currently travelling, hence sparsely I find time to look into this. Things should be easing out for me in the upcoming months.
What if the methods accept a dataclass and not the enum/string?
Yeah, that seems like a good idea worth exploring. I agree that having a unified interface would be nicer.
Would you like to give it a shot implementing the change? Otherwise, I'm happy to do it once I'm back working.
And as for this:
Better use a library for that. Something like Pydantic. Data validation is a common problem to solve. It is a certainty that the users of an API will mess up or even exploit it. Minimizing that possibility is good choice.
Sorry, I'm not sure if I understand what you're referring to here. Could you elaborate? You wanna do data validation for the users' input?
Hey @teodorkostov just FYI I'm working on this already, you can see the WIP branch here: https://github.com/Voyz/ibind/tree/subscription-overhaul along with v0.2.0rc1 supporting the new system. I'll fill in more details once I'm more or less happy with it. Would you have time to do a PR review on it once it's ready?
Hi Voyz,
I have actually just run into this same confusion where it is not clear as a new user how to subscribe to market data from looking at end user doc string, examples (hard coded string) or the reference. Only having read this issue has the current required process of a dynamic channel become clear.
It is a bit confusing to use a different 'channel' to subscribe/unsubscribe vs access the queue accessor.
I see the work on this stalled. It would be nice to have the data class solution implemented as the main mechanism for subscribing, unsubscribing, and accessing the queue accessor.
Did you run into a roadblocker or design flaw?
thanks
hey @lach1010 sorry to hear that the existing model isn't intuitive enough. Thanks for bringing the dynamic channels missing from docs to my attention, I'll add it in.
As for this change - there were no roadblocks or design flaws. While I think that the change proposed here could make a bit more sense, I personally had no issues using the channels in the existing format hence I had no incentive to continue pushing this change forward seeing that the author of this issue didn't follow up and hasn't replied to my last three messages.
If you'd find it beneficial, I just merged the master into this branch - please give a review of the https://github.com/Voyz/ibind/tree/subscription-overhaul and v0.2.0rc2 and let me know if it works for you 👍
Hey @Voyz thanks for getting the current code in a reviewable state so quickly!
As owner, the API is of course totally up to you, but my personal opinion is that optional keyword arguments, should be used only where they are truely optional (so far as this is possible). I like the code to be self documenting, and to prevent consumers of the API from providing invalid combinations of parameters. As a result I'm not a huge fan of the optional conid keyword parameter.
Moreover if you are considering make a material breaking change to the ws api like this then id suggest we move closer to the IBKR terminology not further away given that our main source of API reference is their own.
They document the websocket api as having the following nomenclature - which is consistent with what I have seen in other projects:
""" TOPIC+[TOPIC_TARGET]+{PARAMETERS}
where:
TOPIC is the identifier of the topic (the type of data) to be subscribed. The plus symbol + is used as a separator of the message elements.
TOPIC_TARGET identifies a specific resource, typically an account or instrument, as the subject of the subscription. Certain topics do not use a target.
{PARAMETERS} is a JSON-formatted string that describes filters or other modifiers to the topic subscription. If no parameters are available for the topic, or none are desired for your subscription, this is sent as an empty {} object. """
I think it would be nice to move towards topics as the main identifier rather then key. But I'm sure you have intentionally gone down the "key" route as I noticed "topic" being referenced in a comment in the diff.
The other comment here would be that because this would already be a breaking change (hence v0.2 RC), id propose changing the subscription back to what I see was initially attempted, whereby subscription objects are passed in. My suggestion here would be to have multiple classes for the subscriptions rather than one. These would replace the existing seperate public enums members. This way the public api could be self documenting - each class would only implement parameters relevant to the specific topic/channel.
Eg the simplified constructors could be like below:
class MarketDataSubscription(self, conid: str, fields: Iterable[str]) -> None: ...
vs
class ProfitAndLossSubscription(self) -> None: ...
In these examples the required parameters are clear.
The base class would hold the logic to generate the json string from the class contents:
MarketDataSubscription to_json(self) -> str: return super().to_json(topic=self.topic, target=self.conid, params={"fields":list(self.fields)})
ProfitAndLossSubscription to_json(self) -> str: return super().to_json(topic=self.topic, target=None, params=None)
where self.topic is a predefined class variable.
this concept would be extended reflect full websocket api
If that's all a bit involved, and we are concerned it would be to much effort for users to switch to then we can settle for a clarification in the docs (maybe provide concrete examples for each topic) as well as change the current "channel" nomenclature such that the arguments to get a queue accessor and subscribe, have different names, reflecting their subtlely different requirements.
Thanks for your consideration!
Thanks for your thoughts @lach1010, let me address these:
optional keyword arguments, should be used only where they are truely optional (so far as this is possible)
I think there's a middle ground where keyword arguments are sometimes optional and sometimes required, but I understand your point and I think it's a good philosophy to have in mind - thanks for sharing it.
I think it would be nice to move towards topics as the main identifier rather then key.
I agree! Nice suggestion. I think it would be useful to clarify the reasoning behind some choices here - the WebSocket client has originated from one I've had for a different broker which indeed called things 'channels' and 'keys'. But that's just an etymological anecdote, I agree we should rename these to try to stick to IBKR's nomenclature where possible.
My suggestion here would be to have multiple classes for the subscriptions rather than one. These would replace the existing seperate public enums members.
That's not a bad idea either. The only problem here is the one I describe earlier in this issue: we still need a topic-only identifier for when we consume the data. Even for channels that require conid to be specified, IBKR doesn't always send conid back in messages, meaning that we cannot always identify exactly which topic+conid subscription it came from. As a result, all messages from a particular topic are fed into a single queue, and it is user's responsibility to decide how to filter that data later.
Hence, while we could have class based subscriptions for subscribing, we still would need to have some kind of identifiers - in current version these are the IbkrWsKey enums - that uniquely identify the topic, but without suggesting that they have anything to do with a conid or a specific subscription.
We could pass the new subscription class, eg MarketDataSubscription instead of passing the IbkrWsKey when acquiring a QueueAccessor, however I'm conscious of how this could mistakenly suggest that only messages for that exact subscription will be returned, when that's not true. Let me demonstrate:
s1 = MarketDataSubscription(conid=1)
s2 = MarketDataSubscription(conid=2)
ws_client.subscribe(s1)
ws_client.subscribe(s2)
qa1 = ws_client.new_queue_accessor(s1) # passing the subscription class for conid 1, expecting data only for conid 1
qa2 = ws_client.new_queue_accessor(s2) # passing the subscription class for conid 2, expecting data only for conid 2
I think it would be reasonable for user to assume that this is the way to consume data, but it is incorrect - we don't always know which conids the data messages should be associated with. Instead the user should do:
qa = ws_client.new_queue_accessor(s1) # or s2, it's equivalent
... # add logic to sort messages in the qa between messages belonging to conid 1 and conid 2
That is the confusion I'm worried about. Using a IbkrWsKey, eg.:
qa = ws_client.new_queue_accessor(IbkrWsKey.MARKET_DATA)
doesn't suggest that the QueueAccessor will send us data only for one particular subscription.
Given that is the case, I'm trying to analyse what benefit would the new subscription classes really bring. Changing:
s1 = {'key': IbkrWsKey.MARKET_DATA, 'conid': 1}
ws_client.subscribe(**s1)
qa = ws_client.new_queue_accessor(s1['key'])
...
to
s1 = MarketDataSubscription(conid=1)
ws_client.subscribe(s1)
qa = ws_client.new_queue_accessor(s1.key)
...
These subscription classes would be a simple in-place replacement for dictionaries or parameters, in which I cannot see any inherent flaw at the moment. I'm not sure there's a clear benefit here. If you think there is one, can you help me see it? Is it the point you brought earlier, of removing ambiguity for when conid is optional and when it isn't?
Hey @Voyz
Thanks for your thoughts.
Agree with everything you have mentioned.
2 things
-
I haven't actually run the ws client yet (writing the rest of my system atm). If there's no conid then how should I be identifying what data is for what securities?
-
Yep, given the limitation in 1, the value is just the explicit self documenting of the API. There are more parameters across the WebSocket api that could be included, not just the target. This would be inline with the IBind rest API where parameters are provided as arguments to functions. If there's a reason they should stay as being passed as a generic dictionary then let's stick with existing structure, and just consider docs and channel vs key vs topic.
Thanks mate!
Ps enjoy those coffees mate, well earned!
I haven't actually run the ws client yet (writing the rest of my system atm). If there's no conid then how should I be identifying what data is for what securities?
Great question @lach1010 - currently for the topics that don't send it you can't. I think IBKR should be the one to address this. I can't recall the exact topics - I think it was one of the market data ones - but once you run into this, feel free to send them a support ticket in this regard.
If there's a reason they should stay as being passed as a generic dictionary then let's stick with existing structure, and just consider docs and channel vs key vs topic.
While I appreciate the self-documenting and non-ambiguous optional arguments you bring up, I can think of two arguments against the class-based interface:
-
My main argument is that I feel that every change needs a good reason. I'm not sure if the benefit behind introducing self-documenting parameters change would be sufficient enough - even more so as it is a breaking change. To me that improvement - while I honestly think it could make things better - is more of a second-class, "oh it would be nice to have" type, rather than concrete and unambiguously necessary.
Either interface - existing or the proposed one - would need to be learned, and while I agree that self-documenting one would be easier to some extent, I don't think the improvement is sufficient enough to justify introducing it, rewriting tests, updating the WiKis, asking the users to relearn it, and handling the version discrepancy in the future.
-
We don't lock the API in case IBKR would change things up. Over the past month I've had two users come back with 'this endpoint doesn't exist!' comment, which to me demonstrates that IBKR do change their API unannounced and there's a need for some way to override the default behaviour of IBind's API.
Passing a dictionary solves handles that elegantly, as we don't suggest that our API/documentation is the source of truth. No, we encourage the user to go to the IBKR docs and learn the correct parameters there, whatever they may currently be.
I have to admit that this second point is slightly less convincing even to me, as this is exactly something we do in the REST API, where parameters for methods are indeed concretely specified. So yeah, mainly the first point I brought up is what's convincing me here. If there are other arguments for class-based interface, I may see it differently though and I'm open to the discussion.
(Ah the coffees were from you? Nice, thanks! Actually having one as I write this cheers! 🥳)
@Voyz
Yeah I agree and figured a breaking change was going to be a bold call.
Only mentioned the class based change given the existing proposed API change in the branch and RC moving to v0.2
If you could just make note of the way each socket topic should be called in the Wiki that would be great!
Thanks mate.
Happy to close.
Only mentioned the class based change given the existing proposed API change in the branch and RC moving to v0.2
@lach1010 And I appreciate you bringing it up! It was productive to spend some time thinking about this interface and how to best bite it. I think your solution could likewise be totally valid, and I'll keep it in mind should it turn out to be useful in the future.
If you could just make note of the way each socket topic should be called in the Wiki that would be great!
You mean to outline its parameters?
Happy to close.
Just to clarify - are you saying to continue with the 0.2.0 change, introducing conid as parameters as opposed to dynamic topic string, or are you saying to forfeit this change completely?
Hey @Voyz
I wouldn't add conid. Rather if you did want to make a small change here it could be to have a target parameter instead, as would be used in say the account summary topic for the account id.
I'd be against either as it's a small change that doesn't in my opinion provide enough value (as not all topics use targets) despite being a breaking change.
I think just a clarification in the docs that
1 - some topics also have targets, which should be appended to the topic in the standard format. And maybe list those affected with the associated target.
2 - queue accessors are accessed by topic only given the issue with responses missing target information.
Would be enough for now. If the web api improves with more topics in the future then I would suggest reconsidering the class design for clarity and constancies sake with the REST client.
@lach1010 understood, thanks 👍 Can you have a quick look at this section and see if it serves the need you're describing? https://github.com/Voyz/ibind/wiki/Ibkr-Ws-Client#conid-in-channels Please ignore, for the time being, the wording 'channel' instead of 'topic' - I'm gonna need to sit down to that rewrite when I find a moment.
Ah looks like Pydantic was brought up here, would like to link to https://github.com/Voyz/ibind/issues/105
Would love your thoughts @teodorkostov in that feature request.