SAI
SAI copied to clipboard
Update saistatus.h
These helper Macro's logic are not correct when _WIN32 is not #defined as is the case for all of our Sonic/BRCM based platforms. Therefore all SAI_STATUS_XXX error codes are negative due to "#define SAI_STATUS_CODE(S) (-(S))" operation. This result in incorrect helper error range checking. For example, SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 is defined as value of -0x00030000L and SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX as -0x0003FFFFL. Assume the SAI returns error code of SAI_STATUS_ATTR_NOT_IMPLEMENTED_1 which is -0x00030001L (where SAI code should return SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + SAI_STATUS_CODE(1)). The old SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(-0x00030001L) macro would result in -0x00040000L == SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 which is not true where SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 is defined as -0x00030000L. The new helper macro will correct this logic. The new logic would work regardless #ifdef _WIN32 is defined or not.
can you add some unittesting to back this up?
The description has given an example which was actually the case we had issue in our local SAI code. By fixing the code logic to the new one presented in this code push, we had then the expected test result. Our changes before was confined to our local SAI implementation and did not made it to this SAI API .h file. Since Google complained about incorrect the SAI status error code issue where they also found out that the SAI should return (SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + SAI_STATUS_CODE(atrrib_idx) instead of (SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + attrib_idx), so we have to fix this in our SAI code. We have realized that the complete fix also need to include the changes of these helper macros if any one is using them. That led to this code pull request. Not sure what kind of UT you would like to be added here. I can write a stand alone test code to invoke both old and new macros and print out compare results to see which one meet the expectation if that what you want.
i think some test cases checking that "SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + SAI_STATUS_CODE(atrrib_idx)" have correct value as expected in saisanitycheck.c
I am not sure what is the required change
Right now our SAI code is written to return for example return SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + attr_index let's consider attr_index = 1 The result is 0xfffffffffffd0001 When evaluated in SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(x) (((x) & (~0xFFFF)) == SAI_STATUS_ATTR_NOT_IMPLEMENTED_0) this seems to work fine
Do we want to change the usage from return SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + attr_index to return SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + SAI_STATUS_CODE(attr_index) ? This is not backward compatible
Currently, both BRCM Sonic/SAI and some other customer experience issue with the old "return (SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + attr_index)" logic. For the same example where attr_index ==1, SAI returns (SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + 1), which we would think it represents SAI_STATUS_ATTR_NOT_IMPLEMENTED_1. However, when the return code gets to Sonic caller, the value become -0x0002FFFFL which is SAI_STATUS_INVALID_ATTR_VALUE_MAX, which no longer between SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 ~ SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX. If Sonic caller explicitly check if (sai_status == SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX), it will take the intended SAI_STATUS_ATTR_NOT_IMPLEMENTED_1 as SAI_STATUS_INVALID_ATTR_VALUE_MAX and handle it differently (unexpectedly), even the old help macros works fine. That was why we fixed our SAI return error with SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + SAI_STATUS_CODE(attr_index) This fix in SAI would break the old "helper" macros thus the pull request to change the helper macros.
So our main issue that the checks aren't consistent Checking the value is between SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 SAI_STATUS_CODE(0x00030000L) and SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX SAI_STATUS_CODE(0x0003FFFFL) will fail, as you explained Checking using the macro SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(x) (((x) & (~0xFFFF)) == SAI_STATUS_ATTR_NOT_IMPLEMENTED_0) will succeed It will be nice that both checks will work However, this is breaking backward compatibility for all vendors who did, like us, return SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + attr_index I also think using return SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + attr_index is much more native than using return SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + SAI_STATUS_CODE(attr_index)
So, option 1 - Not check by range, check only with SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(x) option 2 - modify to SAI_STATUS_CODE(attr_index) Both aren't perfect
Maybe there is better solution ?
You have summarized it perfectly. I would think some SAI provider assumed Positive SAI return code (as we did early) and some may realized later they are actually all Negative values (as we and some other customer found out later) and there would be no control who use old (SAI_STATS_XX + attrib_index) and who use new (SAI_STATS_XX + SAI_STATUS_CODE(attrib_index)) schemes and how the SAI API caller examines the return code (using helper macro or direct value or range checking. Will investigate more.
I agree that consistency is the problem here. With the current implementation, my understanding is that on non-_WIN32 systems SAI_STATUS_IS_INVALID_ATTRIBUTE(SAI_STATUS_NOT_EXECUTED)
would evaluate to true, which seems like it's not the desired behavior here. I'd also generally expect the SAI_STATUS_INVALID_ATTRIBUTE_*
values to be in the range of SAI_STATUS_INVALID_ATTRIBUTE_0
to SAI_STATUS_INVALID_ATTRIBUTE_MAX
.
If the proposed change isn't palatable idea might be to something like:
#define SAI_STATUS_INVALID_ATTR_VALUE_0 SAI_STATUS_CODE(0x00020000L)
#define SAI_STATUS_INVALID_ATTR_VALUE_MAX (SAI_STATUS_INVALID_ATTR_VALUE_0 + 0x0000ffffL)
This would probably break some folks as well (anyone using SAI_STATUS_INVALID_ATTR_VALUE_0 + SAI_STATUS_CODE(i)), and also would require shifting SAI_STATUS_INVALID_ATTRIBUTE_0 to avoid overlap with the lower-numbered attributes.
Probably better would be to somehow discourage direct use of SAI_STATUS_INVALID_ATTR_VALUE_0 and an index, and instead do something like:
#define SAI_STATUS_INVALID_ATTR_VALUE(i) (SAI_STATUS_INVALID_ATTR_VALUE_0 + /*whatever the right behavior is*/)
That all said, I'm fairly sure that there's not a "fix" here that both fixes the headers to match the intent and preserves backward compatibility. My preference would be to make the headers logically consistent, but I understand that this may be disruptive to other folks.
Is this a fix of an issue? Could an issue be filed?
visual aid:
unix
SAI_STATUS_FAILURE -1 0xffffffffffffffff
SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 -196608 0xfffffffffffd0000
SAI_STATUS_ATTR_NOT_IMPLEMENTED_0+1 -196607 0xfffffffffffd0001
SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX -262143 0xfffffffffffc0001
SAI_STATUS_CODE(0x00030000L+1) -196609 0xfffffffffffcffff
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(1) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(-1) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_FAILURE) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_CODE(0x00030000L+1)) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0) 1
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0+1) 1
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0-1) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0+SAI_STATUS_FAILURE) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX-1) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX-2) 0
win32
SAI_STATUS_FAILURE 1 0x1
SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 196608 0x30000
SAI_STATUS_ATTR_NOT_IMPLEMENTED_0+1 196609 0x30001
SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX 262143 0x3ffff
SAI_STATUS_CODE(0x00030000L+1) 196609 0x30001
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(1) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(-1) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_FAILURE) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_CODE(0x00030000L+1)) 1
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0) 1
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0+1) 1
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0-1) 0
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0+SAI_STATUS_FAILURE) 1
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX) 1
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX-1) 1
SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX-2) 1
seems like this solution is not working properly all the way in any win/unix, take a look at MAX-2 value seems to me that intermediate values should be defined as SAI_STATUS_CODE(x+n) since that how defined are ranges _0 and _MAX, and this is how we define them in saimetadata.h:
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_1 SAI_STATUS_CODE((0x00030000L + 1))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_2 SAI_STATUS_CODE((0x00030000L + 2))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_3 SAI_STATUS_CODE((0x00030000L + 3))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_4 SAI_STATUS_CODE((0x00030000L + 4))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_5 SAI_STATUS_CODE((0x00030000L + 5))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_6 SAI_STATUS_CODE((0x00030000L + 6))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_7 SAI_STATUS_CODE((0x00030000L + 7))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_8 SAI_STATUS_CODE((0x00030000L + 8))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_9 SAI_STATUS_CODE((0x00030000L + 9))
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_10 SAI_STATUS_CODE((0x00030000L + 10))
i actually implemented them like this using the schema for _0 and _MAX, but didn't actually checked IS macro :(
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 SAI_STATUS_CODE(0x00030000L)
#define SAI_STATUS_ATTR_NOT_IMPLEMENTED_MAX SAI_STATUS_CODE(0x0003FFFFL)
seems logical to me that next value should be defined as #define SAI_STATUS_ATTR_NOT_IMPLEMENTED_1 SAI_STATUS_CODE((0x00030000L + 1)), by proposed convention, but this breaks abbility to do just "+ attr_index"
but on unix, then SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED is not actually working for those values :(, and SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(SAI_STATUS_ATTR_NOT_IMPLEMENTED_0+1) is working in both win/unix by coincidence, well - by math exactly, but have issues with last MAX values
there is no good fix here, i think proper way would be to fix macros in saistatus.h headers and also fix vendor/sonic implementation where attr indexes are actually using + instead SAI_STATUS_CODE, yea seems ugly, and probably will brake couple things :(
in sairedis those IS macros are not used at all, only place in entire sonic this is used is orchagent, and only 2 places, which are actually not critical, just error logging, error is just error which we can't do much about it, only exception to that is BUFFER_OVERFLOW which actually provides data back
in case of those error ranges, there are 5 of them and they all describe attribute index, which in most cases will be used in sai apis:
- create / bulk create
- get / bulk get
- flush_fdb_entries
- sai_object_type_get_availability
- sai_bulk_get_attribute
there are no other SAI apis that take attr_list, and currently sonic treats error, just as and error, for example if you pass N attribute with some of them can be not implemented, error is just treated now as error, for create, there is no back logic, for example if 4'th attribute is not implemented but is not mandatory, then theoretically we could get rid of that attribute from the list and try create again without it, but currently there is no such logic, and if the value is not supported - currently we don't have logic to compensate that, for example 5th attr enum value is X but is not supported, so maybe we could use Y instead
so right now even thou error codes attr ranges are messed up, it's slides through without making any mess right now in sonic 2 places of usage: https://github.com/Azure/sonic-swss/blob/master/orchagent/portsorch.cpp#L1695 https://github.com/Azure/sonic-swss/blob/master/orchagent/crmorch.cpp#L468
Are there thoughts on how we can close this out?
@rlhui can you pls chk the suggestions from @kcudnik and let us know if the changes can be taken up in the sai/sonic update.
This got a little bit stale, and i have question here: why not make sai_status_t as an regular enum and make it's values positive, where success is 0, and rest is >0, this would be breaking change but it would solve all the issues with windows and unix, and it would also allow to do "SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 + 1" which would be correct, and all the macros SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED would also work. And currently i dont see any harm flipping those values to positive, success in both cases is zero, other value than zero is error, simple ass that. What arguments would be against that?
What was original idea to make sai_status_t negative ?
ping
I don't know the history, though by some conventions Unix-like systems use negative values for error. That said, I don't see a reason these couldn't just all be made positive as suggested above.
only downside of having is as enum would caue this warning:
invalid conversion from âintâ to sai_status_t {aka _sai_status_t} [-fpermissive]
if we do SAI_STATUS_ATTR_NOT_IMPLEMENTED_0+1, but we can keep sai_status_t as int as is, to allow that, but flip values to positive anyway
I think what you're suggesting is changing:
#define SAI_STATUS_CODE(_S_) (-(_S_))
to something like
#define SAI_STATUS_CODE(_S_) (_S_)
and perhaps changing sai_status_t to be UINT32 to prevent issues should the uppermost bit ever be used? Yes, that seems like it should work and would be a the minimum-delta fix.
yes, exactly what im suggesting