Adding ACL-1.3: Large Scale ACL with TCAM profile
Readme Location: https://github.com/openconfig/featureprofiles/blob/main/feature/acl/otg_tests/acl_large_scale/README.md
I have raised the issues below for adding the deviation to the script: https://partnerissuetracker.corp.google.com/issues/422165468 https://partnerissuetracker.corp.google.com/issues/423896542
For other issues: https://partnerissuetracker.corp.google.com/issues/416164360
Logs attached: https://partnerissuetracker.corp.google.com/issues/415458482
Pull Request Functional Test Report for #4306 / f9428bc9c710c5b55aaf30d8f77be39c41c84953
Virtual Devices
| Device | Test | Test Documentation | Job | Raw Log |
|---|---|---|---|---|
| Arista cEOS | ACL-1.3: Large Scale ACL with TCAM profile | | | |
| Cisco 8000E | ACL-1.3: Large Scale ACL with TCAM profile | | | |
| Cisco XRd | ACL-1.3: Large Scale ACL with TCAM profile | | | |
| Juniper ncPTX | ACL-1.3: Large Scale ACL with TCAM profile | | | |
| Nokia SR Linux | ACL-1.3: Large Scale ACL with TCAM profile | | | |
| Openconfig Lemming | ACL-1.3: Large Scale ACL with TCAM profile | | | |
Hardware Devices
| Device | Test | Test Documentation | Raw Log |
|---|---|---|---|
| Arista 7808 | ACL-1.3: Large Scale ACL with TCAM profile | | |
| Cisco 8808 | ACL-1.3: Large Scale ACL with TCAM profile | | |
| Juniper PTX10008 | ACL-1.3: Large Scale ACL with TCAM profile | | |
| Nokia 7250 IXR-10e | ACL-1.3: Large Scale ACL with TCAM profile | | |
Pull Request Test Coverage Report for Build 20102132183
Details
- 0 of 98 (0.0%) changed or added relevant lines in 4 files are covered.
- 4 unchanged lines in 1 file lost coverage.
- Overall coverage decreased (-0.01%) to 10.034%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| internal/deviations/deviations.go | 0 | 9 | 0.0% |
| internal/cfgplugins/bgp.go | 0 | 16 | 0.0% |
| proto/metadata_go_proto/metadata.pb.go | 0 | 33 | 0.0% |
| internal/cfgplugins/policyforwarding.go | 0 | 40 | 0.0% |
| Total: | 0 | 98 | 0.0% |
| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| proto/metadata_go_proto/metadata.pb.go | 4 | 0.0% |
| Total: | 4 | |
| Totals | |
|---|---|
| Change from base Build 20090883156: | -0.01% |
| Covered Lines: | 2227 |
| Relevant Lines: | 22195 |
💛 - Coveralls
@ASHNA-AGGARWAL-KEYSIGHT - Can you please fix the CI/CD failures "Go/static analysis failures" so it can be validated?
Fixed the issues
I still see the issue: the CI/CD checks are not all passing yet. Go / test (pull_request) is failing.
@ASHNA-AGGARWAL-KEYSIGHT - Validation failed in the Google environment, and I have attached the test logs to the bug https://partnerissuetracker.corp.google.com/issues/415458482. There are a couple of issues.
--- FAIL: TestAclLargeScale (6135.16s)
--- PASS: TestAclLargeScale/ACL-1.1.1_-_ACL_IPv4_Address_scale (2167.23s)
--- PASS: TestAclLargeScale/ACL-1.1.2_-_ACL_IPv6_Address_scale (2193.16s)
--- FAIL: TestAclLargeScale/ACL-1.2.1_-_ACL_IPv4_Address_scale_using_prefix-list (878.78s)
--- FAIL: TestAclLargeScale/ACL-1.2.2_-_ACL_IPv6_Address_scale_using_prefix-list (863.79s)
- The prefix-list tests are failing for both the IPv4 and IPv6 address families.
- The IPv4 access list contains only "permit ip any any", which should not be the case; it needs to have some IP addresses configured. I have attached to the bug the access list that is created on the device while the test runs.
- The test is taking too long to complete, and this needs to be debugged. How long did it take you to run end to end?
@ram-mac could you please share which Arista image you used? In my setup I am not seeing the "CLI error msg". For points 2 and 3, I will get back to you; I need to check how we can reduce the runtime.
@ASHNA-AGGARWAL-KEYSIGHT - I have attached the logs to bug 415458482; you can check the version and other details there.
Hi Ram, could you please let me know which TCAM profile you are using? With traffic-policy enabled in my TCAM profile, I can apply the policy to the interface and do not hit the "Failed to apply policy on Ethernet2/1" error:

    interface Ethernet1/1
       description DUT to ATE Port1
       traffic-policy input ACL_IPV4_Match_using_prefix_list_prfxv4-1

Also, for "ACL_IPV4_Match_high_scale_statements", could you please confirm whether the IP addresses should be configured from one of the prefix blocks?
@ASHNA-AGGARWAL-KEYSIGHT: The TCAM profile I used is below. Can you share which TCAM profile you have in use?

TCAM PROFILE:

    hardware counter feature ecn out
    hardware counter feature ip out layer3
    hardware counter feature ip in layer3
    !
    hardware access-list mechanism tcam
    !
Attached the tcam profile fp_config_tcam.txt
OK, I ran the test and it fails again with the same issue:
--- FAIL: TestAclLargeScale (5575.62s)
--- PASS: TestAclLargeScale/ACL-1.1.1_-_ACL_IPv4_Address_scale (1888.81s)
--- PASS: TestAclLargeScale/ACL-1.1.2_-_ACL_IPv6_Address_scale (1910.71s)
--- FAIL: TestAclLargeScale/ACL-1.2.1_-_ACL_IPv4_Address_scale_using_prefix-list (879.02s)
--- FAIL: TestAclLargeScale/ACL-1.2.2_-_ACL_IPv6_Address_scale_using_prefix-list (865.28s)
The TCAM profile attached here has many features enabled besides the ACL ones. We need to figure out which profile is the right one and then add that configuration to the test itself via CLI. I think that after you add the TCAM profile you also need to restart the device for it to take effect. Can you add these changes to this PR and let me know, so I can validate it again?
Have added the changes in the PR
@ASHNA-AGGARWAL-KEYSIGHT - Is the test passing with the new changes? Please share the pass log with the changes. Also, can you please resolve the conflicts on this PR?
Logs location(latestLogsACL1.3): https://partnerissuetracker.corp.google.com/issues/415458482
@ASHNA-AGGARWAL-KEYSIGHT - With the latest changes, it looks like we are losing connectivity to the device, including gNMI connectivity. We will have to find out why connectivity is being lost. Logs are attached here: https://partnerissuetracker.corp.google.com/issues/415458482#91
@ram-mac Are we enabling the timeout while executing the test?
@ASHNA-AGGARWAL-KEYSIGHT - Yes, I have given the test a timeout of 2 hours. But I suspect the gNMI connectivity is being affected by the new changes; earlier, at least, the connectivity was stable throughout the test.
Hi Ram,
I have run the scripts multiple times in our setup without encountering any issues. From the attached logs, I noticed that gNMI disconnected after the CLI configuration. However, in my tests, I utilised the existing CLI helper function for configuration. I suspect that the issue may occur when you run the test from your internal framework, which could be causing the connection loss to the DUT. We execute tests directly from the feature profiles.
I suggest we schedule a call so that I can demonstrate how I executed the test, and we can discuss this further.
@ASHNA-AGGARWAL-KEYSIGHT - I know that your environment is different, but there is definitely some issue with the ACLs being configured that causes the gNMI connectivity to be lost. I had also pointed out another issue: there is a delay in applying the configuration. Let's have a call sometime tomorrow.
@ASHNA-AGGARWAL-KEYSIGHT - I have sent an invite to debug this issue on our setup. Let's identify the issue today.
@ram-mac as discussed, added the changes and rebased the PR
Thanks @ASHNA-AGGARWAL-KEYSIGHT - I will validate it today and see if there is any progress.
@ASHNA-AGGARWAL-KEYSIGHT - The test fails again. I think the gNMI client connectivity is being lost when we apply the HW TCAM profile. We can also ask the vendor to help with the debugging.
@ASHNA-AGGARWAL-KEYSIGHT - This TCAM configuration cannot be used for testing in the Google environment. With this configuration the port-channel interface stays down and connectivity is lost. Please remove it from the test. If there are failures, we need to check with Arista on them.
@ASHNA-AGGARWAL-KEYSIGHT - Can you please make the necessary changes, as discussed, to the CLI-based traffic-policy configuration generation and then run the test?
@ram-mac Added the changes. Logs attached: NewChangesACL1.3 (https://partnerissuetracker.corp.google.com/issues/415458482). Please let me know if any other changes are required.