outline-sdk
outline-sdk copied to clipboard
WIP: Draft PR for improving connectivity testing to capture resolved IPs and all attempts
Note: The code is still messy not ready for merge. I made this PR to discuss upcoming high-level changes to the testing approach and format for connectivity testing pursuant to https://github.com/Jigsaw-Code/outline-sdk/discussions/209#discussioncomment-9216652
I moved the loop
that iterates over ip addresses outside of the intercept dialer. This way in each iteration I am generating a new intercept dialer that takes one of the resolved addresses and use that to continue the end-to-end test.
In this approach, I am allowing each attempt to run end-to-end. The test may fail earlier for example at connect attempt. In that case, it won't continue to the next stage. However each attempt could continue to the end and test for connect/read/write operations for each resolved IP.
We also need to decide if test for all resolved IPs or abort at success.
Format
I am proposing below format that we can use for both udp
and tcp
.
List of suggested format changes:
- Each attempt should include a start-time and duration
- The error captured for each attempt should be a
ConnectivityError
type that includesop
,PosixError
,Error
fields. - I am uncertain about keeping
selected_address
in the main report. It is hard to tell which address this should be populated with. - I am uncertain what
error
field in the main report should be populated with. What if the test makes 3 attempts and all of them fail with different types of errors. Does 'error' then reflect? I guess when one of the attempts succeederror
field should benil
.
TODO: The code is slow and I am planning to optimize it such that tests run in an independent treads in parallel.
TODO: Update the format in the code to match the proposal
{
"resolver": "8.8.8.8:53",
"proto": "udp",
"transport": "ss://[email protected]:65496?prefix=",
"endpoint": {
"host": "pod2.exampledomain.com",
"port": "65496"
},
"attempts": [
{
"address": {
"host": "2606:4700:3032::6815:xxxx",
"port": "65496"
},
"time": "2024-05-02T05:02:13Z",
"duration_ms": 1000,
"error": {
"op": "connect",
"msg": "i/o timeout"
}
},
{
"address": {
"host": "2606:4700:3032::ac43:xxxx",
"port": "65496"
},
"time": "2024-05-02T05:02:13Z",
"duration_ms": 1000,
"error": {
"op": "read",
"msg": "reset"
}
},
{
"address": {
"host": "172.67.138.xx",
"port": "65496"
},
"time": "2024-05-02T05:02:13Z",
"duration_ms": 1000,
"error": {
"op": "connect",
"msg": "i/o timeout"
}
},
{
"address": {
"host": "104.21.38.xx",
"port": "65496"
},
"time": "2024-05-02T05:02:13Z",
"duration_ms": 1000,
"error": {
"op": "connect",
"msg": "i/o timeout"
}
}
],
// maybe we should remove this; what if both IPv4 and IPv6 addresses succeed,
// then which one is the selected address?
"selected_address": {
"host": "104.21.38.199",
"port": "65496"
},
// This would be the start-time of the main test and the duration for the whole test
"time": "2024-05-02T05:02:13Z",
"duration_ms": 20006,
// maybe we should remove this one too or make it a summary of all errors in the all attempts.
// I am not sure what this field would reflect is we have two failed attempts. I guess this should be `nil`
// if at least of the tests pass?
"error": {
"op": "connect",
"msg": "all connect attempts failed. no more addresses to try"
}
}