[WIP] IP Add-on Finder: make scans asynchronously
Currently, the IP Add-on Finder scans sequentially: binding by binding, interface by interface. This PR implements scanning in an asynchronous way. Refs: https://github.com/openhab/openhab-core/issues/3936
Initial PR implementation is by @andrewfg, as mentioned in https://github.com/openhab/openhab-core/pull/3943#issuecomment-1943951291, thanks!
This is WIP, we can discuss further improvements in this PR.
@andrewfg DCO check does not allow putting you as the main author when I submit the change, I changed Author and Co-Author to make it pass.
There are some technical questions not yet clarified, e.g. how to make sure that discovery of different add-ons at the same time does not cause problems. Just imagine two add-ons using the same server port, etc... @andrewfg WDYT - is this a realistic scenario, especially considering that not too many bindings will use this finder?
And the implementation before had a 20sec timeout to avoid hick-ups during OH start (when the xml is parsed, setAddonCandidates is called several times). So maybe we should try to delay this as well....
I will have a look into this the next days.
imagine two add-ons using the same server port, etc...
It is not that likely, but you are right that we need to plan for the worst. The current scan task comprises a sender and a listener. I'm not 100% sure if the socket code requires exclusive send/receive access on a port (in which case a second task on the same port would fail due to an access violation), or if it re-uses sockets (SO_REUSE..) (in which case there could be crosstalk between data from two tasks).
the implementation before had a 20sec timeout to avoid hick-ups during OH start
The code cancels prior tasks before initiating new ones. I think the only issue would be if a cancelled task had not yet fully completed being cancelled by the time a new task for the same port would be started. In that case the issue becomes exactly the same as the issue above (two tasks trying to access the same port). However this issue could be resolved if stopScanJobs() would wait for cancelled futures to actually complete. Note the cancel call is with allowInterrupt so it should interrupt the normal socket timeout on not yet completed tasks, so this should complete fast enough.
imagine two add-ons using the same server port
@holgerfriedrich the following code shows how I would synchronise multiple tasks to wait for (say) the same listening port (not tested)..
private static final class PortLock {
private final Set<Integer> lockedPorts = new HashSet<>();
public void lock(int port) {
synchronized (lockedPorts) {
while (!lockedPorts.add(port)) {
try {
lockedPorts.wait(100);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
}
}
public void unlock(int port) {
synchronized (lockedPorts) {
lockedPorts.remove(port);
lockedPorts.notifyAll();
}
}
}
private final PortLock portLock = new PortLock();
private void doIpMulticastScan(AddonInfo candidate, String type, String request, String requestPlain,
String response, int timeoutMs, InetAddress destIp, int destPort, int listenPort, String localIp) {
try {
portLock.lock(listenPort);
..
} finally {
portLock.unlock(listenPort);
}
}
the only issue would be if a cancelled task had not yet fully completed being cancelled by the time a new task for the same port would be started. .. this issue could be resolved if stopScanJobs() would wait for cancelled futures to actually complete
@holgerfriedrich the following code shows how to make stopScanJobs() wait until all tasks have been cancelled (not tested)..
private synchronized void stopScanJobs() {
scanJobs.stream().filter(j -> !j.isDone()).forEach(j -> j.cancel(true));
try {
CompletableFuture.allOf(scanJobs.toArray(new CompletableFuture<?>[scanJobs.size()])).get();
} catch (InterruptedException | ExecutionException e) {
}
scanJobs.clear();
}
I would still urge to implement limiting the number of interfaces being used, similar to what is now done in https://github.com/openhab/openhab-core/pull/4036. I am working on extending the setup wizard to include a step to set the main network, and then allow some time to get the suggestion refreshed.