usg-kpn-ftth

Provision loop after initial force provision and reboot

Open jschilperoord opened this issue 4 years ago • 11 comments

This place is probably my last resort to ask. So bear with me if you can :-)

I am running controller version atag_5.13.29_13635 and software 4.4.51.5287926 on my USG. My controller is on the internet, so I have to connect my unprovisioned USG to the experiabox first, provision it without the config.gateway.json, copy the scripts and the JSON file, and then force provision. Before I reboot, I connect the USG directly to the fiber modem.

After the reboot everything comes online and works like a charm: internet, IPTV and IPv6. But after I apply a config change that requires a provision on the USG, it gets stuck in a provision loop and stops talking to the controller. I see errors in the log like:

user.err syslog: ace_reporter.reporter_fail(): Unknown[11]

Anyone seen this behavior?

jschilperoord avatar Jul 23 '20 13:07 jschilperoord

Yes, and I have found no solution. A remote controller and a config.gateway.json that specifies two WAN interfaces do not seem to work together.

coolhva avatar Jul 24 '20 20:07 coolhva

I have the same issue here. Whenever I want to exit this provisioning state, I have to manually re-enter the PPPoE settings on the USG directly, which makes the USG realize that it can make a connection. Doing this temporarily drops all traffic, however, as you are re-initiating your PPPoE session and then receiving the provisioning settings again. After this the USG shows up as online and connected, but this has to be repeated for every change.

This is of course only a band-aid and not a fix; I am still searching for a solution as well.

Abchrisabc avatar Jul 30 '20 22:07 Abchrisabc

Well, it is better than nothing. Thanks for updating and sharing :)

coolhva avatar Sep 08 '20 07:09 coolhva

Thanks for sharing @Abchrisabc. To solve this issue, I chose to run a controller on a PCEngines apu2 in my local network. Thanks for this great config @coolhva 👍 Do we want to keep this open for future reference, or should I close?

jschilperoord avatar Oct 21 '20 21:10 jschilperoord

> Do we want to keep this open for future reference, or should I close?

@jschilperoord keep it open. I just stumbled across this and have been banging my head against the wall wondering why my USG was going into the provisioning state every now and then. Now I know why...

vertizio avatar Nov 13 '20 08:11 vertizio

Ran into the same issue today when replacing an old USG3P with a new one. It looks like the USG3P is using 127.0.0.1 for DNS after (or already during) provisioning, which prevents it from connecting to the remote cloud key. I had to stop investigating due to the COVID curfew and temporarily put in an experiabox. I will probably place a local cloud key to resolve this for the client. I am not sure whether I will be able to do further testing, but maybe my noticing the DNS change will trigger some clever thinking.

rlaarman avatar Feb 07 '21 20:02 rlaarman

Is there by any chance a fix for this already?

My cloud key is in my remote office. The USG on my home connection, which is provisioned by the cloud key in the office, needs to be able to pick up a config.gateway.json (for KPN IPTV).

StreborStrebor avatar Nov 25 '21 10:11 StreborStrebor

Also very curious whether this has been resolved, since I am running my UniFi Controller in Azure.

tariklehaine avatar Nov 25 '21 12:11 tariklehaine

I've tried a couple of things but haven't been able to fix the core issue yet. While debugging I did create a local workaround that will probably suit nobody but myself, but I'd like to mention it anyway.

I created a site on a local webserver with a reverse proxy towards my internet controller (http://lan-ip:8080 forwards everything to http://controller-ip:8080). I then created a cron job that invokes an "mca-cli-op info" command via SSH on the USG every 5 minutes. As soon as the status Unknown[11] is detected, it sends an "mca-cli-op set-inform http://lan-ip:8080/inform" to the USG. The USG then continues provisioning and reports a healthy "Connected" in the controller.
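The cron check described above could look roughly like the sketch below. It is only an illustration, not the script actually used: the SSH login `admin@192.168.1.1` is hypothetical, key-based SSH access to the USG is assumed, and the check simply looks for the `Unknown[11]` string anywhere in the output of `mca-cli-op info` without assuming anything else about that output's format.

```python
import subprocess

USG_HOST = "admin@192.168.1.1"            # hypothetical SSH login for the USG
INFORM_URL = "http://lan-ip:8080/inform"  # placeholder URL from the post above
FAIL_MARKER = "Unknown[11]"               # status string reported in this thread


def needs_reinform(info_output: str) -> bool:
    """Return True when the `mca-cli-op info` output contains the failure marker."""
    return FAIL_MARKER in info_output


def run_on_usg(command: str) -> str:
    """Run a single command on the USG over SSH and return its stdout."""
    result = subprocess.run(
        ["ssh", USG_HOST, command],
        capture_output=True, text=True, timeout=30, check=True,
    )
    return result.stdout


def check_once() -> bool:
    """Check the inform status once; re-send set-inform if it is stuck."""
    if needs_reinform(run_on_usg("mca-cli-op info")):
        run_on_usg(f"mca-cli-op set-inform {INFORM_URL}")
        return True
    return False
```

Calling `check_once()` from a script that cron runs every 5 minutes would mirror the workaround: nothing happens while the USG reports a normal status, and a set-inform towards the LAN proxy is sent once the failure marker shows up.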

sAnexeh avatar Dec 05 '21 08:12 sAnexeh

So, after some time had passed, I tried to resolve this issue with the help of Ubiquiti Support. Unfortunately they are unwilling to help because I am using a custom config.gateway.json, which according to them is unsupported.

It's not a DNS issue, not a routing issue, and it doesn't seem to be an MSS clamping issue. I can see the USG doing a POST to my remote controller. I can see the headers and the x-binary data. The remote controller responds with a 200 OK, but the set-inform fails with Unknown[11].

Because running a cron job every 5 minutes to check "mca-cli-op info" seems a bit over the top, I decided to take a different approach. Since I'm not using the USG as DNS on my LAN, I decided to change the A record of my remote controller in /etc/hosts on the USG so that it points to my LAN proxy. I did this through the config.gateway.json, adding the following part (where remote-controller.tld is mapped to the LAN IP of the webserver that proxies the traffic to the remote controller):

"system": {
    "static-host-mapping": {
        "host-name": {
            "remote-controller.tld": {
                "inet": ["192.168.178.2"]
            }
        }
    }
},
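For anyone who already has a `"system"` section in their config.gateway.json, the fragment has to be merged into it rather than pasted as a second `"system"` key. A minimal sketch of such a merge is shown here; the helper names `deep_merge` and `host_mapping` and the `existing-key` entry are made up for illustration and only stand in for a real configuration.

```python
import json


def deep_merge(base: dict, extra: dict) -> dict:
    """Recursively merge `extra` into a copy of `base`; `extra` wins on conflicts."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def host_mapping(hostname: str, lan_ip: str) -> dict:
    """Build the static-host-mapping fragment for one host name."""
    return {
        "system": {
            "static-host-mapping": {
                "host-name": {hostname: {"inet": [lan_ip]}}
            }
        }
    }


# Hypothetical existing config, standing in for a real config.gateway.json.
existing = json.loads('{"system": {"existing-key": "existing-value"}}')
merged = deep_merge(existing, host_mapping("remote-controller.tld", "192.168.178.2"))
print(json.dumps(merged, indent=4))
```

The merged result keeps whatever was already under `"system"` and adds the host mapping next to it, which is the shape the controller expects when it provisions the USG.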

I'm using the remote-controller.tld host name that is configured in the controller (Settings -> System -> Advanced -> Inform Host -> Override: remote-controller.tld). It won't work with a direct IP address, as a hosts entry cannot redirect that. I'm using a simple LAN webserver running on port 8080 that uses mod_rewrite to proxy incoming traffic to my remote controller. The content of the .htaccess:

RewriteEngine on
RewriteRule ^(.*)$ http://remote-controller.tld:8080/$1 [P]

The USG will do the set-inform to http://remote-controller.tld:8080/inform (which, because of the edited entry in /etc/hosts, is actually the self-hosted webserver on the LAN). Because the webserver is not using the USG as DNS, it resolves remote-controller.tld to the actual IP of the remote controller. In my case, the set-inform then succeeds. That's all.

I'm aware this still won't fix it for those who don't have a local webserver, but in my case it works: the sites I manage in my controller all have some sort of LAN device (a QNAP NAS, for example) that is able to run a webserver. If you are using the USG as primary DNS server there are still enough options to get it working, but it might take a bit more effort.

sAnexeh avatar Mar 27 '23 15:03 sAnexeh

@sAnexeh This solution works like a charm.

However, I used Nginx instead of Apache with mod_rewrite. Both options give the same result, so it should not matter which one is used.

Nginx configuration example:

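server {
    listen       8080;
    listen  [::]:8080;
    # Public resolver, so the controller's name is not looked up via the USG
    resolver 8.8.8.8;
    location /inform {
        # A variable in proxy_pass makes nginx resolve the host name at request time
        proxy_pass http://remote-controller.domain.tld:8080$request_uri;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}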

kriegsmanj avatar May 31 '23 12:05 kriegsmanj