
[Bug] After migrating from v3.5.1.1 to 3.6.0 edge fails to start

Open AndreMaz opened this issue 1 year ago • 6 comments

Describe the bug

After migrating TB-Cloud to v3.6.0 (went fine, no errors in the logs) and then migrating TB-Edge to v3.6.0, I started to see the following error in the logs:

 2023-09-24 14:04:05,945 [SpringApplicationShutdownHook] INFO  o.t.s.s.q.DefaultTbRuleEngineConsumerService - [SequentialByOriginator] Removing consumer for topic: TopicPartitionInfo(topic=tb_rule_engine.sq, tenantId=Optional[13814000-1dd2-11b2-8080], partition=Optional[3], fullTopicName=tb_rule_engine.sq.3, myPartition=true)
Exception in thread "ts-service-ts-callback-25-thread-1" java.util.concurrent.RejectedExecutionException
    at java.base/java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:1880)
    at java.base/java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:1921)
    at java.base/java.util.concurrent.ForkJoinPool.execute(ForkJoinPool.java:2453)
    at org.thingsboard.server.actors.TbActorMailbox.tryProcessQueue(TbActorMailbox.java:150)
    at org.thingsboard.server.actors.TbActorMailbox.enqueue(TbActorMailbox.java:128)
    at org.thingsboard.server.actors.TbActorMailbox.tell(TbActorMailbox.java:265)
    at org.thingsboard.server.actors.ruleChain.DefaultTbContext.tellNext(DefaultTbContext.java:193)
    at org.thingsboard.server.actors.ruleChain.DefaultTbContext.tellSuccess(DefaultTbContext.java:175)
    at org.thingsboard.rule.engine.telemetry.TelemetryNodeCallback.onSuccess(TelemetryNodeCallback.java:50)
    at org.thingsboard.rule.engine.telemetry.TelemetryNodeCallback.onSuccess(TelemetryNodeCallback.java:43)
    at org.thingsboard.server.service.telemetry.DefaultTelemetrySubscriptionService$4.onSuccess(DefaultTelemetrySubscriptionService.java:441)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1138)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
2023-09-24 14:04:07,463 [SpringApplicationShutdownHook] INFO  o.t.s.a.service.DefaultActorService - Actor system stopped.
2023-09-24 14:04:07,498 [sql-queue-0-cloud events-23-thread-1] INFO  o.t.s.dao.sql.TbSqlBlockingQueue - [Cloud Events] Queue polling was interrupted

Migration logs show that everything is fine

Starting ThingsBoard Edge upgrade ...
  ______    __      _                              ____                               __
 /_  __/   / /_    (_)   ____    ____ _   _____   / __ )  ____   ____ _   _____  ____/ /
  / /     / __ \  / /   / __ \  / __ `/  / ___/  / __  | / __ \ / __ `/  / ___/ / __  /
 / /     / / / / / /   / / / / / /_/ /  (__  )  / /_/ / / /_/ // /_/ /  / /    / /_/ /
/_/     /_/ /_/ /_/   /_/ /_/  \__, /  /____/  /_____/  \____/ \__,_/  /_/     \__,_/
                              /____/

 ===================================================
 :: ThingsBoard Edge PE ::       (v3.6.0EDGEPE)
 ===================================================

Starting ThingsBoard Edge Upgrade from version 3.5.1 ...
Upgrading ThingsBoard from version 3.5.1 to 3.6.0 ...
Updating schema ...
relation "idx_edge_event_id" already exists, skipping
Schema updated to version 3.6.0.
Updating data from version 3.5.1 to 3.6.0 ...
Integration rate limits updater: 0 total entities updated.
Starting edge events migration - adding seq_id column. Can be skipped with TB_SKIP_EDGE_EVENTS_MIGRATION env variable set to true
Tenants edge full sync required updater: 1 total entities updated.
Updating schema ...
relation "entity_group" already exists, skipping
relation "converter" already exists, skipping
relation "integration" already exists, skipping
relation "scheduler_event" already exists, skipping
relation "blob_entity" already exists, skipping
relation "role" already exists, skipping
relation "group_permission" already exists, skipping
relation "device_group_ota_package" already exists, skipping
relation "converter_debug_event" already exists, skipping
relation "integration_debug_event" already exists, skipping
relation "raw_data_event" already exists, skipping
relation "white_labeling" already exists, skipping
relation "idx_entity_group_by_type_name_and_owner_id" already exists, skipping
relation "idx_converter_external_id" already exists, skipping
relation "idx_integration_external_id" already exists, skipping
relation "idx_role_external_id" already exists, skipping
relation "idx_entity_group_external_id" already exists, skipping
relation "idx_converter_debug_event_main" already exists, skipping
relation "idx_integration_debug_event_main" already exists, skipping
relation "idx_raw_data_event_main" already exists, skipping
Schema updated.
Installing SQL DataBase schema views and functions: schema-views-and-functions.sql
Successfully executed query: DROP VIEW IF EXISTS device_info_view CASCADE;
Successfully executed query: CREATE OR REPLACE VIEW device_info_view AS SELECT * FROM device_info_active_attribute_view;
Updating data ...
Upgrade finished successfully!

AndreMaz avatar Sep 24 '23 14:09 AndreMaz

@AndreMaz

Hello, could you please attach the complete TB Edge log file, starting from application startup? Additionally, could you please check the PostgreSQL logs for any errors?

volodymyr-babak avatar Sep 24 '23 14:09 volodymyr-babak

Hi @volodymyr-babak sorry for the delay.

Checked the logs again and found this:

2023-09-27 08:25:40,443 [grpc-default-executor-0] ERROR o.t.license.client.TbLicenseClient - License Error: ACTIVE_INSTANCES_CAPACITY_EXCEEDED(104) - Active instances capacity exceeded!
2023-09-27 08:25:40,443 [grpc-default-executor-0] ERROR o.t.license.client.TbLicenseClient - Failed to initialize ThingsBoard License Client!
2023-09-27 08:25:40,448 [grpc-default-executor-0] ERROR o.t.s.d.s.BasicSubscriptionService - Failed to init license client
org.thingsboard.license.shared.exception.LicenseException: Active instances capacity exceeded!
2023-09-27 08:25:40,450 [Shutdown Thread] INFO  o.t.s.d.s.BasicSubscriptionService - Terminating application due to critical License Error ACTIVE_INSTANCES_CAPACITY_EXCEEDED(104), exit code [-1]...
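
For anyone hitting the same thing: a minimal Python sketch for pulling the ERROR entries out of a TB Edge log, assuming the line layout seen in the excerpts in this thread (`timestamp [thread] LEVEL logger - message`). The regex and the `error_lines` helper are illustrative assumptions, not part of ThingsBoard.

```python
import re

# Assumed layout, taken from the excerpts in this thread:
#   2023-09-27 08:25:40,443 [thread name] LEVEL  logger - message
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>.+?)\] (?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)


def error_lines(lines):
    """Return (timestamp, logger, message) for every ERROR-level entry.

    Lines that do not match the layout (stack traces, banners) are skipped.
    """
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            found.append((match.group("ts"), match.group("logger"), match.group("msg")))
    return found


# Sample input copied from the log excerpt above.
SAMPLE = [
    "2023-09-27 08:25:40,443 [grpc-default-executor-0] ERROR o.t.license.client.TbLicenseClient - License Error: ACTIVE_INSTANCES_CAPACITY_EXCEEDED(104) - Active instances capacity exceeded!",
    "2023-09-27 08:25:40,443 [grpc-default-executor-0] ERROR o.t.license.client.TbLicenseClient - Failed to initialize ThingsBoard License Client!",
    "2023-09-27 08:25:40,448 [grpc-default-executor-0] ERROR o.t.s.d.s.BasicSubscriptionService - Failed to init license client",
    "org.thingsboard.license.shared.exception.LicenseException: Active instances capacity exceeded!",
    "2023-09-27 08:25:40,450 [Shutdown Thread] INFO  o.t.s.d.s.BasicSubscriptionService - Terminating application due to critical License Error ACTIVE_INSTANCES_CAPACITY_EXCEEDED(104), exit code [-1]...",
]

if __name__ == "__main__":
    for ts, logger, msg in error_lines(SAMPLE):
        print(ts, logger, msg)
```

To run it against a real file, pass an open file handle instead of `SAMPLE`, e.g. `error_lines(open(path))`, where `path` is wherever your installation writes the Edge log.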

The weird part is that I literally have only one TB-edge instance, so I don't really get how I can exceed the active capacity.

In an attempt to fix this, I deactivated the previous instance, removed the instance-edge-license.data file, and then started tb-edge again. It created a new instance (active state in the image :point_down:)

image

But TB still complains about exceeding the capacity.

AndreMaz avatar Sep 27 '23 08:09 AndreMaz

@AndreMaz,

Do you have an account on https://thingsboard-portal.atlassian.net/servicedesk/customer/portals? If so, could you please create a ticket in the TB Service Desk system so we can continue our discussion there? I would like to obtain your license and possibly other information to troubleshoot this problem.

volodymyr-babak avatar Sep 27 '23 09:09 volodymyr-babak

Yep, I have

Which topic should I choose? Tech Support?

image

AndreMaz avatar Sep 27 '23 09:09 AndreMaz

Yes, Tech Support should be fine.

volodymyr-babak avatar Sep 27 '23 09:09 volodymyr-babak

Done, it's CP-10857

AndreMaz avatar Sep 27 '23 09:09 AndreMaz