PhoenixGo

What's the performance difference after TensorRT is enabled?

Open Godady opened this issue 6 years ago • 7 comments

Nvidia says the TensorRT GIE greatly improves inference performance. But I've never seen a report of the improvement when TensorRT is used on GTX 10xx GPUs, especially the GTX 1080 Ti. Could anyone tell me how much performance PhoenixGo gains with TensorRT enabled versus disabled on a 1080 Ti?

Godady avatar May 17 '18 13:05 Godady

After enabling TensorRT, GPU usage on a 1080 Ti rises from ~75% to ~95%, so I think it works fine on the 1080 Ti.

godmoves avatar May 17 '18 14:05 godmoves

30% performance improvement on P40. Since I don't have a 1080 Ti, I can't test it for you.

wodesuck avatar May 18 '18 07:05 wodesuck

How do I enable this TensorRT GIE to raise usage?

baduk1 avatar Jun 17 '18 23:06 baduk1

@godmoves does it win against ELF weights at 95%?

baduk1 avatar Jun 18 '18 04:06 baduk1

@godmoves https://www.youtube.com/watch?v=xboKiwywEfM (2:00) My current performance is 30% on these settings. What should I change to get 75%?

num_eval_threads: 2
num_search_threads: 12
max_children_per_node: 512
max_search_tree_size: 2000000000
timeout_ms_per_step: 20000
max_simulations_per_step: 0
eval_batch_size: 4
eval_wait_batch_timeout_us: 100
model_config {
  train_dir: "ckpt"
}
gpu_list: "0,1"
c_puct: 2.5
virtual_loss: 1.0
enable_resign: 0
v_resign: -0.9
enable_dirichlet_noise: 0
dirichlet_noise_alpha: 0.03
dirichlet_noise_ratio: 0.25
monitor_log_every_ms: 0
get_best_move_mode: 0
enable_background_search: 0
enable_policy_temperature: 0
policy_temperature: 0.67
inherit_default_act: 1
early_stop {
  enable: 1
  check_every_ms: 100
  sims_factor: 1.0
  sims_threshold: 2000
}
unstable_overtime {
  enable: 1
  time_factor: 0.3
}
behind_overtime {
  enable: 1
  act_threshold: 0.0
  time_factor: 0.3
}
time_control {
  enable: 1
  c_denom: 20
  c_maxply: 40
  reserved_time: 1.0
}

baduk1 avatar Jun 18 '18 18:06 baduk1

30% performance improved on P40. Since I don't have a 1080ti, I can't test for you.

Around the same on a Tesla P100.

25 sec/move at 5000 sims per move with TensorRT 3.0.4 (deb install) on Ubuntu 16.04, CUDA 9.0, cuDNN 7.0.5 (deb installs).

wonderingabout avatar Dec 05 '18 06:12 wonderingabout

15-20% on Ubuntu 16.04 LTS with a GTX 1060; see the wiki speed benchmark.

wonderingabout avatar Dec 14 '18 21:12 wonderingabout
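
For anyone comparing the numbers in this thread, here is a rough sketch of the arithmetic behind them. It takes the Tesla P100 report (25 sec/move at 5000 sims per move with TensorRT) and converts it to simulations per second, then estimates what the time per move might have been without TensorRT. The big assumption is that "30% improvement" means a 30% gain in throughput (sims per second); nobody in the thread states that definition, and the function names are made up for illustration.

```python
def sims_per_second(sims_per_move: int, seconds_per_move: float) -> float:
    """Search throughput implied by a sims-per-move and time-per-move pair."""
    return sims_per_move / seconds_per_move


def baseline_seconds_per_move(seconds_with_trt: float, gain: float) -> float:
    """Estimated time per move without TensorRT.

    ASSUMPTION: `gain` is a throughput gain, i.e.
    rate_with_trt = rate_without_trt * (1 + gain).
    For a fixed number of sims, time scales inversely with rate.
    """
    return seconds_with_trt * (1.0 + gain)


if __name__ == "__main__":
    # Figures reported above for the Tesla P100 with TensorRT 3.0.4.
    rate = sims_per_second(5000, 25.0)
    print(f"with TensorRT: {rate:.0f} sims/sec")

    # Hypothetical baseline under the throughput-gain assumption.
    base = baseline_seconds_per_move(25.0, 0.30)
    print(f"estimated without TensorRT: {base:.1f} sec/move")
```

If the 30% figure was instead meant as a 30% reduction in time per move, the baseline would be 25 / 0.7 sec/move rather than 25 * 1.3, so treat the estimate as a ballpark only.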