PhoenixGo
What's the performance difference after TensorRT is enabled?
Nvidia says the TensorRT GIE should improve inference performance considerably. But I've never seen a report on the improvement when TensorRT is applied on GTX 10xx GPUs, especially the GTX 1080 Ti. Could anyone tell me how much performance PhoenixGo gains with TensorRT enabled versus disabled on a 1080 Ti?
After enabling TensorRT, the usage of a 1080 Ti rises from ~75% to ~95%, so I think it works fine on the 1080 Ti.
A 30% performance improvement on a P40. Since I don't have a 1080 Ti, I can't test it for you.
How do I enable the TensorRT GIE to raise usage?
@godmoves does it beat the ELF weights at 95%?
@godmoves https://www.youtube.com/watch?v=xboKiwywEfM (2:00)
My current performance is 30% with these settings. What should I change to get 75%?
num_eval_threads: 2
num_search_threads: 12
max_children_per_node: 512
max_search_tree_size: 2000000000
timeout_ms_per_step: 20000
max_simulations_per_step: 0
eval_batch_size: 4
eval_wait_batch_timeout_us: 100
model_config {
  train_dir: "ckpt"
}
gpu_list: "0,1"
c_puct: 2.5
virtual_loss: 1.0
enable_resign: 0
v_resign: -0.9
enable_dirichlet_noise: 0
dirichlet_noise_alpha: 0.03
dirichlet_noise_ratio: 0.25
monitor_log_every_ms: 0
get_best_move_mode: 0
enable_background_search: 0
enable_policy_temperature: 0
policy_temperature: 0.67
inherit_default_act: 1
early_stop {
  enable: 1
  check_every_ms: 100
  sims_factor: 1.0
  sims_threshold: 2000
}
unstable_overtime {
  enable: 1
  time_factor: 0.3
}
behind_overtime {
  enable: 1
  act_threshold: 0.0
  time_factor: 0.3
}
time_control {
  enable: 1
  c_denom: 20
  c_maxply: 40
  reserved_time: 1.0
}
Around the same on a Tesla P100 (compared with the 30% improvement reported on the P40 above).
25 sec/move at 5000 sims per move with TensorRT 3.0.4 (deb install) on Ubuntu 16.04, CUDA 9.0, cuDNN 7.0.5 (deb installs).
15-20% on Ubuntu 16.04 LTS with a GTX 1060; see the speed benchmark in the wiki.
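To make timings like the one above easier to compare with the percentage figures in this thread, here is a minimal Python sketch that converts seconds per move into simulations per second. The function names are hypothetical helpers, not part of PhoenixGo, and the second function assumes that an "X% improvement" means an X% gain in simulation throughput, which the posters did not specify.

```python
def sims_per_second(sims_per_move: int, seconds_per_move: float) -> float:
    """Average simulation throughput for one move."""
    if seconds_per_move <= 0:
        raise ValueError("seconds_per_move must be positive")
    return sims_per_move / seconds_per_move


def seconds_after_speedup(seconds_per_move: float, speedup_fraction: float) -> float:
    """Time per move for the same simulation count if throughput rises
    by speedup_fraction (0.30 means +30%). This interpretation of
    "improvement" is an assumption, not something stated in the thread."""
    if speedup_fraction <= -1.0:
        raise ValueError("speedup_fraction must be greater than -1")
    return seconds_per_move / (1.0 + speedup_fraction)


if __name__ == "__main__":
    # Figures reported above: 5000 sims per move in 25 seconds.
    print(sims_per_second(5000, 25))
```

With the reported numbers, 5000 sims in 25 seconds works out to 200 sims per second, which gives a single figure to compare across GPUs and TensorRT settings.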