[CLI]: Wandb finish hanging and 500 Server Error in debug-internal.log

Open airlsyn opened this issue 2 years ago • 21 comments

Describe the bug

After calling run.log({"a": 99.0, "c": 85.0, "custom_step": 1000}, step=None) and then closing the run with run.finish(), the process hangs for a long time. The following warning and upload progress messages are displayed while it hangs:

wandb: WARNING No requirements.txt found, not creating job artifact. See https://docs.wandb.ai/guides/launch/create-job
wandb: | 0.001 MB of 0.002 MB uploaded
wandb: / 0.001 MB of 0.002 MB uploaded
wandb: - 0.001 MB of 0.002 MB uploaded
wandb: \ 0.001 MB of 0.002 MB uploaded
wandb: | 0.001 MB of 0.002 MB uploaded
wandb: / 0.001 MB of 0.002 MB uploaded

Additionally, debug-internal.log repeatedly shows the following warning from the file stream thread: [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url.
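To make the retry pattern easier to see, here is a minimal sketch that scans debug-internal.log lines for the retryable-exception warning and reports the time between attempts. It assumes the line layout shown in this issue (timestamp, level, thread, message). The names `find_retry_warnings`, `retry_gaps`, and `SAMPLE` are hypothetical helpers for illustration and are not part of wandb; the sample lines are shortened copies of lines from the attached log.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the log in this issue:
# "<YYYY-MM-DD HH:MM:SS,mmm> <LEVEL> <thread> [<file>:<func>():<line>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(?P<level>[A-Z]+)\s+(?P<rest>.*)$"
)
# Captures the HTTP status code that follows "retryable exception:".
STATUS_RE = re.compile(r"retryable exception: (?P<status>\d{3}) ")

# Shortened copies of lines from the attached debug-internal.log.
SAMPLE = [
    "2024-01-08 15:42:09,746 WARNING FileStreamThread:130488 "
    "[file_stream.py:request_with_retry():668] requests_with_retry encountered "
    "retryable exception: 500 Server Error: Internal Server Error for url: "
    "https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream.",
    "2024-01-08 15:42:10,965 DEBUG   HandlerThread:130488 "
    "[handler.py:handle_request():146] handle_request: keepalive",
    "2024-01-08 15:42:18,433 WARNING FileStreamThread:130488 "
    "[file_stream.py:request_with_retry():668] requests_with_retry encountered "
    "retryable exception: 500 Server Error: Internal Server Error for url: "
    "https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream.",
]


def find_retry_warnings(lines):
    """Return a list of (timestamp, http_status) for each retry warning."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group("level") != "WARNING":
            continue
        status = STATUS_RE.search(match.group("rest"))
        if status:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            hits.append((ts, int(status.group("status"))))
    return hits


def retry_gaps(hits):
    """Return the seconds elapsed between consecutive retry warnings."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(hits, hits[1:])]


if __name__ == "__main__":
    found = find_retry_warnings(SAMPLE)
    for ts, status in found:
        print(ts.isoformat(), status)
    print("gaps (s):", retry_gaps(found))
```

To run it against a real file, pass the open file object instead of `SAMPLE`, for example `find_retry_warnings(open("debug-internal.log", encoding="utf-8"))`. Growing gaps between warnings would be consistent with the client retrying the same file_stream request with backoff while run.finish() waits.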

Additional Files

debug-internal.log

2024-01-08 15:40:44,462 INFO    StreamThr :130488 [internal.py:wandb_internal():86] W&B internal server running at pid: 130488, started at: 2024-01-08 15:40:44.462434
2024-01-08 15:40:44,464 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status
2024-01-08 15:40:44,465 INFO    WriterThread:130488 [datastore.py:open_for_write():85] open: ~/monitor_cache/wandb/run-20240108_154044-debug06/run-debug06.wandb
2024-01-08 15:40:44,467 DEBUG   SenderThread:130488 [sender.py:send():382] send: header
2024-01-08 15:40:44,467 DEBUG   SenderThread:130488 [sender.py:send():382] send: run
2024-01-08 15:40:44,470 INFO    SenderThread:130488 [sender.py:_maybe_setup_resume():763] checking resume status for None/unknown_project-0/debug06
2024-01-08 15:40:44,611 INFO    SenderThread:130488 [sender.py:_maybe_setup_resume():841] configured resuming with: ResumeState(resumed=True,step=1,history=3,events=0,output=17,runtime=4,wandb_runtime=4,summary={'_wandb': {'runtime': 4}},config={'gpus': {'desc': None, 'value': 0}, '_wandb': {'desc': None, 'value': {'t': {'1': [1, 55], '2': [1, 55], '3': [5, 13, 14, 16, 23], '4': '3.10.13', '5': '0.16.1', '8': [5, 13], '13': 'linux-x86_64'}, 'framework': 'torch', 'start_time': 1704699597.47977, 'cli_version': '0.16.1', 'is_jupyter_run': False, 'python_version': '3.10.13', 'is_kaggle_kernel': False}}, 'job_id': {'desc': None, 'value': '0'}, 'project_id': {'desc': None, 'value': '0'}, 'project_name': {'desc': None, 'value': 'unknown_project'}},tags=[])
2024-01-08 15:40:44,648 INFO    SenderThread:130488 [dir_watcher.py:__init__():211] watching files in: ~/monitor_cache/wandb/run-20240108_154044-debug06/files
2024-01-08 15:40:44,649 INFO    SenderThread:130488 [sender.py:_start_run_threads():1136] run started: debug06 with start time 1704699640.462201
2024-01-08 15:40:44,651 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: summary_record
2024-01-08 15:40:44,651 INFO    SenderThread:130488 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-01-08 15:40:44,660 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: check_version
2024-01-08 15:40:44,660 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: check_version
2024-01-08 15:40:45,650 INFO    Thread-12 :130488 [dir_watcher.py:_on_file_created():271] file/dir created: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/wandb-summary.json
2024-01-08 15:40:49,661 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:40:49,661 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:40:49,668 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: run_start
2024-01-08 15:40:49,669 DEBUG   HandlerThread:130488 [system_info.py:__init__():32] System info init
2024-01-08 15:40:49,669 DEBUG   HandlerThread:130488 [system_info.py:__init__():47] System info init done
2024-01-08 15:40:49,677 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: stop_status
2024-01-08 15:40:49,677 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: stop_status
2024-01-08 15:40:49,679 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: internal_messages
2024-01-08 15:40:49,795 DEBUG   SenderThread:130488 [sender.py:send():382] send: telemetry
2024-01-08 15:40:50,650 INFO    Thread-12 :130488 [dir_watcher.py:_on_file_created():271] file/dir created: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/output.log
2024-01-08 15:40:54,796 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:40:58,652 INFO    Thread-12 :130488 [dir_watcher.py:_on_file_modified():288] file/dir modified: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/output.log
2024-01-08 15:41:00,261 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:04,677 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: stop_status
2024-01-08 15:41:04,677 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: internal_messages
2024-01-08 15:41:04,678 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: stop_status
2024-01-08 15:41:05,801 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:10,802 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:16,028 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:16,655 INFO    Thread-12 :130488 [dir_watcher.py:_on_file_modified():288] file/dir modified: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/config.yaml
2024-01-08 15:41:19,677 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: stop_status
2024-01-08 15:41:19,677 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: internal_messages
2024-01-08 15:41:19,678 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: stop_status
2024-01-08 15:41:21,794 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:22,876 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: partial_history
2024-01-08 15:41:26,877 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:31,878 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:34,678 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: stop_status
2024-01-08 15:41:34,678 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: internal_messages
2024-01-08 15:41:34,678 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: stop_status
2024-01-08 15:41:37,795 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:42,795 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:47,796 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:49,677 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: stop_status
2024-01-08 15:41:49,678 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: stop_status
2024-01-08 15:41:49,720 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: internal_messages
2024-01-08 15:41:53,793 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:41:58,794 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:42:01,963 DEBUG   SenderThread:130488 [sender.py:send():382] send: telemetry
2024-01-08 15:42:01,963 DEBUG   SenderThread:130488 [sender.py:send():382] send: exit
2024-01-08 15:42:01,963 INFO    SenderThread:130488 [sender.py:send_exit():589] handling exit code: 0
2024-01-08 15:42:01,963 INFO    SenderThread:130488 [sender.py:send_exit():591] handling runtime: 76
2024-01-08 15:42:01,964 INFO    SenderThread:130488 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-01-08 15:42:01,965 INFO    SenderThread:130488 [sender.py:send_exit():597] send defer
2024-01-08 15:42:01,965 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:01,965 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 0
2024-01-08 15:42:01,965 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:01,965 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 0
2024-01-08 15:42:01,965 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 1
2024-01-08 15:42:01,965 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:01,965 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 1
2024-01-08 15:42:01,965 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:01,965 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 1
2024-01-08 15:42:01,966 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 2
2024-01-08 15:42:01,966 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:01,966 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 2
2024-01-08 15:42:01,966 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:01,966 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 2
2024-01-08 15:42:01,966 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 3
2024-01-08 15:42:01,966 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:01,966 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 3
2024-01-08 15:42:01,969 DEBUG   SenderThread:130488 [sender.py:send():382] send: history
2024-01-08 15:42:01,969 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: summary_record
2024-01-08 15:42:01,969 INFO    SenderThread:130488 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-01-08 15:42:01,969 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:01,970 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 3
2024-01-08 15:42:01,970 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 4
2024-01-08 15:42:01,970 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:01,970 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 4
2024-01-08 15:42:01,970 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:01,970 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 4
2024-01-08 15:42:01,970 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 5
2024-01-08 15:42:01,970 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:01,970 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 5
2024-01-08 15:42:01,970 DEBUG   SenderThread:130488 [sender.py:send():382] send: summary
2024-01-08 15:42:01,971 INFO    SenderThread:130488 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-01-08 15:42:01,971 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:01,971 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 5
2024-01-08 15:42:01,971 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 6
2024-01-08 15:42:01,971 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:01,971 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 6
2024-01-08 15:42:01,971 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:01,971 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 6
2024-01-08 15:42:01,976 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: status_report
2024-01-08 15:42:01,994 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 7
2024-01-08 15:42:01,995 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:01,995 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 7
2024-01-08 15:42:01,995 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:01,995 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 7
2024-01-08 15:42:02,664 INFO    Thread-12 :130488 [dir_watcher.py:_on_file_modified():288] file/dir modified: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/config.yaml
2024-01-08 15:42:02,665 INFO    Thread-12 :130488 [dir_watcher.py:_on_file_modified():288] file/dir modified: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/wandb-summary.json
2024-01-08 15:42:02,963 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: poll_exit
2024-01-08 15:42:03,870 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 8
2024-01-08 15:42:03,870 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: poll_exit
2024-01-08 15:42:03,870 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:03,870 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 8
2024-01-08 15:42:03,871 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:03,871 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 8
2024-01-08 15:42:03,871 INFO    SenderThread:130488 [job_builder.py:build():298] Attempting to build job artifact
2024-01-08 15:42:03,871 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 9
2024-01-08 15:42:03,871 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:03,871 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 9
2024-01-08 15:42:03,871 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:03,871 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 9
2024-01-08 15:42:03,871 INFO    SenderThread:130488 [dir_watcher.py:finish():358] shutting down directory watcher
2024-01-08 15:42:03,963 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: poll_exit
2024-01-08 15:42:04,665 INFO    Thread-12 :130488 [dir_watcher.py:_on_file_modified():288] file/dir modified: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/output.log
2024-01-08 15:42:04,665 INFO    SenderThread:130488 [dir_watcher.py:finish():388] scan: ~/monitor_cache/wandb/run-20240108_154044-debug06/files
2024-01-08 15:42:04,665 INFO    SenderThread:130488 [dir_watcher.py:finish():402] scan save: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/output.log output.log
2024-01-08 15:42:04,666 INFO    SenderThread:130488 [dir_watcher.py:finish():402] scan save: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/config.yaml config.yaml
2024-01-08 15:42:04,668 INFO    SenderThread:130488 [dir_watcher.py:finish():402] scan save: ~/monitor_cache/wandb/run-20240108_154044-debug06/files/wandb-summary.json wandb-summary.json
2024-01-08 15:42:04,670 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 10
2024-01-08 15:42:04,670 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: poll_exit
2024-01-08 15:42:04,670 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:04,673 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 10
2024-01-08 15:42:04,674 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:04,674 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 10
2024-01-08 15:42:04,674 INFO    SenderThread:130488 [file_pusher.py:finish():175] shutting down file pusher
2024-01-08 15:42:04,734 INFO    wandb-upload_0:130488 [upload_job.py:push():131] Uploaded file ~/monitor_cache/wandb/run-20240108_154044-debug06/files/output.log
2024-01-08 15:42:04,748 INFO    wandb-upload_2:130488 [upload_job.py:push():131] Uploaded file ~/monitor_cache/wandb/run-20240108_154044-debug06/files/wandb-summary.json
2024-01-08 15:42:04,773 INFO    wandb-upload_1:130488 [upload_job.py:push():131] Uploaded file ~/monitor_cache/wandb/run-20240108_154044-debug06/files/config.yaml
2024-01-08 15:42:04,964 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: poll_exit
2024-01-08 15:42:04,964 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: poll_exit
2024-01-08 15:42:04,973 INFO    Thread-11 (_thread_body):130488 [sender.py:transition_state():617] send defer: 11
2024-01-08 15:42:04,974 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:04,974 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 11
2024-01-08 15:42:04,974 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:04,974 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 11
2024-01-08 15:42:04,974 INFO    SenderThread:130488 [file_pusher.py:join():181] waiting for file pusher
2024-01-08 15:42:04,974 INFO    SenderThread:130488 [sender.py:transition_state():617] send defer: 12
2024-01-08 15:42:04,974 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: defer
2024-01-08 15:42:04,974 INFO    HandlerThread:130488 [handler.py:handle_request_defer():172] handle defer: 12
2024-01-08 15:42:04,974 DEBUG   SenderThread:130488 [sender.py:send_request():409] send_request: defer
2024-01-08 15:42:04,974 INFO    SenderThread:130488 [sender.py:send_request_defer():613] handle sender defer: 12
2024-01-08 15:42:04,974 INFO    SenderThread:130488 [file_stream.py:finish():595] file stream finish called
2024-01-08 15:42:05,964 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: poll_exit
2024-01-08 15:42:09,746 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}
2024-01-08 15:42:10,965 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:15,966 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:18,433 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}
2024-01-08 15:42:20,967 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:25,967 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:28,870 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}
2024-01-08 15:42:30,968 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:35,969 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:40,970 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:45,058 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}
2024-01-08 15:42:45,970 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:50,971 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:42:55,972 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:00,973 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:05,974 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:09,718 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}
2024-01-08 15:43:10,974 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:15,975 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:20,976 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:25,976 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:30,977 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:35,978 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:40,979 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:45,980 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:50,980 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:43:51,420 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}
2024-01-08 15:43:55,981 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:00,982 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:05,983 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:10,984 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:15,985 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:20,985 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:25,986 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:30,987 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:35,989 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:40,990 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:45,991 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:50,992 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:44:55,993 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:00,994 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:05,995 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:10,996 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:14,691 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}
2024-01-08 15:45:15,997 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:20,998 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:25,999 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:31,000 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:36,001 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:41,002 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:46,002 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:51,003 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:45:56,004 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:01,005 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:06,006 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:11,006 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:16,007 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:21,008 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:26,009 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:31,010 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:36,010 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:41,011 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:46,012 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:51,013 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:46:56,014 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:01,014 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:06,015 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:11,016 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:16,017 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:21,018 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:26,019 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:31,020 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:36,022 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:41,023 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:46,024 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:48,346 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}
2024-01-08 15:47:51,025 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:47:56,026 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:01,027 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:06,028 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:11,029 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:16,030 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:21,031 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:26,032 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:31,033 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:36,034 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:41,035 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:46,036 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:51,037 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:48:56,038 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:01,040 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:06,041 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:11,042 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:16,043 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:21,044 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:26,045 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:31,046 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:36,047 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:41,048 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:46,049 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:51,051 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:49:56,052 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:01,053 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:06,055 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:11,056 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:16,058 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:21,059 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:26,060 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:31,061 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:36,062 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:41,063 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:46,064 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:51,065 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:50:56,066 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:51:01,067 DEBUG   HandlerThread:130488 [handler.py:handle_request():146] handle_request: keepalive
2024-01-08 15:52:40,005 WARNING FileStreamThread:130488 [file_stream.py:request_with_retry():668] requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7fe2fd9c2200>>, timeout=180), args: ('https://xxx.yyy.com/files/llm/unknown_project-0/debug06/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_wandb": {"runtime": 76}, "ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, "ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}, 'wandb-history.jsonl': {'offset': 3, 'content': ['{"ppl-acc-0-shot/\\u6307\\u4ee3\\u63a8\\u7406": 28.909465020576132, 
"ppl-rank-rr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 3917.45243318773, "ppl-rank-mrr-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 20.55172431014579, "ppl-rank-map-0-shot/\\u4fe1\\u606f\\u62bd\\u53d6": 21.125217227796206, "ppl-acc-0-shot/\\u6bb5\\u843d\\u7406\\u89e3": 25.708502024291498, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part1": 13.88888888888889, "ppl-acc-0-shot/\\u7406\\u89e3\\u63a8\\u7406-part2": 64.28571428571429, "ppl-acc-0-shot/\\u60c5\\u611f\\u5206\\u7c7b": 41.73913043478261, "ppl-acc-0-shot/\\u8bcd\\u8bed\\u7406\\u89e3": 23.52941176470588, "ppl-acc-0-shot/\\u53e5\\u6cd5\\u8bed\\u6cd5": 21.59090909090909, "ppl-rank-rr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 29460.448915466917, "ppl-rank-mrr-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-rank-map-0-shot/\\u4e3b\\u9898\\u5206\\u7c7b": 18.029650499061763, "ppl-acc-0-shot/boolq": 54.862385321100916, "ppl-acc-0-shot/afqmc": 38.484708063021316, "ppl-acc-0-shot/wic": 48.275862068965516, "ppl-acc-0-shot/eprstmt": 50.0, "ppl-acc-0-shot/ax-b": 54.710144927536234, "ppl-acc-0-shot/ax-g": 50.56179775280899, "ppl-acc-0-shot/rte": 51.985559566786996, "custom_step": 4000, "_timestamp": 1704699682.8760405, "_runtime": 42.41383934020996, "_step": 3}']}}, 'dropped': 0}}

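To make the retry pattern in the log above easier to see, here is a minimal sketch that computes the time between the six `file_stream` retry warnings visible in this excerpt. The timestamps are copied from the log; the helper name `retry_gaps` is only illustrative and is not part of wandb.

```python
from datetime import datetime

# Timestamps of the six file_stream retry warnings visible in this excerpt.
RETRY_TIMESTAMPS = [
    "2024-01-08 15:42:45,058",
    "2024-01-08 15:43:09,718",
    "2024-01-08 15:43:51,420",
    "2024-01-08 15:45:14,691",
    "2024-01-08 15:47:48,346",
    "2024-01-08 15:52:40,005",
]


def retry_gaps(timestamps):
    """Return the seconds elapsed between consecutive log timestamps."""
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S,%f") for t in timestamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(parsed, parsed[1:])]


gaps = retry_gaps(RETRY_TIMESTAMPS)
for gap in gaps:
    print(f"{gap:.1f} s")
```

The gaps come out at roughly 25, 42, 83, 154 and 292 seconds, so each wait is about twice the previous one. That looks like a backoff-style retry loop (an inference from the timestamps, not something confirmed from the wandb source). It would explain why run.finish() appears to hang for as long as the server keeps answering with 500.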

debug.log

2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_setup.py:_flush():76] Current SDK version is 0.16.1
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_setup.py:_flush():76] Configure stats pid to 130268
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_setup.py:_flush():76] Loading settings from ~/.config/wandb/settings
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': '<python with no main file>'}
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***', 'base_url': 'https://xxx.yyy.com'}
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***'}
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_init.py:_log_setup():524] Logging user logs to ~/monitor_cache/wandb/run-20240108_154044-debug06/logs/debug.log
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_init.py:_log_setup():525] Logging internal logs to ~/monitor_cache/wandb/run-20240108_154044-debug06/logs/debug-internal.log
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_init.py:init():564] calling init triggers
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_init.py:init():571] wandb.init called with sweep_config: {}
config: {'project_name': 'unknown_project', 'project_id': '0', 'job_id': '0', 'gpus': 0}
2024-01-08 15:40:44,459 INFO    MainThread:130268 [wandb_init.py:init():614] starting backend
2024-01-08 15:40:44,460 INFO    MainThread:130268 [wandb_init.py:init():618] setting up manager
2024-01-08 15:40:44,461 INFO    MainThread:130268 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2024-01-08 15:40:44,462 INFO    MainThread:130268 [wandb_init.py:init():624] backend started and connected
2024-01-08 15:40:44,464 INFO    MainThread:130268 [wandb_init.py:init():716] updated telemetry
2024-01-08 15:40:44,465 INFO    MainThread:130268 [wandb_init.py:init():749] communicating run to backend with 90.0 second timeout
2024-01-08 15:40:44,646 INFO    MainThread:130268 [wandb_init.py:init():792] run resumed
2024-01-08 15:40:44,660 INFO    MainThread:130268 [wandb_run.py:_on_init():2254] communicating current version
2024-01-08 15:40:49,662 INFO    MainThread:130268 [wandb_run.py:_on_init():2263] got version response
2024-01-08 15:40:49,662 INFO    MainThread:130268 [wandb_init.py:init():800] starting run threads in backend
2024-01-08 15:40:49,677 INFO    MainThread:130268 [wandb_run.py:_console_start():2233] atexit reg
2024-01-08 15:40:49,677 INFO    MainThread:130268 [wandb_run.py:_redirect():2088] redirect: wrap_raw
2024-01-08 15:40:49,677 INFO    MainThread:130268 [wandb_run.py:_redirect():2153] Wrapping output streams.
2024-01-08 15:40:49,677 INFO    MainThread:130268 [wandb_run.py:_redirect():2178] Redirects installed.
2024-01-08 15:40:49,678 INFO    MainThread:130268 [wandb_init.py:init():841] run started, returning control to user process
2024-01-08 15:42:01,962 INFO    MainThread:130268 [wandb_run.py:_finish():1962] finishing run llm/unknown_project-0/debug06
2024-01-08 15:42:01,962 INFO    MainThread:130268 [wandb_run.py:_atexit_cleanup():2202] got exitcode: 0
2024-01-08 15:42:01,962 INFO    MainThread:130268 [wandb_run.py:_restore():2185] restore
2024-01-08 15:42:01,963 INFO    MainThread:130268 [wandb_run.py:_restore():2191] restore done

Environment

WandB version: 0.16.1
WandB local: latest docker image
OS: Ubuntu 18.04.3 LTS (Bionic Beaver)
Python version: 3.10
Versions of relevant libraries:

Additional Context

No response

airlsyn avatar Jan 08 '24 08:01 airlsyn

Setup:

import wandb

# exp_id and project_name are defined elsewhere in the calling code
run = wandb.init(
    group=exp_id,
    project=project_name,
    name=exp_id,
    config={},
    id=exp_id,
    resume="must",
    reinit=True,
)

When logging with

{
    "ppl-acc-0-shot/boolq": 54.864385321100916,
    "ppl-acc-0-shot/rte": 51.985959566786996,
}

everything works fine. However, calling wandb.log with

{
    "ppl-acc-0-shot/boolq": 54.864385321100916,
    "ppl/中国": 85.00,
}

the following error appears in debug-internal.log:

requests_with_retry encountered retryable exception: 500 Server Error: Internal Server Error for url: https://xxx.yy.com/files/llm/project_name/exp_id/file_stream. func: functools.partial(<bound method Session.post of <requests.sessions.Session object at 0x7f6564663130>>, timeout=180), args: ('https://xxx.yy.com/files/llm/project_name/exp_id/file_stream',), kwargs: {'json': {'files': {'wandb-summary.json': {'offset': 0, 'content': ['{"_step": 9000, "_timestamp": 1704765657.049876, "ppl-acc-0-shot/rte": 51.985559566786996, "ppl-acc-0-shot/boolq": 54.86238532110091, "_wandb": {"runtime": 581}, "ppl/\\u4e2d\\u56fd": 85.0}']}, 'wandb-history.jsonl': {'offset': 4, 'content': ['{"ppl-acc-0-shot/rte": 51.985559566786996, "ppl/\\u4e2d\\u56fd": 85.0, "_timestamp": 1704765657.049876, "_runtime": 486.9714159965515, "_step": 9000}']}}, 'dropped': 0}}
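The only visible difference between the working and the failing payload is the non-ASCII metric name. A small sketch, using only the Python standard library, shows how that key ends up in the `\uXXXX` form seen in the request body and how to list such keys before logging. The helper `non_ascii_keys` is hypothetical, and nothing here establishes what the server does with the key; it only helps narrow down which metric names coincide with the 500 response.

```python
import json


def non_ascii_keys(metrics):
    """Return the metric names that contain characters outside ASCII."""
    return [key for key in metrics if not key.isascii()]


# The failing payload from the report above.
metrics = {"ppl-acc-0-shot/boolq": 54.864385321100916, "ppl/中国": 85.00}

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
# which matches the escaped key in the logged request body.
print(json.dumps({"ppl/中国": 85.0}))  # {"ppl/\u4e2d\u56fd": 85.0}
print(non_ascii_keys(metrics))         # ['ppl/中国']
```

Logging the same values under an ASCII-only name in a fresh test run would be one way to check whether the key itself triggers the error.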

airlsyn avatar Jan 09 '24 02:01 airlsyn

WandBer Could you take a look? Thanks very much.

airlsyn avatar Jan 09 '24 02:01 airlsyn

hey @ericxsun - regarding the first issue, I haven't been able to reproduce this so far with the code you sent, so I have a few troubleshooting suggestions/questions in the meantime:

  • checking your network connectivity: do you have any specific firewalls in place? is your network using a proxy server?
  • running the following:
traceroute api.wandb.ai
ping -c 10 api.wandb.ai

if you are running wandb locally, replace the url with your localhost

  • if you have network logs, I would love to take a look at those as well
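
If traceroute or ping are not available on the machine, a rough equivalent of the connectivity check can be sketched in Python. This only tests whether a TCP connection can be opened; the function name can_connect and the default port 443 are assumptions for illustration, and a local server may listen on a different host and port.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example usage (replace the host with your local server if you run wandb locally):
# print(can_connect("api.wandb.ai"))
```

A successful connection only shows that the host is reachable; it says nothing about whether the server handles a given request correctly.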

regarding the second issue, does the hang occur when calling run.log or run.finish()?

umakrishnaswamy avatar Jan 23 '24 13:01 umakrishnaswamy

@umakrishnaswamy thanks a lot.

To answer your question about the second issue (whether the hang occurs when calling run.log or run.finish()): it is run.log.

If the problem in https://github.com/wandb/wandb/issues/3451 is addressed first (specifically, by adding support for Unicode keys), this issue might be resolved as well.

airlsyn avatar Jan 23 '24 14:01 airlsyn

@ericxsun - I'll be sure to ask to see if there's any progress on the issue, but like Mo said in the previous thread, this issue was deprioritized by our infra team and it has not been worked on. Appreciate the update and please let me know if you have further questions!

umakrishnaswamy avatar Jan 26 '24 04:01 umakrishnaswamy

Certainly, thank you. Perhaps your team could provide some hints and allow the community to work together to uncover and solve it? @umakrishnaswamy

airlsyn avatar Jan 26 '24 04:01 airlsyn

hey @ericxsun - the inability to log unicode is a backend issue so unfortunately there are not really any workarounds other than not logging said unicode. please let me know if you're still encountering the other issue in regards to the hanging
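
As a minimal sketch of the "don't log the unicode" workaround, the metric keys can be rewritten to ASCII on the client before they are logged. The helper names ascii_safe_key and ascii_safe_metrics are hypothetical (they are not part of wandb), and this assumes that non-ASCII characters in keys are the only thing triggering the 500 error; it has not been verified against a local server.

```python
import re


def ascii_safe_key(key: str) -> str:
    """Replace each non-ASCII character with 'u' plus its hex code point."""
    return re.sub(r"[^\x00-\x7f]", lambda m: f"u{ord(m.group()):04x}", key)


def ascii_safe_metrics(metrics: dict) -> dict:
    """Return a copy of metrics whose keys contain only ASCII characters."""
    safe = {}
    for key, value in metrics.items():
        new_key = ascii_safe_key(key)
        if new_key in safe:
            # Two different keys mapped to the same ASCII name; refuse to overwrite.
            raise ValueError(f"key collision after sanitizing: {new_key!r}")
        safe[new_key] = value
    return safe


metrics = {
    "ppl-acc-0-shot/boolq": 54.864385321100916,
    "ppl/中国": 85.00,
}
safe_metrics = ascii_safe_metrics(metrics)
print(safe_metrics)

# Assuming `run` was returned by wandb.init(...):
# run.log(safe_metrics)
```

The obvious cost is that the metric names shown in the UI are no longer readable, so a transliteration or a manual English name per metric may be preferable in practice.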

umakrishnaswamy avatar Feb 07 '24 04:02 umakrishnaswamy

@ericxsun I tried the following example:

    run = wandb.init()
    run.log(
        {
            "ppl-acc-0-shot/boolq": 54.864385321100916,
            "ppl/中国": 85.00,
        }
    )
    run.finish()

and it works for me now. Could you please verify whether it is still an issue for you?

kptkin avatar Feb 13 '24 00:02 kptkin

@kptkin thanks. However, I still encounter the identical error with client version 0.16.4.dev1 and server version W&B Local 0.49.1.

airlsyn avatar Feb 13 '24 10:02 airlsyn

@ericxsun oh I see, you are using a local server. It is an issue with the server's support for the 中国 chars, like you correctly pointed out. @umakrishnaswamy could you check with the server team about their plans to support these encodings in local servers?

kptkin avatar Feb 14 '24 04:02 kptkin

hey @ericxsun - the server team has no concrete plans as of yet to support these encodings but I'd be happy to make a feature request for this on your behalf & will let you know if any decisions are made to support these

umakrishnaswamy avatar Feb 29 '24 00:02 umakrishnaswamy

Wow, Thank you so much @umakrishnaswamy

airlsyn avatar Feb 29 '24 01:02 airlsyn

Hi, @umakrishnaswamy is there any good news? Thanks a lot.

airlsyn avatar Jul 17 '24 05:07 airlsyn