TDengine
TDengine copied to clipboard
Why use pd.read_sql(sql, create_engine(...)) were blocked when in multi process
Bug Description
Ubuntu20.04,python连接器,使用pd.read_sql(sql, create_engine('taos://user:pass@host:6030/db_name'))
直接读取数据,在多线程的情况下,程序一直卡在此处,Windows下是好的。请问是多线程共享连接的问题吗?如何处理?
多线程开启使用的是:from multiprocessing import Process
只有在主线程调用一次pd.read_sql(sql, create_engine(...))
然后在子线程里再次调用的时候会卡住,如果主线程没有调用过,则正常。
Environment (please complete the following information):
- OS: Ubuntu 20.04
- TDengine Version: 2.6.0.8
参见:https://docs.taosdata.com/reference/connector/python/#%E4%BD%BF%E7%94%A8%E8%BF%9E%E6%8E%A5%E5%99%A8%E5%BB%BA%E7%AB%8B%E8%BF%9E%E6%8E%A5
使用taos.connect(),提示UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy。似乎也没有正常运行
同样是主线程可以正常调用,在子线程里会一直卡在connect的地方
from time import sleep, perf_counter
from threading import Thread
import pandas
import taos
def task():
print('Starting a task...')
conn = taos.connect()
df: pandas.DataFrame = pandas.read_sql("SELECT * FROM test.meters limit 100000", conn)
sleep(1)
print(df)
print('done')
start_time = perf_counter()
# create two new threads
t1 = Thread(target=task)
t2 = Thread(target=task)
# start the threads
t1.start()
t2.start()
# wait for the threads to complete
t1.join()
t2.join()
end_time = perf_counter()
print(f'It took {end_time- start_time: 0.2f} second(s) to complete.')
使用taos.connect(),提示UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy。似乎也没有正常运行
这个只是警告,不影响功能。当然也可以直接用 SQLalchemy create_engine 方法就不会跳出这个警告
明白,我测试使用create_engine 和 taso.connect在windows下都是好的,但是在ubuntu 20上无法正常工作
测试使用您的这段是可以的,区别在于使用的是threading和multiprocessing,我认真看一下这二者有什么区别