TDengine icon indicating copy to clipboard operation
TDengine copied to clipboard

Why use pd.read_sql(sql, create_engine(...)) were blocked when in multi process

Open wsf1990 opened this issue 2 years ago • 7 comments

Bug Description Ubuntu20.04,python连接器,使用pd.read_sql(sql, create_engine('taos://user:pass@host:6030/db_name')) 直接读取数据,在多线程的情况下,程序一直卡在此处,Windows下是好的。请问是多线程共享连接的问题吗?如何处理? 多线程开启使用的是:from multiprocessing import Process 只有在主线程调用一次pd.read_sql(sql, create_engine(...))然后在子线程里再次调用的时候会卡住,如果主线程没有调用过,则正常。

Environment (please complete the following information):

  • OS: Ubuntu 20.04
  • TDengine Version: 2.6.0.8 image

wsf1990 avatar Aug 01 '22 23:08 wsf1990

参见:https://docs.taosdata.com/reference/connector/python/#%E4%BD%BF%E7%94%A8%E8%BF%9E%E6%8E%A5%E5%99%A8%E5%BB%BA%E7%AB%8B%E8%BF%9E%E6%8E%A5

sangshuduo avatar Aug 02 '22 02:08 sangshuduo

使用taos.connect(),提示UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy。似乎也没有正常运行

wsf1990 avatar Aug 02 '22 02:08 wsf1990

同样是主线程可以正常调用,在子线程里会一直卡在connect的地方

wsf1990 avatar Aug 02 '22 02:08 wsf1990

from time import sleep, perf_counter
from threading import Thread
import pandas
import taos


def task():
    print('Starting a task...')
    conn = taos.connect()
    df: pandas.DataFrame = pandas.read_sql("SELECT * FROM test.meters limit 100000", conn)
    sleep(1)
    print(df)

    print('done')


start_time = perf_counter()

# create two new threads
t1 = Thread(target=task)
t2 = Thread(target=task)

# start the threads
t1.start()
t2.start()

# wait for the threads to complete
t1.join()
t2.join()

end_time = perf_counter()

print(f'It took {end_time- start_time: 0.2f} second(s) to complete.')

sangshuduo avatar Aug 02 '22 03:08 sangshuduo

使用taos.connect(),提示UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy。似乎也没有正常运行

这个只是警告,不影响功能。当然也可以直接用 SQLalchemy create_engine 方法就不会跳出这个警告

sangshuduo avatar Aug 02 '22 03:08 sangshuduo

明白,我测试使用create_engine 和 taso.connect在windows下都是好的,但是在ubuntu 20上无法正常工作

wsf1990 avatar Aug 02 '22 03:08 wsf1990

测试使用您的这段是可以的,区别在于使用的是threading和multiprocessing,我认真看一下这二者有什么区别

wsf1990 avatar Aug 02 '22 03:08 wsf1990