pymysql-benchmarks
Benchmarks of various asynchronous Python MySQL client libraries
This is a simple benchmark for various asynchronous Python MySQL client libraries.
The client libraries tested here are:
- Twisted's adbapi, which uses the MySQL client library written in C
- the pure-Python txMySQL asynchronous client library
- my own adb.py interface, built on Tornado and adisp
The adbapi and txMySQL libraries assume they're running on the Twisted framework. The benchmarks allow you to run different reactor implementations:
- Twisted's default reactor
- the Tornado-based reactor for Twisted
The adb.py library is designed to run directly on top of Tornado and uses adisp.py to make asynchronous programming a lot easier.
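To give a feel for what a generator-based dispatcher does, here is a minimal toy sketch. It is not adisp's actual code: the names toy_async, toy_process, run_all, fake_query and pending_events are all hypothetical, and the list pending_events merely stands in for Tornado's IOLoop. The idea is that a function yields a pending call, and the dispatcher resumes the function with the result once the callback fires.

```python
import functools


def toy_async(func):
    """Make func(*args, callback=...) return a pending call instead of running."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        def start(callback):
            return func(*args, callback=callback, **kwargs)
        return start
    return wrapper


def run_all(calls, done):
    """Start every pending call and hand the ordered results to done()."""
    results = [None] * len(calls)
    remaining = [len(calls)]
    if not calls:
        done(results)
        return

    def make_callback(index):
        def callback(value):
            results[index] = value
            remaining[0] -= 1
            if remaining[0] == 0:
                done(results)
        return callback

    for index, call in enumerate(calls):
        call(make_callback(index))


def toy_process(generator_func):
    """Drive a generator: start each yielded call, send its result back in."""
    @functools.wraps(generator_func)
    def wrapper(*args, **kwargs):
        gen = generator_func(*args, **kwargs)

        def step(value=None):
            try:
                call = gen.send(value)
            except StopIteration:
                return
            if isinstance(call, list):
                run_all(call, step)   # a list means "run these in parallel"
            else:
                call(step)

        step()
    return wrapper


pending_events = []   # hypothetical stand-in for the IOLoop's callback queue
log = []


@toy_async
def fake_query(sql, callback):
    # Pretend the answer arrives later, on the next loop iteration.
    pending_events.append(lambda: callback("result of %s" % sql))


@toy_process
def handler():
    one = yield fake_query("select 1")
    log.append(one)
    many = yield [fake_query("select 2"), fake_query("select 3")]
    log.append(many)


handler()
while pending_events:          # run the fake event loop until it is idle
    pending_events.pop(0)()
```

The body of handler reads like sequential code, yet every query completes through a callback. Yielding a list is how this toy models running several operations in parallel.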
Here is how you write asynchronous database code with the adb and adisp modules:
    import time

    from adisp import process
    from adb import Database

    class Example(object):  # hypothetical wrapper class; the methods below need one

        def __init__(self):
            self.adb = Database(driver="psycopg2",
                                host='DATABASE_HOST',
                                database='DATABASE_DB',
                                user='DATABASE_USER',
                                password='DATABASE_PASSWD',
                                num_threads=3,
                                tx_connection_pool_size=2,
                                queue_timeout=0.001)

        @process
        def someFunctionInvokedFromIOLoop(self):
            # Drop the table if it exists
            yield self.adb.runOperation("drop table if exists benchmark")

            # Create a table to insert the data into
            yield self.adb.runOperation("""
                create table benchmark (
                    userid int not null primary key,
                    data VARCHAR(100)
                );
            """)

            rows_to_insert = 100000

            # Insert some rows in parallel
            start_time = time.time()
            stmts = []
            for i in xrange(rows_to_insert):
                stmts.append(
                    ("insert into benchmark (userid, data) values (%s, %s)",
                     (i, i)))
            numrows = yield map(self.adb.runOperation, stmts)
            end_time = time.time()

            # Count the rows to confirm the inserts went through
            rows = yield self.adb.runQuery("select count(*) from benchmark")
            print 'inserted %s records, time taken = %s seconds' % \
                (rows, end_time - start_time)
You can also use transactions:
    @process
    def transactions(self):
        txId = yield self.adb.beginTransaction()
        yield self.adb.runOperation(
            "insert into mytable (userid, data) values (%s, %s)",
            (1, "test"),
            txId)
        yield self.adb.commitTransaction(txId)
To roll back a transaction, use rollbackTransaction(txId) instead of commitTransaction(txId).
The command line options for the benchmark are the following:
    --db            The database to use
    --dbhost        Database host
    --dbpasswd      Database user's password
    --dbuser        Database user to use
    --pool_size     Database connection pool size
    --use_adb       Use adb.py database module
    --use_adbapi    Use twisted's adbapi module
    --use_tornado   Use tornado twisted reactor instead of twisted's reactor
    --use_txmysql   Use txMySQL database module. Only works with Twisted and MySQL
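As a rough illustration of how these options fit together, the sketch below declares them with the standard library's argparse. This is an assumption made for illustration only: the benchmark's real option handling is not shown here and may differ. The flag names and help texts come from the list above; the build_parser helper, the store_true behaviour of the --use_* flags and the pool_size default are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented benchmark options."""
    parser = argparse.ArgumentParser(
        description="Benchmark asynchronous Python database client libraries")
    parser.add_argument("--db", help="The database to use")
    parser.add_argument("--dbhost", help="Database host")
    parser.add_argument("--dbpasswd", help="Database user's password")
    parser.add_argument("--dbuser", help="Database user to use")
    # Assumed default, taken from the pool size used for the reported runs.
    parser.add_argument("--pool_size", type=int, default=10,
                        help="Database connection pool size")
    # Assumption: the module and reactor switches are plain on/off flags.
    parser.add_argument("--use_adb", action="store_true",
                        help="Use adb.py database module")
    parser.add_argument("--use_adbapi", action="store_true",
                        help="Use twisted's adbapi module")
    parser.add_argument("--use_tornado", action="store_true",
                        help="Use tornado twisted reactor instead of twisted's reactor")
    parser.add_argument("--use_txmysql", action="store_true",
                        help="Use txMySQL database module. Only works with Twisted and MySQL")
    return parser


# Example: an adb run on Tornado against a local database named "test".
args = build_parser().parse_args(
    ["--db=test", "--dbhost=localhost", "--dbuser=bench",
     "--pool_size=10", "--use_adb", "--use_tornado"])
```

Each run selects exactly one database module (--use_adb, --use_adbapi or --use_txmysql). The --use_tornado switch only matters for the two Twisted-based modules, since adb.py always runs on Tornado.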
Below are some benchmark results. The benchmark program ran on my Hackintosh (Core i7-2600K, 3.4 GHz, 4 cores / 8 threads) running Mac OS X 10.6.7. The database servers ran on the same machine, and the benchmark connected to them over TCP sockets rather than Unix domain sockets.
The MySQL server version was 5.1, and the Postgres version was 9.0.4.
The pool size was set with --pool_size=10.
Here are the times in seconds reported by the program:
MySQL:

               Tornado   Twisted
    adb        11.79     N/A
    adbapi     18.99     19.74
    txMySQL    45.19     43.80     (numbers from a previous version of this benchmark)

Postgres:

               Tornado   Twisted
    adb        9.03      N/A
    adbapi     16.49     17.00
Conclusion
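The new adb.py module running on top of Tornado takes roughly 40% less time than Twisted's adbapi in this benchmark (11.79 s versus 18.99 s on MySQL, 9.03 s versus 16.49 s on Postgres, both on the Tornado reactor). As a bonus, adb used about 140 MB of memory versus the 300 MB used by Twisted's adbapi.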
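The relative difference can be checked directly from the reported timings. The short calculation below uses only the Tornado-reactor numbers from the results above; the helper names are hypothetical. It shows both "time saved" and "speedup", since the two measures give quite different percentages.

```python
# Times in seconds, copied from the results above (Tornado reactor).
timings = {
    "MySQL": {"adb": 11.79, "adbapi": 18.99},
    "Postgres": {"adb": 9.03, "adbapi": 16.49},
}


def time_saved_percent(fast, slow):
    """Percentage of the slower run's time that the faster run avoids."""
    return (slow - fast) / slow * 100.0


def speedup(fast, slow):
    """How many times faster the faster run is."""
    return slow / fast


summary = {}
for database, t in sorted(timings.items()):
    saved = time_saved_percent(t["adb"], t["adbapi"])
    factor = speedup(t["adb"], t["adbapi"])
    summary[database] = (saved, factor)
    print("%s: adb takes %.1f%% less time than adbapi (%.2fx speedup)"
          % (database, saved, factor))
```

Measured as time saved, adb comes out at about 38% on MySQL and 45% on Postgres. Measured as a speedup factor, that is about 1.6x and 1.8x respectively.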
When used with adisp, programming asynchronous database operations with adb and Tornado is a real breeze, since you no longer need to worry about creating callbacks to handle the results from the database.
References
The txMySQL implementation tested against is available in the main trunk at:
https://github.com/hybridlogic/txMySQL
The Tornado-based reactor implementation used is available in the main trunk at:
https://github.com/facebook/tornado