acd_cli icon indicating copy to clipboard operation
acd_cli copied to clipboard

Moving cache to SQLAlchemy ORM

Open drouarb opened this issue 8 years ago • 11 comments

All the cache-backend has been rewritten with SQLAlchemy.

Now using SQL URL and working with all compatible databases.

Compatible with old databases.

drouarb avatar Oct 05 '16 00:10 drouarb

Hmm, a few months ago I was considering using the SQLAlchemy core as an SQL query generator and got to a half-working implementation.

I had initially moved away from the SQLAlchemy ORM in 8997b8c3697d1f7b2b31f3c0bb4150004eb1e321 because of performance issues in connection with SQLite.

Could you do me the favour and time a full sync and tree call of a FUSE mount without/with the changes applied (using a database of your choice)?

yadayada avatar Oct 05 '16 04:10 yadayada

I've tested what you asked, and for a sync it's 2 time slower with SQLAlchemy ORM on SQLite database. For the tree it's 6 time slower. I've found how to speed up sync to have the same performances as native SQLite. I will push that soon.

drouarb avatar Oct 05 '16 16:10 drouarb

Hi, what config files should I edit to get acdcli working with mysql?

Saren-Arterius avatar Oct 15 '16 04:10 Saren-Arterius

You need to set the url variable string in the database section of cache.ini, e.g.

[database]
url = mysql://user:pass@localhost/

yadayada avatar Oct 15 '16 04:10 yadayada

I found that $ pip install mysqlclient is required. Also, a database has to be selected, like mysql://user:pass@localhost/acd_cli_nodes

Saren-Arterius avatar Oct 15 '16 05:10 Saren-Arterius

I am unable to read any files nor folders inside mount point: ls: cannot access '/home/saren/ACD': Bad address. $ acd_cli ls works however. I got this exception running acdcli mount in foreground after I changed acd_fuse.py:362 to except Exception as e:, was except OSError as e:.

Traceback (most recent call last):
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1112, in _execute_context
    conn = self.__connection
AttributeError: 'Connection' object has no attribute '_Connection__connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1114, in _execute_context
    conn = self._revalidate_connection()
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 424, in _revalidate_connection
    "Can't reconnect until invalid "
sqlalchemy.exc.InvalidRequestError: Can't reconnect until invalid transaction is rolled back

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.5/site-packages/fuse.py", line 414, in _wrapper
    return func(*args, **kwargs) or 0
  File "/usr/lib/python3.5/site-packages/fuse.py", line 422, in getattr
    return self.fgetattr(path, buf, None)
  File "/usr/lib/python3.5/site-packages/fuse.py", line 668, in fgetattr
    attrs = self.operations('getattr', path.decode(self.encoding), fh)
  File "/usr/lib/python3.5/site-packages/acdcli/acd_fuse.py", line 361, in __call__
    ret = getattr(self, op)(path, *args)
  File "/usr/lib/python3.5/site-packages/acdcli/acd_fuse.py", line 442, in getattr
    node = self.cache.resolve(path)
  File "/usr/lib/python3.5/site-packages/acdcli/cache/query.py", line 43, in resolve
    .filter(Nodes.name == segment).first()
  File "/usr/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2695, in first
    ret = list(self[0:1])
  File "/usr/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2487, in __getitem__
    return list(res)
  File "/usr/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2795, in __iter__
    return self._execute_and_instances(context)
  File "/usr/lib/python3.5/site-packages/sqlalchemy/orm/query.py", line 2818, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 945, in execute
    return meth(self, multiparams, params)
  File "/usr/lib/python3.5/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1121, in _execute_context
    None, None)
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
    exc_info
  File "/usr/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 185, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1114, in _execute_context
    conn = self._revalidate_connection()
  File "/usr/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 424, in _revalidate_connection
    "Can't reconnect until invalid "
sqlalchemy.exc.StatementError: (sqlalchemy.exc.InvalidRequestError) Can't reconnect until invalid transaction is rolled back [SQL: 'SELECT nodes.id AS nodes_id, nodes.type AS nodes_type, nodes.name AS nodes_name, nodes.description AS nodes_description, nodes.created AS nodes_created, nodes.modified AS nodes_modified, nodes.updated AS nodes_updated, nodes.status AS nodes_status \nFROM nodes INNER JOIN parentage ON parentage.child = nodes.id \nWHERE parentage.parent = %s AND nodes.name = %s \n LIMIT %s'] [parameters: [{}]]

Saren-Arterius avatar Oct 15 '16 07:10 Saren-Arterius

I've also tried out this branch, and am unable to see the mounted files.

Installation steps (Ubuntu 16.04):

# install the missing dependencies
apt-get install mysql-client libmysqlclient-dev
sudo pip3 install mysqlclient

# Clone the repo and install ACD
git clone -b sqlalchemy2 https://github.com/drouarb/acd_cli.git /tmp/acd_cli
cd /tmp/acd_cli && sudo python3 setup.py install
acd_cli init

# Add the database config (database previously created)
tee $HOME/.config/acd_cli/cache.ini << EOF
[database]
url = mysql://acdcli:[email protected]/acdcli
EOF

# Initial sync
acd_cli sync

# Mount the drive
mkdir -p /opt/storage/acd
acd_cli -nl mount /opt/storage/acd

After mounting, I can see my folders in /opt/storage/acd, but they have no file attributes, and are coloured like invalid symlinks:

root@localhost:~# ls -al /opt/storage/acd/
ls: cannot access '/opt/storage/acd/Music': No such file or directory
total 4
drwxrwxrw- 1 root root    0 Oct 18 19:09 .
drwxr-xr-x 4 root root 4096 Oct 21 23:44 ..
?????????? ? ?    ?       ?            ? Music

Running the acd_cli ls --recursive and acd_cli tree, I can see the correct list of files. However, attempting to download a file using acd_cli download Music/foo.mp3 fails with 16-10-21 23:49:03.506 [CRITICAL] [acd_cli] - Could not resolve path "Music/foo.mp3".

taeram avatar Oct 21 '16 23:10 taeram

Any update on this PR and when it'll be ready for merging?

natoriousbigg avatar Nov 01 '16 03:11 natoriousbigg

I haven't so much time to work on acd_cli for the moment, I'm cureently working on how to speed up the tree on the fuse.

drouarb avatar Nov 01 '16 10:11 drouarb

Do we think this will be compatible with Amazon Web Service RDS (MySql, Amazon Aurora or postgresql)?

zenjabba avatar Nov 12 '16 16:11 zenjabba

Tried the PR in a freshly created CentOS VM, getting these error. I guess some work on the SQL structure is needed.

Getting changes.................................................................
Inserting nodes...........................
Traceback (most recent call last):
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1159, in _execute_context
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/dialects/mysql/mysqldb.py", line 113, in do_executemany
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 281, in executemany
    self._get_db().encoding)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 318, in _do_execute_many
    rows += self.execute(sql + postfix)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 250, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/connections.py", line 42, in defaulterrorhandler
    raise errorvalue
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 247, in execute
    res = self._query(query)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 411, in _query
    rowcount = self._do_query(q)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 374, in _do_query
    db.query(q)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/connections.py", line 270, in query
    _mysql.connection.query(self, query)
_mysql_exceptions.DataError: (1406, "Data too long for column 'name' at row 138")

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/bin/acd_cli", line 9, in <module>
    load_entry_point('acdcli==0.3.2', 'console_scripts', 'acd_cli')()
  File "/usr/lib/python3.4/site-packages/acdcli-0.3.2-py3.4.egg/EGG-INFO/scripts/acd_cli.py", line 1609, in main
    ret = args.func(args)
  File "/usr/lib/python3.4/site-packages/acdcli-0.3.2-py3.4.egg/EGG-INFO/scripts/acd_cli.py", line 729, in sync_action
    return sync_node_list(args.full, args.to_file, args.from_file)
  File "/usr/lib/python3.4/site-packages/acdcli-0.3.2-py3.4.egg/EGG-INFO/scripts/acd_cli.py", line 175, in sync_node_list
    cache.insert_nodes(changeset.nodes, partial=not full)
  File "/usr/lib/python3.4/site-packages/acdcli-0.3.2-py3.4.egg/acdcli/cache/sync.py", line 76, in insert_nodes
    self.insert_files(files)
  File "/usr/lib/python3.4/site-packages/acdcli-0.3.2-py3.4.egg/acdcli/cache/sync.py", line 154, in insert_files
    } for f in c if f['id'] not in update])
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/engine/base.py", line 2055, in execute
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/engine/base.py", line 945, in execute
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1189, in _execute_context
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/util/compat.py", line 202, in raise_from_cause
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/util/compat.py", line 185, in reraise
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1159, in _execute_context
  File "/usr/lib/python3.4/site-packages/SQLAlchemy-1.1.4-py3.4-linux-x86_64.egg/sqlalchemy/dialects/mysql/mysqldb.py", line 113, in do_executemany
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 281, in executemany
    self._get_db().encoding)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 318, in _do_execute_many
    rows += self.execute(sql + postfix)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 250, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/connections.py", line 42, in defaulterrorhandler
    raise errorvalue
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 247, in execute
    res = self._query(query)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 411, in _query
    rowcount = self._do_query(q)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/cursors.py", line 374, in _do_query
    db.query(q)
  File "/usr/lib64/python3.4/site-packages/MySQLdb/connections.py", line 270, in query
    _mysql.connection.query(self, query)
sqlalchemy.exc.DataError: (_mysql_exceptions.DataError) (1406, "Data too long for column 'name' at row 138") [SQL: 'INSERT INTO nodes (id, type, name, description, created, modified, updated, status) VALUES (%s, %s, %s, %s, %s, %s, %s, %s)'] [parameters: (('_79dox56T_G4qdty2QtJCA', 'file', 'TdhNfQ5klL8BfMzJUBuUWu,C', None, datetime.datetime(2016, 9, 23, 3, 10, 58, 234000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 10, 58, 479000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 31, 747155), 'AVAILABLE'), ('BeaZg4fYSf-8n5uwbUcuRA', 'file', 'krFKE315PIrtPH--N3Sc37Iy', None, datetime.datetime(2016, 9, 23, 3, 10, 58, 519000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 10, 58, 737000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 31, 754891), 'AVAILABLE'), ('2yh7KmlQS9iztKibQffI-w', 'file', 'OM9SS4BYghwh8eKAkyq4xz9a', None, datetime.datetime(2016, 9, 23, 3, 10, 58, 691000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 10, 58, 905000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 31, 758773), 'AVAILABLE'), ('BKGOsvcdTkKjDXWlfBfR9w', 'file', 'j6VvX62fMkUROPjYMYVz9ktP', None, datetime.datetime(2016, 9, 23, 3, 10, 58, 903000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 10, 59, 155000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 31, 768645), 'AVAILABLE'), ('aRjVGbADQHazJFvDJmJQ-A', 'file', 'XL9jXPzkMzUzmo6L0BuTP4AH', None, datetime.datetime(2016, 9, 23, 3, 10, 59, 109000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 10, 59, 418000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 31, 774647), 'AVAILABLE'), ('rIRRDdGvRWON6ZoXAA2c2w', 'file', 'o9UgitHKhDNqeRsRqgk,JuQq', None, datetime.datetime(2016, 9, 23, 3, 10, 59, 336000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 10, 59, 580000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 31, 782188), 'AVAILABLE'), ('f_qMjZB6Tj6VFkJDgjChaA', 'file', '2,2P5Vlhm0wqOpJYco3Y,5BJ', None, datetime.datetime(2016, 9, 23, 3, 10, 59, 593000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 11, 0, 312000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 31, 783755), 'AVAILABLE'), ('-_pS8Pf8QDuGepD_0zghzQ', 'file', '4SP6xz1ybxKvidY5ez3iNJ3K', None, datetime.datetime(2016, 9, 23, 3, 11, 0, 199000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 11, 0, 429000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 31, 793966), 'AVAILABLE')  ... displaying 10 of 900 total bound parameter sets ...  ('NuewfkMCSFGPon2TCZ8Z6A', 'file', '1mKOSshHbZEMUwJu5mDAYxxK', None, datetime.datetime(2016, 9, 23, 3, 18, 41, 570000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 18, 42, 135000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 33, 991023), 'AVAILABLE'), ('oHmC29YvRu6gTe01vGw6jw', 'file', '-mhuUEH5KECnUZcCCWpq9rIh', None, datetime.datetime(2016, 9, 23, 3, 18, 41, 59000, tzinfo=tzlocal()), datetime.datetime(2016, 9, 23, 3, 18, 42, 326000, tzinfo=tzlocal()), datetime.datetime(2016, 11, 16, 14, 50, 33, 993007), 'AVAILABLE'))]```

christianreiss avatar Nov 16 '16 14:11 christianreiss