dmi-tcat icon indicating copy to clipboard operation
dmi-tcat copied to clipboard

Added an index for tweet_id in the _media tables for new bins

Open xmacex opened this issue 7 years ago • 2 comments

Fixes #299 for new bins, by considerably speeding up the

SELECT * FROM the_bin_name_media WHERE tweet_id = 916039373991361504

queries which are done from analysis/mod.export_tweets.php to look up fields from the media table.

Does not solve the issue with already existing bins, for which the same index must be created. I wrote following Python program (improvements welcome of course), but am unsure how to best contribute something similar to TCAT:

"""
Add tweet_id indices to a TCAT database so that export can feasibly be
done with additional fields.
"""

import logging
import sqlalchemy

CONNECTION = "mysql://username:password@hostname/database"

def add_index(table, engine):
    """Add the tweet_id index for table table."""
    if 'tweet_id' not in [idx.name for idx in table.indexes]:
        mindex = sqlalchemy.Index('tweet_id', table.c.tweet_id)
        logging.info("Indexing %s", table)
        return mindex.create(engine)

    logging.info("Already exists for %s", table)
    return None


def add_indices(suffix):
    """Add indices for all tables which have the suffix suffix."""
    logging.info("Adding indices")
    engine = sqlalchemy.create_engine(CONNECTION)
    results = []
    for table_name in engine.table_names():
        if table_name.endswith(suffix):
            meta = sqlalchemy.MetaData()
            table = sqlalchemy.Table(table_name, meta, autoload=True, autoload_with=engine)
            results.append(add_index(table, engine))
    return results


if __name__ == "__main__":
    logging.getLogger().setLevel(logging.INFO)
    print(add_indices("_media"))
    print(add_indices("_mentions"))
    print(add_indices("_hashtags"))

xmacex avatar Feb 15 '18 23:02 xmacex

LOL please someone review my code for the upgrade script. There's a couple of layers of conditional logic and I am not entirely confident about the contexts in which the upgrade script run. I feel that what I propose could be simplified, but I'm sticking close to code I cannibalized from 16110ec. Of course I tested it on my toy installation, and it seems to work, but the upgrade path is long and I didn't test all execution paths though I hope I reasoned myself through all of them... a refactoring opportunity maybe?

I also made an inappropriate change in f0df436 by poking around an earlier 2015 change, but cancelled the change in 21c6a8b after I figured out the logic of the upgrade script, and wrote what I consider a more appropriate change by imitating what the upgrade script seems to be doing, ie. layering changes upon one another. Sorry and thank you 😺

xmacex avatar Feb 18 '18 22:02 xmacex

Thanks @xmacex and sorry for the long time to review!

The code looks solid! Would it be possible to update the upgrade.php, so that your change can be rolled out?

niczem avatar Jul 03 '20 06:07 niczem