pynab To do with pres but best not to say

Here's what I've managed on my own. I leave it up to those with better programming skills to do something useful with it and update the repo.

import datetime
import hashlib

import pytz

from pynab.db import db_session, Pre, Release

def get_hash(pre_time, title, req):
    # print the first release whose original_name is an MD5 or SHA-1 hash of the pre title
    title = title.encode('utf-8')
    req = req.encode('utf-8')
    with db_session() as db:
        # try the title as-is, lower-cased and upper-cased
        for c in (title, title.lower(), title.upper()):
            # try the bare title, then with the request id appended once and twice
            for a in (c, c + req, c + req + req):
                # MD5 is checked before SHA-1, as in the original loop
                for digest in (hashlib.md5(a).hexdigest(), hashlib.sha1(a).hexdigest()):
                    if db.query(Release).filter(Release.original_name == digest).first():
                        print("{} {:100s} {:^48s}".format(pre_time, a.decode('utf-8'), digest))
                        return  # stop at the first match

def get_releases(days):
    with db_session() as db:
        # only consider pres newer than the cutoff; filter set to just look up a.b.moovee
        cutoff = datetime.datetime.now(pytz.utc) - datetime.timedelta(days=days)
        p_query = db.query(Pre).filter((Pre.pretime >= cutoff) & (Pre.requestgroup == "alt.binaries.moovee"))
        p_query = p_query.order_by(Pre.pretime.desc())
        for pre in p_query.all():
            get_hash(pre.pretime, pre.name, str(pre.requestid))

if __name__ == '__main__':
    get_releases(90)  # search the last 90 days of pres

Aug 06 '17 00:08 ukharley

I note this is targeted at moovee; would it also apply to teevee?

Sep 28 '17 03:09 gkoh

I believe it would only be needed for the moovee group. I read that the moovee posting behavior changed somewhat recently.

Sep 28 '17 14:09 ctero

Right, I found a Reddit post explaining those changes, followed only a bit later by an indication that they had been reverted.

I ask only because I've noticed a number of hashed titles in teevee, but they don't match anything I can find or generate.

Sep 29 '17 05:09 gkoh

Try this group: alt.binaries.solar-xl. Quite a few of the hashed movie releases are ending up in it. Rar and the other processing find the name most of the time. When they don't, the above code will.

Sep 29 '17 09:09 ukharley

@gkoh @ukharley So did these changes actually happen? Does this need to be incorporated?

@ukharley Are you currently using this in postproc or at time of scan?

Sep 30 '17 18:09 brookesy2

I gave this a shot and ended up with a CPU being soaked for 24 hours, seemingly without end (I killed it). I can see what is trying to be done here and had a look at how nzedb does it.

There they generate the comparison hashes during pre entry, so the pre table has a few extra columns. This is then compared and handled during postprocess.

This effectively trades DB space for postprocess CPU time.
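As a rough illustration of that trade-off, here is a minimal sketch that computes every candidate hash once per pre and then matches a release name with a single lookup. It is not nzedb's or pynab's actual code: `candidate_hashes`, `build_index` and `match_release` are hypothetical names, the variants are simply the ones the script above tries, and a plain dict stands in for the extra pre columns.

```python
import hashlib


def candidate_hashes(title, req):
    """Return every MD5/SHA-1 hex digest the script above would try for one pre."""
    title_b = title.encode("utf-8")
    req_b = req.encode("utf-8")
    digests = set()
    # title as-is, lower-cased and upper-cased
    for cased in (title_b, title_b.lower(), title_b.upper()):
        # bare title, then with the request id appended once and twice
        for data in (cased, cased + req_b, cased + req_b + req_b):
            digests.add(hashlib.md5(data).hexdigest())
            digests.add(hashlib.sha1(data).hexdigest())
    return digests


def build_index(pres):
    """Map each candidate digest to the pre title it was derived from.

    `pres` is an iterable of (title, requestid) pairs. Assumption: in a real
    setup the digests would be stored alongside the pre rows, not in a dict.
    """
    index = {}
    for title, req in pres:
        for digest in candidate_hashes(title, str(req)):
            # keep the first title seen for a digest
            index.setdefault(digest, title)
    return index


def match_release(index, original_name):
    """Return the pre title for a hashed release name, or None if unknown."""
    return index.get(original_name.lower())
```

The hashing cost is paid once when a pre is stored, so matching a release no longer needs up to 18 queries per pre.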

Default PostgreSQL has an MD5 function built in, and we could get more hash functions if we require pgcrypto. This would speed up the hash compute time, as I assume the Python ones are a bit less efficient.

Oct 03 '17 05:10 gkoh

@brookesy Proof of concept only. I run it manually if and when needed.

Oct 03 '17 14:10 ukharley
