
To do with pres but best not to say

Open ukharley opened this issue 7 years ago • 7 comments

Here's what I've managed on my own. I leave it up to those with better programming skills to do something useful with it and update the repo.

import datetime
import hashlib

import pytz

from pynab.db import db_session, Pre, Release

def get_hash(pre_time, title, req):
    """Print the first release whose original_name is an MD5 or SHA-1 hash of the pre title."""
    title = bytes(title, 'utf-8')
    req = bytes(req, 'utf-8')
    with db_session() as db:
        # try the title as-is, lowercased and uppercased
        for c in (title, title.lower(), title.upper()):
            # hash the title alone, then with the request id appended once and twice
            for a in (c, c + req, c + req + req):
                # MD5 is checked before SHA-1, as in the original loop
                for digest in (hashlib.md5(a).hexdigest(), hashlib.sha1(a).hexdigest()):
                    if db.query(Release).filter(Release.original_name == digest).first():
                        # decode so the title prints as text rather than as b'...'
                        print("{} {:100s} {:^48s}".format(pre_time, a.decode('utf-8'), digest))
                        return

def get_releases(days):
    """Check every alt.binaries.moovee pre from the last `days` days against hashed release names."""
    cutoff = datetime.datetime.now(pytz.utc) - datetime.timedelta(days=days)
    with db_session() as db:
        # filter set to look up only a.b.moovee
        p_query = db.query(Pre).filter((Pre.pretime >= cutoff) & (Pre.requestgroup == "alt.binaries.moovee"))
        p_query = p_query.order_by(Pre.pretime.desc())
        for pre in p_query.all():
            get_hash(pre.pretime, pre.name, str(pre.requestid))

if __name__ == '__main__':
    get_releases(90)  # search the last 90 days of pres

ukharley avatar Aug 06 '17 00:08 ukharley

I note this is targeted at moovee. Would it also apply to teevee?

gkoh avatar Sep 28 '17 03:09 gkoh

I believe it would only be needed for the moovee group. I read that the moovee posting behavior changed somewhat recently.

ctero avatar Sep 28 '17 14:09 ctero

Right, I found a reddit post explaining those changes, then only a bit later indicating it had reverted.

I ask only due to noticing a number of hashed titles in teevee but they surely don't match anything I can find/generate.

gkoh avatar Sep 29 '17 05:09 gkoh

Try this group: alt.binaries.solar-xl. Quite a few of the hashed movie releases are ending up in it. Rar and the other processing find the name most of the time. When they don't, the above code will.

ukharley avatar Sep 29 '17 09:09 ukharley

@gkoh @ukharley So did these changes actually happen? Does this need to be incorporated?

@ukharley Are you currently using this in postproc or at time of scan?

brookesy2 avatar Sep 30 '17 18:09 brookesy2

I gave this a shot and ended up with a CPU being soaked for 24 hours, seemingly without end (I killed it). I can see what it is trying to do and had a look at how nzedb does it.

There they generate the comparison hashes during pre entry, so the pre table has a few extra columns. This is then compared and handled during postprocess.

This effectively trades DB space for postprocess CPU time.
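As a rough illustration of that trade-off, here is a minimal sketch that stores the digests when a pre is inserted and matches releases with a single join afterwards. It uses an in-memory SQLite database only to stay self-contained; the table layout, the column names `name_md5` and `name_sha1`, and the helper functions are assumptions made for this example, not the nzedb or pynab schema. It also hashes only the plain title, so the case and request-id variants from the script above would need extra columns or a side table.

```python
import hashlib
import sqlite3


def add_pre(conn, name):
    """Insert a pre and store its MD5 and SHA-1 digests alongside it."""
    raw = name.encode("utf-8")
    conn.execute(
        "INSERT INTO pres (name, name_md5, name_sha1) VALUES (?, ?, ?)",
        (name, hashlib.md5(raw).hexdigest(), hashlib.sha1(raw).hexdigest()),
    )


def match_hashed_releases(conn):
    """Return (release original_name, pre name) pairs whose digests match."""
    return conn.execute(
        "SELECT r.original_name, p.name FROM releases r "
        "JOIN pres p ON r.original_name IN (p.name_md5, p.name_sha1) "
        "ORDER BY p.name"
    ).fetchall()


def build_demo():
    """Create a hypothetical two-table schema with a little sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pres (name TEXT, name_md5 TEXT, name_sha1 TEXT)")
    conn.execute("CREATE TABLE releases (original_name TEXT)")
    add_pre(conn, "Some.Movie.2017.1080p.BluRay.x264-GRP")
    add_pre(conn, "Other.Movie.2016.720p.WEB-DL-GRP")
    # one release posted under the MD5 of a pre name, one unrelated release
    hashed = hashlib.md5(b"Some.Movie.2017.1080p.BluRay.x264-GRP").hexdigest()
    conn.execute("INSERT INTO releases VALUES (?)", (hashed,))
    conn.execute("INSERT INTO releases VALUES (?)", ("not-a-hash",))
    return conn


if __name__ == "__main__":
    for original_name, pre_name in match_hashed_releases(build_demo()):
        print(original_name, pre_name)
```

With the digests stored, postprocess does no hashing at all; the cost moves to the extra columns and to whatever indexes are put on them.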

Default PostgreSQL has a built-in MD5 function, and we could get more hash functions if we require pgcrypto. This would speed up the hash computation; I assume the Python ones are a bit less efficient.
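One way to test that assumption would be to time the Python hashing in isolation. The sketch below is only a rough measurement aid: `candidate_digests` and `time_hashing` are hypothetical helpers, and the title and request id are made-up sample values. `candidate_digests` reproduces the 18 digests the script above computes per pre (three case variants, three request-id suffixes, MD5 and SHA-1).

```python
import hashlib
import timeit


def candidate_digests(title, req):
    """Return the 18 hex digests the script tries for one pre (hypothetical helper)."""
    t, r = title.encode("utf-8"), req.encode("utf-8")
    digests = []
    # title as-is, lowercased and uppercased
    for c in (t, t.lower(), t.upper()):
        # title alone, then with the request id appended once and twice
        for a in (c, c + r, c + r + r):
            digests.append(hashlib.md5(a).hexdigest())
            digests.append(hashlib.sha1(a).hexdigest())
    return digests


def time_hashing(n=1000):
    """Seconds spent computing all candidate digests for n pres."""
    return timeit.timeit(
        lambda: candidate_digests("Some.Movie.2017.1080p.BluRay.x264-GRP", "12345"),
        number=n,
    )


if __name__ == "__main__":
    print("seconds for 1000 pres:", time_hashing())
```

Comparing this figure with the time the matching database lookups take for the same number of pres would show which side dominates.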

gkoh avatar Oct 03 '17 05:10 gkoh

@brookesy2 Proof of concept only. I run it manually if and when needed.

ukharley avatar Oct 03 '17 14:10 ukharley