pynab
To do with pres but best not to say
Here's what I've managed on my own. I leave it up to those with better programming skills to do something useful with it and update the repo.
import datetime
import hashlib

import pytz

from pynab.db import db_session, Pre, Release


def get_hash(pre_time, title, req):
    """Print the pre name whose MD5 or SHA-1 digest matches a release's original_name."""
    title = bytes(title, 'utf-8')
    req = bytes(req, 'utf-8')

    # try the title as stored, lower-cased and upper-cased
    case_title = [title, title.lower(), title.upper()]

    with db_session() as db:
        for c in case_title:
            # candidates: title alone, title + request id, title + request id twice
            for a in [c, c + req, c + req + req]:
                # check the MD5 digest first, then the SHA-1 digest
                for digest in (hashlib.md5(a).hexdigest(), hashlib.sha1(a).hexdigest()):
                    match = db.query(Release).filter(Release.original_name == digest).first()
                    if match:
                        # decode so the name prints as text rather than b'...'
                        print("{} {:100s} {:^48s}".format(pre_time, a.decode('utf-8'), digest))
                        return  # stop at the first match


def get_releases(days):
    with db_session() as db:
        # filter set to just lookup a.b.moovee
        cutoff = datetime.datetime.now(pytz.utc) - datetime.timedelta(days=days)
        p_query = db.query(Pre).filter((Pre.pretime >= cutoff) & (Pre.requestgroup == "alt.binaries.moovee"))
        p_query = p_query.order_by(Pre.pretime.desc())
        for pre in p_query.all():
            get_hash(pre.pretime, pre.name, str(pre.requestid))


if __name__ == '__main__':
    get_releases(90)  # set to search 90 days of pre's
I note this is targeted at moovees, would this apply to teevee?
I believe it would only be needed for moovee group. I read that somewhat recently the moovee posting behavior changed.
Right, I found a reddit post explaining those changes, then only a bit later indicating it had reverted.
I ask only due to noticing a number of hashed titles in teevee but they surely don't match anything I can find/generate.
Try this group: alt.binaries.solar-xl. Quite a few of the hashed movie releases are ending up in it. Rar and the other processing find the name most of the time. When they don't, the above code will.
@gkoh @ukharley So did these changes actually happen? Does this need to be incorporated?
@ukharley Are you currently using this in postproc or at time of scan?
I gave this a shot and ended up with a CPU being soaked for 24 hours, seemingly without end (I killed it). I can see what this is trying to do, and had a look at how nzedb does it.
There they generate the comparison hashes during pre entry, so the pre table has a few extra columns. This is then compared and handled during postprocess.
This effectively trades DB space for postprocess CPU time.
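A minimal sketch of that trade-off in plain Python, independent of pynab's actual schema: every candidate digest is computed once per pre and stored in a lookup table, so matching a hashed release name becomes a single lookup instead of up to 18 queries per pre. The names `candidate_digests` and `build_hash_index` are hypothetical, the sample pre is made up, and the candidate set (three case variants, request ID appended zero, one or two times, MD5 and SHA-1) simply mirrors the script above. I have not checked how nzedb actually lays this out.

```python
import hashlib


def candidate_digests(name, request_id):
    """Yield every MD5/SHA-1 hex digest the proof-of-concept script would try."""
    title = name.encode('utf-8')
    req = str(request_id).encode('utf-8')
    for cased in (title, title.lower(), title.upper()):
        for candidate in (cased, cased + req, cased + req + req):
            yield hashlib.md5(candidate).hexdigest()
            yield hashlib.sha1(candidate).hexdigest()


def build_hash_index(pres):
    """Map digest -> pre name for an iterable of (name, request_id) pairs."""
    index = {}
    for name, request_id in pres:
        for digest in candidate_digests(name, request_id):
            # keep the first pre seen for a given digest
            index.setdefault(digest, name)
    return index


# Hypothetical sample data, not a real pre.
pres = [("Some.Movie.2015.720p.BluRay.x264-GRP", 12345)]
index = build_hash_index(pres)

# A release whose name is the MD5 of the lower-cased pre name.
hashed_release_name = hashlib.md5(b"some.movie.2015.720p.bluray.x264-grp").hexdigest()
print(index.get(hashed_release_name))
```

In a database-backed version the dictionary would presumably become extra indexed columns (or a side table) filled in at pre entry, which is where the extra DB space goes.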
PostgreSQL has an MD5 function built in by default, and we could get more hash functions if we require pgcrypto. This would speed up the hash computation, as I assume the Python ones are a bit less efficient.
@brookesy Proof of concept only. I run it manually if and when needed.