
update_similar_discoveries exception when embeddings are not computed

Open marcorosa opened this issue 4 years ago • 1 comment

When update_similar_discoveries is called (e.g., from the UI, where the flag to update similar discoveries is active by default) but the similarity was never computed for the repo (i.e., the embeddings table in the db has no entries for it), an exception is raised:

INFO:werkzeug:127.0.0.1 - - [10/Aug/2021 14:53:52] "POST /update_similar_discoveries HTTP/1.1" 500 -
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 2464, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 2450, in wsgi_app
    response = self.handle_exception(e)
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1867, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/i355397/git/credential-digger/ui/server.py", line 434, in update_similar_discoveries
    response2 = c.update_similar_snippets(target_snippet,
  File "/Users/i355397/git/credential-digger/venv/lib/python3.9/site-packages/credentialdigger-4.0.2-py3.9.egg/credentialdigger/client.py", line 1346, in update_similar_snippets
    target_embedding = self.get_embedding(snippet=target_snippet)
  File "/Users/i355397/git/credential-digger/venv/lib/python3.9/site-packages/credentialdigger-4.0.2-py3.9.egg/credentialdigger/client_sqlite.py", line 507, in get_embedding
    return super().get_embedding(query=query,
  File "/Users/i355397/git/credential-digger/venv/lib/python3.9/site-packages/credentialdigger-4.0.2-py3.9.egg/credentialdigger/client.py", line 598, in get_embedding
    embedding_str = cursor.fetchone()[0]
TypeError: 'NoneType' object is not subscriptable

Ideas for a possible fix, i.e. the right behaviour in this scenario (which of the following options is better is TBD):

  • treat it as a normal update_discovery
  • compute the embeddings on the fly and call update_similar_discoveries
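
A minimal sketch of the first option, assuming a simplified, hypothetical `embeddings` table and a comma-separated embedding string (the real credentialdigger schema and storage format may differ). The traceback ends at `cursor.fetchone()[0]`, and `fetchone()` returns `None` when no row matches, so the lookup can check for that before indexing. The helper names `get_embedding_or_none` and `choose_update_mode` are illustrative only and are not part of the library.

```python
import sqlite3


def get_embedding_or_none(conn, snippet):
    """Return the stored embedding for `snippet`, or None if there is none."""
    cursor = conn.cursor()
    # Hypothetical table and column names, used only for this sketch.
    cursor.execute(
        "SELECT embedding FROM embeddings WHERE snippet = ?", (snippet,)
    )
    row = cursor.fetchone()  # None when the query matches no row
    if row is None:
        return None
    # Assumption: the embedding is stored as a comma-separated string.
    return [float(value) for value in row[0].split(",")]


def choose_update_mode(conn, snippet):
    """Fall back to a single-discovery update when no embedding exists."""
    if get_embedding_or_none(conn, snippet) is None:
        return "single"
    return "similar"


# Small in-memory database to exercise both paths.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (snippet TEXT, embedding TEXT)")
conn.execute(
    "INSERT INTO embeddings VALUES (?, ?)", ("password = 'x'", "0.1,0.2")
)

print(choose_update_mode(conn, "password = 'x'"))
print(choose_update_mode(conn, "snippet without embedding"))
```

The second option would replace the `"single"` branch with a call that computes the missing embedding first; that call is left out here because it depends on the similarity model.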

marcorosa avatar Aug 10 '21 12:08 marcorosa

Hi @marcorosa, thank you for opening this issue. I also suggest removing the "Update similar discoveries" checkbox for repositories whose embeddings have not been computed.
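
A rough sketch of the check the UI could run before rendering the checkbox, assuming a hypothetical `embeddings` table with a `repo_url` column (the actual schema may differ). The function name `has_embeddings` is made up for illustration.

```python
import sqlite3


def has_embeddings(conn, repo_url):
    """Return True if at least one embedding row exists for `repo_url`."""
    cursor = conn.cursor()
    # Hypothetical table and column names; LIMIT 1 stops at the first match.
    cursor.execute(
        "SELECT 1 FROM embeddings WHERE repo_url = ? LIMIT 1", (repo_url,)
    )
    return cursor.fetchone() is not None


# In-memory database with embeddings for one repository only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (repo_url TEXT, embedding TEXT)")
conn.execute("INSERT INTO embeddings VALUES (?, ?)", ("repo_a", "0.1,0.2"))

# The template would show the checkbox only when this is True.
show_checkbox = has_embeddings(conn, "repo_a")
```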

alaabenfatma avatar Aug 10 '21 13:08 alaabenfatma