provenance
ChainedRepo with ArchivedFile
ArchivedFile has a method, abspath(), that returns the path to the blob backing the file so that it can, for example, be read. Its definition is:
def abspath(self):
    repo = repos.get_default_repo()
    path = repo.blobstore._filename(self.blob_id)
    return os.path.abspath(path)
My default repo is a ChainedRepo, so when repo.blobstore is accessed while computing path, an AttributeError is raised because a ChainedRepo doesn't have a blobstore. Instead it has stores, which is a list of the chained repos, each with its own blobstore. Here's my debug session to show some of that:
ipdb> repo
<provenance.repos.ChainedRepo object at 0x111759898>
ipdb> repo.stores
[<provenance.repos.PostgresRepo object at 0x118c53ac8>, <provenance.repos.PostgresRepo object at 0x1119502e8>]
ipdb> repo.stores[1]
<provenance.repos.PostgresRepo object at 0x1119502e8>
ipdb> repo.stores[1].blobstore
<provenance.sftp.SFTPStore object at 0x1189230b8>
ipdb> repo.stores[0].blobstore
<provenance.blobstores.DiskStore object at 0x118923080>
ipdb> repo.stores[0].blobstore._filename(self.blob_id)
'/Users/.../blobstore/e86d496122b230f2d4ebaa3e9bdb9371cf9486c4'
ipdb> repo.stores[1].blobstore._filename(self.blob_id)
'/Users/.../blobstore/e86d496122b230f2d4ebaa3e9bdb9371cf9486c4'
Thoughts? Is it a bug or an all too common user error?
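For anyone who wants to see the failure without a provenance setup, here is a minimal sketch. FakeDiskStore, FakeRepo and FakeChainedRepo are hypothetical stand-ins, not the library's classes; they model only the attributes visible in the debug session above (a repo has a blobstore, a chained repo has only stores).

```python
import os


class FakeDiskStore:
    """Hypothetical stand-in for a disk blobstore; only _filename is modeled."""

    def __init__(self, root):
        self.root = root

    def _filename(self, blob_id):
        return os.path.join(self.root, blob_id)


class FakeRepo:
    """Hypothetical stand-in for a single repo that owns one blobstore."""

    def __init__(self, blobstore):
        self.blobstore = blobstore


class FakeChainedRepo:
    """Hypothetical stand-in for a chained repo: .stores but no .blobstore."""

    def __init__(self, stores):
        self.stores = stores


def abspath(repo, blob_id):
    # Same shape as the abspath() quoted above, with the repo passed in.
    path = repo.blobstore._filename(blob_id)
    return os.path.abspath(path)


single = FakeRepo(FakeDiskStore("blobstore"))
chained = FakeChainedRepo([single, FakeRepo(FakeDiskStore("blobstore"))])

print(abspath(single, "abc123"))  # fine for a plain repo

try:
    abspath(chained, "abc123")
except AttributeError as exc:
    print("AttributeError:", exc)
```

The plain repo resolves a path, while the chained stand-in fails on the repo.blobstore attribute access before _filename is ever reached.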
Looks like a problem with the implementation/design. I'll have to look more closely at how abspath is being used, but I think the end solution will probably be adding a _filename method onto a repo. Basically, it will have to see if it has a disk blobstore and then delegate to its _filename. Or maybe an S3 blobstore would work as well; these are the things that need to be considered. But the chained repo would then have to iterate and find the disk store. Kind of messy, but I think it would be best to put that logic in the repos rather than have the artifact file try to figure everything out.
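A rough sketch of the delegation idea described above, under stated assumptions: LocalStore, RemoteStore, SimpleRepo and ChainRepo are hypothetical names invented for illustration, and treating "has a _filename method" as the test for a disk-like store is a guess, not how provenance necessarily decides.

```python
import os


class LocalStore:
    """Hypothetical disk-backed store that can map an id to a local file."""

    def __init__(self, root):
        self.root = root

    def _filename(self, blob_id):
        return os.path.join(self.root, blob_id)


class RemoteStore:
    """Hypothetical remote store with no local file for a blob."""


class SimpleRepo:
    def __init__(self, blobstore):
        self.blobstore = blobstore

    def _filename(self, blob_id):
        # Delegate only if the blobstore knows how to produce a filename.
        get_filename = getattr(self.blobstore, "_filename", None)
        return get_filename(blob_id) if get_filename else None


class ChainRepo:
    def __init__(self, stores):
        self.stores = stores

    def _filename(self, blob_id):
        # Ask each chained repo in order; the first usable answer wins.
        for repo in self.stores:
            path = repo._filename(blob_id)
            if path is not None:
                return path
        raise LookupError("no chained repo has a local file for %r" % (blob_id,))


chain = ChainRepo([SimpleRepo(RemoteStore()), SimpleRepo(LocalStore("blobstore"))])
print(chain._filename("abc123"))
```

Raising when no store can answer, rather than returning None, is a design choice in this sketch: it keeps a missing path from reaching os.path.abspath and failing there with a less obvious error.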
I pushed a fix for this, I think. I didn't write an automated test or even test it manually, but it should work. :) Let me know if it solves your problem.
I haven't been able to figure out why yet, but repo._filename(self.blob_id) is returning None.
=== EIN IPython Debugger ===
ipdb> > /Users/.../python3.5/posixpath.py(64)isabs()
62 """Test whether a path is absolute"""
63 sep = _get_sep(s)
---> 64 return s.startswith(sep)
65
66
ipdb> up
> /Users/.../python3.5/posixpath.py(358)abspath()
356 def abspath(path):
357 """Return an absolute path."""
--> 358 if not isabs(path):
359 if isinstance(path, bytes):
360 cwd = os.getcwdb()
ipdb> up
> /Users/.../python3.5/site-packages/provenance/core.py(506)abspath()
504 repo = repos.get_default_repo()
505 path = repo._filename(self.blob_id)
--> 506 return os.path.abspath(path)
507
508 def __fspath__(self):
ipdb> print(repo)
<provenance.repos.ChainedRepo object at 0x10d8143c8>
ipdb> print(self.blob_id)
/Users/.../blobstore/e86d496122b230f2d4ebaa3e9bdb9371cf9486c4
ipdb> print(path)
None
Did you step into the repo._filename(self.blob_id) call to see why?
Alright, after much pain I figured out how to use the debugger to do it.
> /Users/.../python3.5/site-packages/provenance/core.py(505)abspath()
-> path = repo._filename(self.blob_id)
(Pdb) step
--Call--
> /Users/.../python3.5/site-packages/provenance/repos.py(875)_filename()
-> def _filename(self, id):
(Pdb) next
> /Users/.../python3.5/site-packages/provenance/repos.py(876)_filename()
-> return cs.chained_filename(self, id)
(Pdb) step
--Call--
> /Users/.../python3.5/site-packages/provenance/_commonstore.py(144)chained_filename()
-> def chained_filename(chained, id):
(Pdb) id
'/Users/.../blobstore/e86d496122b230f2d4ebaa3e9bdb9371cf9486c4'
(Pdb) chained
<provenance.repos.ChainedRepo object at 0x10b7576d8>
(Pdb) next
> /Users/.../python3.5/site-packages/provenance/_commonstore.py(145)chained_filename()
-> if id in chained.stores:
(Pdb) chained.stores
[<provenance.repos.PostgresRepo object at 0x10b2d7ba8>, <provenance.repos.PostgresRepo object at 0x10b74e2e8>]
So in chained_filename we have if id in chained.stores. id is the path to the blob on my disk, while chained.stores is a list of repos. I don't think id will ever be in chained.stores.
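The membership test can be checked in isolation. In the sketch below, StubRepo and chained_filename_sketch are hypothetical; the body of the if branch is not visible in the session, so a placeholder return is used. The point is only that a string is never an element of a list of unrelated objects, so the function falls through and returns None implicitly.

```python
class StubRepo:
    """Hypothetical stand-in for a repo object with default equality."""


def chained_filename_sketch(stores, blob_id):
    # Same condition as seen in the debugger: true only if blob_id itself
    # is one of the list elements.
    if blob_id in stores:
        return blob_id  # placeholder; the real branch body is not shown
    # No explicit return here, so the caller gets None.


stores = [StubRepo(), StubRepo()]
blob_id = "e86d496122b230f2d4ebaa3e9bdb9371cf9486c4"

result = chained_filename_sketch(stores, blob_id)
print(result)  # None
```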
Ah, okay, you are right. The .stores is the problem. I pushed a fix to master.
BTW, you can use 's' and 'n' as shortcuts for 'step' and 'next'. #ProTip
Same problem.
(Pdb) break /Users/.../python3.5/site-packages/provenance/core.py:505
Breakpoint 1 at /Users/.../python3.5/site-packages/provenance/core.py:505
(Pdb) c
> /Users/.../python3.5/site-packages/provenance/core.py(505)abspath()
-> path = repo._filename(self.blob_id)
(Pdb) s
--Call--
> /Users/.../python3.5/site-packages/provenance/repos.py(875)_filename()
-> def _filename(self, id):
(Pdb) n
> /Users/.../python3.5/site-packages/provenance/repos.py(876)_filename()
-> return cs.chained_filename(self, id)
(Pdb) s
--Call--
> /Users/.../python3.5/site-packages/provenance/_commonstore.py(144)chained_filename()
-> def chained_filename(chained, id):
(Pdb) n
> /Users/.../python3.5/site-packages/provenance/_commonstore.py(145)chained_filename()
-> if id in chained:
(Pdb) id
'/Users/.../blobstore/e86d496122b230f2d4ebaa3e9bdb9371cf9486c4'
(Pdb) chained
<provenance.repos.ChainedRepo object at 0x111851f28>
Similar to last time, it checks if id in chained. id is a string (path to file), and chained is a repo object. If I'm not mistaken it's saying, "is this string in this object? No." So it still returns None.
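What id in chained evaluates to depends on how the object implements __contains__, which the session doesn't show. As an illustration only, assuming the check looks up stored artifact ids, a bare hash would match and a full path would not. IdIndexedRepo and the /tmp path are hypothetical.

```python
class IdIndexedRepo:
    """Hypothetical repo whose `in` check looks up known artifact ids."""

    def __init__(self, known_ids):
        self._known_ids = set(known_ids)

    def __contains__(self, artifact_id):
        # Python calls this method for `artifact_id in repo`.
        return artifact_id in self._known_ids


digest = "e86d496122b230f2d4ebaa3e9bdb9371cf9486c4"
repo = IdIndexedRepo([digest])

print(digest in repo)                      # True: the bare hash is known
print("/tmp/blobstore/" + digest in repo)  # False: a path is not an id
```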
I'm a bit confused about why id is what it is. This all starts when I call proxy.abspath(). But id seems to be the very thing that I want returned. Before I started using a chained repo I noticed that abspath() and blob_id were the same. Is that supposed to be the case?
Before I started using a chained repo I noticed that abspath() and blob_id were the same. Is that supposed to be the case?
No, id should be the artifact id, so a hash of the contents. If it is the path of the file then something else must be wrong.
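A quick way to tell the two apart while debugging is sketched below. Both helpers are hypothetical, and SHA-1 is only an assumption based on the 40-hex-character names in the sessions above; the thread does not say which hash provenance uses.

```python
import hashlib
import os


def content_id(data: bytes) -> str:
    # SHA-1 is an assumption here, chosen because it yields 40 hex characters.
    return hashlib.sha1(data).hexdigest()


def looks_like_content_hash(value: str) -> bool:
    """True for a bare 40-character hex digest, False for anything path-like."""
    if os.sep in value or "/" in value:
        return False
    return len(value) == 40 and all(c in "0123456789abcdef" for c in value)


digest = content_id(b"example file contents")
print(looks_like_content_hash(digest))                      # True
print(looks_like_content_hash("/tmp/blobstore/" + digest))  # False
```

Calling looks_like_content_hash(self.blob_id) at the breakpoint in abspath() would show directly whether blob_id holds a hash or a path.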