#9007 - improve and document IBlobsBackend
This MR includes a refactor of the blobs filesystem backend in a way that:
- all interactions with
twisted.web.server.Requests
are made in the blobs resource itself, and are not passed to the backend. - all methods now return deferreds.
- documentation is added on how to implement a new backend.
Other info:
- Closes: #9007 (closed)
Merge request reports
Activity
changed milestone to %Cycle 9
added 22 commits
-
d7894edd - 1 commit from branch
leap:master
- a3e84839 - [doc] add api doc for blobs backend
- 99c39feb - [refactor] remove unneeded class
- c82b1381 - [feature] add an exception for blobs not found
- a10096ad - [refactor] make blobs backend get_tag() agnostic of twisted.web requests
- a21e0e68 - [refactor] make blobs backend count() agnostic of twisted.web requests
- c4ad6e03 - [refactor] make blobs backend get_flags() agnostic of twisted.web requests
- e05de467 - [refactor] make blobs backend list_blobs() agnostic of twisted.web requests
- 78b67e06 - [refactor] make blobs backend read_blob() agnostic of twisted.web
- ce75ee1c - [refactor] make blobs backend set_flags() agnostic of twisted.web requests
- 9b6018b2 - [refactor] make blobs backend write_blob() agnostic of twisted.web requests
- 21966935 - [refactor] make blobs backend delete_blob() agnostic of twisted.web requests
- a8019297 - [refactor] make delete_blob() return a deferred
- 786c7f59 - [refactor] make get_blob_size() return a deferred
- d6aa6c32 - [refactor] make count() return a deferred
- 53c9d59a - [refactor] make list_blobs() return a deferred
- f458e3bb - [refactor] make get_total_storage() return a deferred
- 0cd0d0be - [refactor] make get_tag() return a deferred
- b3ba1fe8 - [refactor] make get_flags() return a deferred
- 2d78f953 - [refactor] make set_flags() return a deferred
- 6acc3a68 - [refactor] make read_blob() return a deferred
- d766e061 - [doc] improve documentation on IBlobsBackend
Toggle commit list-
d7894edd - 1 commit from branch
changed title from #9007 (closed) - Abstract blobs filesystem backend from twisted request mechanics to #9007 (closed) - improve and document IBlobsBackend
added 1 commit
- a1c99b53 - [test] fix incoming test with the new async blobs backend
added 1 commit
- 33addaba - [test] fix incoming test with the new async blobs backend
73 93 def __touch(self, path): 74 94 open(path, 'a') 75 95 76 def read_blob(self, user, blob_id, request, namespace=''): 96 def read_blob(self, user, blob_id, namespace=''): 77 97 logger.info('reading blob: %s - %s@%s' % (user, blob_id, namespace)) 78 path = self._get_path(user, blob_id, namespace) 98 try: 99 path = self._get_path(user, blob_id, namespace) 100 except Exception as e: 101 return defer.fail(e) 79 102 logger.debug('blob path: %s' % path) 80 _file = static.File(path, defaultType='application/octet-stream') 81 return _file.render_GET(request) 103 fd = open(path) changed this line in version 11 of the diff
251 292 pass 252 293 253 294 295 class BlobFile(resource.Resource): 296 297 def __init__(self, fd): 298 self.fd = fd 299 self.fd.seek(0, 2) 300 self.size = self.fd.tell() 301 self.fd.seek(0) 302 303 def render_GET(self, request): 304 request.setHeader(b'content-length', intToBytes(self.size)) 305 request.setHeader(b'content-type', 'application/octet-stream') 306 request.setResponseCode(http.OK) 307 producer = static.NoRangeStaticProducer(request, self.fd) Why
NoRangeStaticProducer
instead ofFile
? It doesn't support range requests. TheFile
does. This will be necessary for resuming downloads, see #8989 (closed)So
File
is a resource that usesNoRangeStaticProducer
,SingleRangeStaticProducer, or
MultipleRangeStaticProducer, which are producers, depending on the content of the
Range` header.I just stuck that one there because I didn't want to copy the whole
File.makeProducer
method, but we can/must do it to properly serve file ranges.Do you have a suggestion on how to better handle this?
If I understood it correctly, we can't use
static.File
instead ofBlobFile
becauseread_blob
returns an open fd right? If we extract the path from the returned fd and use thestatic.File
this won't be applicable to other backends, as I see it, may be wrong.Thinking on it for a while, if all backends are supposed to be Twisted based, then we may be better off with a method that given a consumer (or a method to be called with partial data), connects a producer to it. I imagined this because if we go to CouchDB, then it could be a
treq.collect
to a CouchDB request calling the given method. In the case of a file, it could beFileBodyProducer
doing the same. Anyway, I am struggling to see a way to supportContent-Range
on both kinds of backends simultaneously through the same interface.Since we are kind of mixing 3 issues here (#9008 #8989 (closed) #9007 (closed)), I would go for the easiest that fit most and solve this later when we get a spec for the new backend (if we decide to add it). The easiest I see for covering #8989 (closed) and #9007 (closed) is to use
static.File
as before in the place ofBlobFile
, then adapting it later for the new backend. This could either be a direct usage ofstatic.File
or subclassing it onBlobFile
to getFile.makeProducer
method. What do you think?3rd option, since there are more two MRs depending on this: merge as it is and solve the
static.File
dilemma as part of #8989 (closed)Victor @shyba commented about 4 hours ago:
If I understood it correctly, we can't use
static.File
instead ofBlobFile
becauseread_blob
returns an open fd right?There are some issues here:
-
read_blob
was receiving arequest
, and doing it's thing to serve the blob over that request. For that, it was usingtwisted.web.static.File
, because in the fs backend case, the blob was a file. - it seemed reasonable to me to separate the request handling from the backend data handling, and that means having some way to pass data back and forth without using the
request
.
If we extract the path from the returned fd and use the static.File this won't be applicable to other backends, as I see it, may be wrong.
Not all file descriptors have paths associated with them, only the ones that actually represent a file in the filesystem.
Thinking on it for a while, if all backends are supposed to be Twisted based, then we may be better off with a method that given a consumer (or a method to be called with partial data), connects a producer to it. I imagined this because if we go to CouchDB, then it could be a treq.collect to a CouchDB request calling the given method. In the case of a file, it could be FileBodyProducer doing the same.
I think the producer/consumer approach is more abstract and more twisted-like than passing file descriptors around. I will come up with something for that and ping you.
Anyway, I am struggling to see a way to support Content-Range on both kinds of backends simultaneously through the same interface.
What do you mean by "both kinds of backends? Couch and Filesystem? Where is the difficulty?
Since we are kind of mixing 3 issues here (#9008 #8989 (closed) #9007 (closed)), I would go for the easiest that fit most and solve this later when we get a spec for the new backend (if we decide to add it). The easiest I see for covering #8989 (closed) and #9007 (closed) is to use static.File as before in the place of BlobFile, then adapting it later for the new backend. This could either be a direct usage of static.File or subclassing it on BlobFile to get File.makeProducer method. What do you think?
I still don't see a way of using
static.File
directly. Is your idea to usefd.name
on theResource
level? If we do that we would be assuming that the fd does represent a file in the filesystem, which would break the generality of the solution, wouldn't it?3rd option, since there are more two MRs depending on this: merge as it is and solve the
static.File
dilemma as part of #8989 (closed)I can rebase the other MRs over master, that is not a problem.
One question for you: will you be touching the
IBlobsBackend
for implementing content-ranges? If yes, maybe it'd be better for you to pick up this branch from the current state and then adapt the interface to what makes most sense for that solution. Does that make sense?-
changed this line in version 7 of the diff
- Resolved by Victor
mentioned in merge request !170 (merged)
added 27 commits
-
33addaba...1c533260 - 2 commits from branch
leap:master
- 271c06c6 - [doc] add api doc for blobs backend
- 4ba18283 - [refactor] remove unneeded class
- 7c4f755e - [feature] add an exception for blobs not found
- c6d8f9d8 - [refactor] make blobs backend get_tag() agnostic of twisted.web requests
- c6ab16f3 - [refactor] make blobs backend count() agnostic of twisted.web requests
- 64435d08 - [refactor] make blobs backend get_flags() agnostic of twisted.web requests
- b02bc4bc - [refactor] make blobs backend list_blobs() agnostic of twisted.web requests
- 202630bd - [refactor] make blobs backend read_blob() agnostic of twisted.web
- b8922389 - [refactor] make blobs backend set_flags() agnostic of twisted.web requests
- 6383189b - [refactor] make blobs backend write_blob() agnostic of twisted.web requests
- 03879f10 - [refactor] make blobs backend delete_blob() agnostic of twisted.web requests
- 8cd55761 - [refactor] make delete_blob() return a deferred
- 1e4cbe25 - [refactor] make get_blob_size() return a deferred
- 308ed481 - [refactor] make count() return a deferred
- d8e8c3c5 - [refactor] make list_blobs() return a deferred
- 690b888e - [refactor] make get_total_storage() return a deferred
- 4a0b2bb9 - [refactor] make get_tag() return a deferred
- 2c42af6d - [refactor] make get_flags() return a deferred
- 2f1f127b - [refactor] make set_flags() return a deferred
- 1820b8ad - [refactor] make read_blob() return a deferred
- 615514fd - [doc] improve documentation on IBlobsBackend
- ed05d944 - [bug] fix exception catching with new blobs backend interface
- b3535c72 - [bug] handle path exceptions using twisted failures
- 750398d8 - [test] fix incoming test with the new async blobs backend
- cfc7e476 - [refactor] use producer/consumer on write/read_blob respectivelly
Toggle commit list-
33addaba...1c533260 - 2 commits from branch
added 1 commit
- 67f514e8 - [refactor] use producer/consumer on write/read_blob respectivelly
added 1 commit
- aa9bd136 - [refactor] use producer/consumer on write/read_blob respectivelly