These notes standardize the names we use for the data transfer features.

Chunking:
- Suppose a 100 MB file and a configured maximum chunk size of 1 MB.
- In pseudocode: `chunks = lambda content: content.split_in_chunks(MAX_CHUNK_SIZE)`
- The 100 MB file is represented as 100 chunks of 1 MB each.
- Chunking is implemented in the application layer and is unrelated to any splitting done at the network level.
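The chunking rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: `split_in_chunks` is written here as a free function rather than a method on the content, and treating 1 MB as 2**20 bytes is an assumption.

```python
from typing import Iterator

# Configured maximum chunk size. Treating 1 MB as 2**20 bytes is an assumption.
MAX_CHUNK_SIZE = 1024 * 1024


def split_in_chunks(content: bytes, max_chunk_size: int = MAX_CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of `content`, each at most `max_chunk_size` bytes."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    for start in range(0, len(content), max_chunk_size):
        yield content[start:start + max_chunk_size]


# Scaled-down demo: 10 bytes with a 4-byte limit give chunks of 4, 4 and 2 bytes.
# With the default limit, a 100 MB payload would give 100 chunks in the same way.
demo_chunks = list(split_in_chunks(b"abcdefghij", max_chunk_size=4))
```

Only the last chunk can be shorter than the limit, and joining the chunks in order gives back the original content.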

Batching:
- There are **N** documents to send (how many go into one batch is configurable, for example with a total-size limit such as `sum(size(b) for b in blobs) < X`).
- Sending in a batch means that all **N** documents are sent in a single (HTTP) request.
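One way to apply a total-size limit like the one above is to group documents greedily. The sketch below is an assumption about how such a limit could work, not a description of the actual feature. `make_batches` and `max_batch_bytes` are hypothetical names, and the HTTP call itself is left out: each yielded batch stands for the body of one request.

```python
from typing import Iterable, Iterator, List


def make_batches(blobs: Iterable[bytes], max_batch_bytes: int) -> Iterator[List[bytes]]:
    """Group blobs greedily so each batch's total size stays below the limit.

    A blob that reaches the limit on its own is still emitted, as a batch by itself.
    """
    batch: List[bytes] = []
    batch_bytes = 0
    for blob in blobs:
        # Close the current batch if adding this blob would reach the limit.
        if batch and batch_bytes + len(blob) >= max_batch_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(blob)
        batch_bytes += len(blob)
    if batch:
        yield batch


# Each batch would be sent as one request; here we only build the groups.
documents = [b"aaa", b"bb", b"cccc", b"d"]
batches = list(make_batches(documents, max_batch_bytes=6))
```

With a 6-byte limit the four documents end up in two batches, so two requests would be made instead of four.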

Streaming:
- Choose a maximum chunk size of **Y** MB.
- Suppose we need to transfer a total of **X** MB (with **X** > **Y**) taken from several blobs, or even from all of them.
- Memory used for the transfer on both client and server stays at about **Y** MB at most while the **X** MB are sent, since only one chunk is held at a time.
- The receiving end parses the single stream of data back into the original blobs that were used as input.
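The streaming idea above can be sketched with a simple length-prefixed framing. The framing (an 8-byte big-endian length before each blob) and the names `encode_stream`, `decode_stream` and `collect` are assumptions made for illustration; the notes do not specify a wire format. Blobs are in-memory `bytes` here for brevity, whereas a real sender would read each blob lazily.

```python
import struct
from typing import Iterable, Iterator, List, Tuple

HEADER = struct.Struct(">Q")  # hypothetical framing: 8-byte big-endian blob length


def encode_stream(blobs: Iterable[bytes], max_chunk_size: int) -> Iterator[bytes]:
    """Yield all blobs as one stream; payload pieces are at most max_chunk_size bytes."""
    for blob in blobs:
        yield HEADER.pack(len(blob))
        for start in range(0, len(blob), max_chunk_size):
            yield blob[start:start + max_chunk_size]


def decode_stream(pieces: Iterable[bytes]) -> Iterator[Tuple[int, bytes]]:
    """Yield (blob_index, data) parts; pieces may be split at arbitrary boundaries."""
    buffer = b""
    index = 0
    remaining = None  # bytes still expected for the current blob; None = header needed
    for piece in pieces:
        buffer += piece
        while True:
            if remaining is None:
                if len(buffer) < HEADER.size:
                    break  # wait for the rest of the header
                (remaining,) = HEADER.unpack(buffer[:HEADER.size])
                buffer = buffer[HEADER.size:]
                yield index, b""  # announce the blob so empty blobs are not lost
            if remaining > 0:
                if not buffer:
                    break  # wait for more payload
                data, buffer = buffer[:remaining], buffer[remaining:]
                remaining -= len(data)
                yield index, data
            if remaining == 0:
                index += 1
                remaining = None
    if remaining is not None or buffer:
        raise ValueError("stream ended in the middle of a blob or header")


def collect(parts: Iterable[Tuple[int, bytes]]) -> List[bytes]:
    """Reassemble whole blobs from decoded parts (demo only; this holds everything in memory)."""
    blobs: List[bytearray] = []
    for index, data in parts:
        if index == len(blobs):
            blobs.append(bytearray())
        blobs[index] += data
    return [bytes(b) for b in blobs]


original = [b"hello world", b"", b"abc"]
restored = collect(decode_stream(encode_stream(original, max_chunk_size=4)))
```

The decoder buffers only the current piece plus a partial header, so its memory use does not grow with the total amount transferred. A receiver that wants to keep that bound would write each part to its destination as it arrives instead of calling `collect`.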