Download/upload
This library supplies utility functions for file downloading and uploading. The functions support automatic retries, multipart downloads and streaming uploads. Their usage is described below.
Synchronous
- threedi_api_client.files.download_file(url: str, target: Path, chunk_size: int = 16777216, timeout: Optional[Union[float, Timeout]] = 5.0, pool: Optional[PoolManager] = None, callback_func: Optional[Callable[[int, int], None]] = None) → Tuple[Path, int]
Download a file to a specified path on disk.
It is assumed that the file server supports multipart downloads (range requests).
- Parameters:
url – The url to retrieve.
target – The location to copy to. If this is an existing file, it is overwritten. If it is a directory, a filename is generated from the filename in the url.
chunk_size – The number of bytes per request. Default: 16MB.
timeout – The total timeout in seconds.
pool – If not supplied, a default connection pool will be created with a retry policy of 3 retries after 1, 2, 4 seconds.
callback_func – Optional function that receives bytes_downloaded and total_bytes, for example: def callback(bytes_downloaded: int, total_bytes: int) -> None
- Returns:
Tuple of file path, total number of downloaded bytes.
- Raises:
threedi_api_client.openapi.ApiException – raised on unexpected server responses (HTTP status codes other than 206, 413, 429, 503)
urllib3.exceptions.HTTPError – various low-level HTTP errors that persist after retrying: connection errors, timeouts, decode errors, invalid HTTP headers, payload too large (HTTP 413), too many requests (HTTP 429), service unavailable (HTTP 503)
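The following is a minimal usage sketch of the synchronous download, not a definitive implementation. The helper `byte_ranges` is hypothetical and not part of the library; it only illustrates how a `chunk_size` might map onto HTTP `Range` headers on a server that answers with 206 Partial Content. The names `report` and `fetch` are likewise assumptions made for this example.

```python
from pathlib import Path
from typing import Iterator, Tuple


def byte_ranges(total_bytes: int, chunk_size: int = 16777216) -> Iterator[str]:
    """Yield HTTP Range header values covering total_bytes (illustration only)."""
    for start in range(0, total_bytes, chunk_size):
        stop = min(start + chunk_size, total_bytes) - 1  # Range ends are inclusive
        yield f"bytes={start}-{stop}"


def report(bytes_downloaded: int, total_bytes: int) -> None:
    """Callback with the documented (bytes_downloaded, total_bytes) signature."""
    print(f"{bytes_downloaded} of {total_bytes} bytes")


def fetch(url: str, target: Path) -> Tuple[Path, int]:
    """Download url to target and return (file path, number of downloaded bytes)."""
    # Imported lazily so the helpers above work without the library installed.
    from threedi_api_client.files import download_file

    return download_file(url, target, chunk_size=16777216, callback_func=report)
```

A call such as `fetch("https://example.com/results.nc", Path("."))` (placeholder URL) would then write into the current directory, with the filename derived from the URL as described for `target`.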
- threedi_api_client.files.upload_file(url: str, file_path: Path, chunk_size: int = 16777216, timeout: Optional[Union[float, Timeout]] = None, pool: Optional[PoolManager] = None, md5: Optional[bytes] = None, callback_func: Optional[Callable[[int, int], None]] = None, headers: Optional[Dict] = None) → int
Upload a file at specified file path to a url.
- Parameters:
url – The url to upload to.
file_path – The file path to read data from.
chunk_size – The size of the chunk in the streaming upload. Note that this function does not do multipart upload. Default: 16MB.
timeout – The total timeout in seconds. The default is a connect timeout of 5 seconds and a read timeout of 10 minutes.
pool – If not supplied, a default connection pool will be created with a retry policy of 3 retries after 1, 2, 4 seconds.
md5 – The MD5 digest (binary) of the file. Supply the MD5 to enable server-side integrity check. Note that when using presigned urls in AWS S3, the md5 hash should be included in the signing procedure.
callback_func – Optional function that receives bytes_uploaded and total_bytes, for example: def callback(bytes_uploaded: int, total_bytes: int) -> None
headers – Optional extra headers for the PUT request.
- Returns:
The total number of uploaded bytes.
- Raises:
IOError – Raised if the provided file is incompatible or empty.
threedi_api_client.openapi.ApiException – raised on unexpected server responses (HTTP status codes other than 206, 413, 429, 503)
urllib3.exceptions.HTTPError – various low-level HTTP errors that persist after retrying: connection errors, timeouts, decode errors, invalid HTTP headers, payload too large (HTTP 413), too many requests (HTTP 429), service unavailable (HTTP 503)
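Because the `md5` parameter expects the binary digest rather than the usual hex string, a short sketch may help. The helper names `md5_digest` and `upload_with_md5` are hypothetical, and the block size is an arbitrary choice for this example.

```python
import hashlib
from pathlib import Path


def md5_digest(file_path: Path, block_size: int = 1024 * 1024) -> bytes:
    """Return the binary (16-byte) MD5 digest of a file, read in blocks."""
    md5 = hashlib.md5()
    with file_path.open("rb") as f:
        # Read fixed-size blocks so large files are never loaded fully into memory.
        for block in iter(lambda: f.read(block_size), b""):
            md5.update(block)
    return md5.digest()  # binary digest, not hexdigest()


def upload_with_md5(url: str, file_path: Path) -> int:
    """Upload file_path to url with an integrity check; returns uploaded bytes."""
    # Imported lazily so md5_digest works without the library installed.
    from threedi_api_client.files import upload_file

    return upload_file(url, file_path, md5=md5_digest(file_path))
```

As noted for the `md5` parameter, a presigned AWS S3 URL would need to have been signed with the same MD5 for the server-side check to succeed.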
Asynchronous
- async threedi_api_client.aio.files.download_file(url: str, target: Path, chunk_size: int = 16777216, timeout: Optional[Union[float, ClientTimeout]] = None, connector: Optional[BaseConnector] = None, executor: Optional[ThreadPoolExecutor] = None, retries: int = 3, backoff_factor: float = 1.0, callback_func: Optional[Callable[[int, int], Awaitable[None]]] = None) → Tuple[Path, int]
Download a file to a specified path on disk.
It is assumed that the file server supports multipart downloads (range requests).
- Parameters:
url – The url to retrieve.
target – The location to copy to. If this is an existing file, it is overwritten. If it is a directory, a filename is generated from the filename in the url.
chunk_size – The number of bytes per request. Default: 16MB.
timeout – The total timeout in seconds for downloading a single chunk. By default there is no total timeout, only socket timeouts of 5 seconds.
connector – An optional aiohttp connector to support connection pooling. If not supplied, a default TCPConnector is instantiated with a pool size (limit) of 4.
executor – The ThreadPoolExecutor to execute local file I/O in. If not supplied, the default executor is used.
retries – Total number of retries per request.
backoff_factor – Multiplier for retry delay times (1, 2, 4, …)
callback_func – Optional async function that receives bytes_downloaded and total_bytes, for example: async def callback(bytes_downloaded: int, total_bytes: int) -> None
- Returns:
Tuple of file path, total number of downloaded bytes.
- Raises:
threedi_api_client.openapi.ApiException – raised on unexpected server responses (HTTP status codes other than 206, 413, 429, 503)
aiohttp.ClientError – various low-level HTTP errors that persist after retrying: connection errors, timeouts, decode errors, invalid HTTP headers, payload too large (HTTP 413), too many requests (HTTP 429), service unavailable (HTTP 503)
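The asynchronous variant expects an awaitable callback. The sketch below assumes that an `async def` function is passed; the names `progress`, `record_progress` and `fetch` are hypothetical and chosen only for this example.

```python
import asyncio
from pathlib import Path
from typing import List, Tuple

# Collected (bytes_downloaded, total_bytes) pairs, e.g. for a progress bar.
progress: List[Tuple[int, int]] = []


async def record_progress(bytes_downloaded: int, total_bytes: int) -> None:
    """Async callback with the documented (bytes_downloaded, total_bytes) signature."""
    progress.append((bytes_downloaded, total_bytes))


async def fetch(url: str, target: Path) -> Tuple[Path, int]:
    """Download url to target and return (file path, number of downloaded bytes)."""
    # Imported lazily so the callback above works without the library installed.
    from threedi_api_client.aio.files import download_file

    return await download_file(url, target, callback_func=record_progress)
```

From synchronous code, this could be started with `asyncio.run(fetch(url, Path(".")))`, where `url` is a placeholder for a real download URL.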
- async threedi_api_client.aio.files.upload_file(url: str, file_path: Path, chunk_size: int = 16777216, timeout: Optional[Union[float, ClientTimeout]] = None, connector: Optional[BaseConnector] = None, md5: Optional[bytes] = None, executor: Optional[ThreadPoolExecutor] = None, retries: int = 3, backoff_factor: float = 1.0, callback_func: Optional[Callable[[int, int], Awaitable[None]]] = None) → int
Upload a file at specified file path to a url.
- Parameters:
url – The url to upload to.
file_path – The file path to read data from.
chunk_size – The size of the chunk in the streaming upload. Note that this function does not do multipart upload. Default: 16MB.
timeout – The total timeout of the upload in seconds. By default there is no total timeout, only a socket connect timeout of 5 seconds and a socket read timeout of 10 minutes.
connector – An optional aiohttp connector to support connection pooling.
md5 – The MD5 digest (binary) of the file. Supply the MD5 to enable server-side integrity check. Note that when using presigned urls in AWS S3, the md5 hash should be included in the signing procedure.
executor – The ThreadPoolExecutor to execute local file I/O and MD5 hashing in. If not supplied, the default executor is used.
retries – Total number of retries per request.
backoff_factor – Multiplier for retry delay times (1, 2, 4, …)
callback_func – Optional async function that receives bytes_uploaded and total_bytes, for example: async def callback(bytes_uploaded: int, total_bytes: int) -> None
- Returns:
The total number of uploaded bytes.
- Raises:
IOError – Raised if the provided file is incompatible or empty.
threedi_api_client.openapi.ApiException – raised on unexpected server responses (HTTP status codes other than 206, 413, 429, 503)
aiohttp.ClientError – various low-level HTTP errors that persist after retrying: connection errors, timeouts, decode errors, invalid HTTP headers, payload too large (HTTP 413), too many requests (HTTP 429), service unavailable (HTTP 503)
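To make the interplay of `retries` and `backoff_factor` concrete, the sketch below computes a retry schedule. The formula `backoff_factor * 2**attempt` is an assumption inferred from the documented "(1, 2, 4, …)" sequence, not a confirmed description of the library's internals. The names `retry_delays` and `send` are hypothetical.

```python
import asyncio
from pathlib import Path
from typing import List


def retry_delays(retries: int = 3, backoff_factor: float = 1.0) -> List[float]:
    """Assumed schedule: wait backoff_factor * 2**attempt seconds before each retry."""
    return [backoff_factor * 2 ** attempt for attempt in range(retries)]


async def send(url: str, file_path: Path) -> int:
    """Upload file_path to url with a custom retry policy; returns uploaded bytes."""
    # Imported lazily so retry_delays works without the library installed.
    from threedi_api_client.aio.files import upload_file

    # Under the assumed formula: five retries after 0.5, 1, 2, 4 and 8 seconds.
    return await upload_file(url, file_path, retries=5, backoff_factor=0.5)
```

Under this assumption the defaults (`retries=3`, `backoff_factor=1.0`) give delays of 1, 2 and 4 seconds, which matches the default retry policy described for the synchronous functions.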