Inputs are normally downloaded in full prior to the execution starting to run, but they can optionally be configured to be downloaded on demand during execution runtime.
Executions can also decide to not download some or all on-demand inputs, or download only part of the set of files associated with the input.
Configuration
To begin using on-demand inputs, mark some inputs with the download: on-demand
property.
- step:
name: on-demand
image: python:3.12
command:
- pip install valohai-utils
- ls -R /valohai/inputs
- python ./on_demand.py
inputs:
- name: on_demand_input
download: on-demand
- name: normal_input
Using on-demand inputs with valohai-utils
If you are using a Python module as your command, you can use valohai-utils
to download input files
prior to using them in code.
import valohai
import base64
for filepath in valohai.inputs("on_demand_input").paths():
print(f"Got file path: {filepath}")
with open(filepath, "rb") as f:
print("File header (base64):")
print(base64.b64encode(f.read(10)))
Using valohai.inputs('on_demand_input').paths()
will download the on-demand input files and the for
loop
block will be able to access and read them.
Note
valohai-utils
will always download every file in the input, even if you use a filter to get only some files.
Note
While you can also use valohai.inputs('on_demand_input').stream()
and .streams()
, input files will be
downloaded in full prior to yielding any data.
Note
valohai-utils
does not support machine role or identity based download authentication. Storage
backends use pre-signed URLs by default to authenticate downloads, which is supported.
Using on-demand inputs manually
For other programming environments and for custom download logic, you can download inputs using the input request API available inside the execution.
Accessing the input request API
The details needed to request input data are found in /valohai/config/api.json
(or /valohai/config/api.yaml
).
Read the file and get the “input_request” key to find the endpoint access information. The config data structure looks like this:
{
"input_request": {
"method": "POST",
"url": "https://app.valohai.com/input-request-data/",
"headers": {
"Authorization": "Execution-Token {unique token}"
}
}
}
The Authorization
header contained is unique to each execution and API endpoint, and is only valid during the
execution’s runtime. Make a HTTPS request using these details: with the given method (POST), URL and headers.
Optional: Selecting inputs returned by the input request API
The input request API returns data for all inputs for the execution.
If you are only interested in specific inputs, you can limit the response to those by supplying
the input
GET parameter containing an input ID. The GET parameters are the part of the URL after the question mark.
This parameter can be repeated to request several inputs.
You can find the input IDs for each input in the input configuration files, /valohai/config/inputs.json
or
/valohai/config/inputs.yaml
. Sample configuration:
{
"on_demand_input": {
"input_id": "5c83f6fb-1cb2-4fc3-bda6-2cbd7461f16f",
"files": [
{
"checksums": {
"md5": "57e6a6b5a93dd6e374486942f87607da",
"sha1": "173f7f5960a68318e4759f210a2f74c1d824f13a",
"sha256": "0f442236de57cf7351d2029cda990f712cffde85ad1e4c9b5e0e819dd8e93bb4"
},
"datum_id": "017ca34a-7ba2-42cb-9d7e-b2db548f52cb",
"input_id": "f2da3706-8ba9-47d3-a90e-81d39804d776",
"name": "model-92998463-98a8-42c3-b329-a139b69fcec5.pb",
"path": "/valohai/inputs/on_demand_input/model-92998463-98a8-42c3-b329-a139b69fcec5.pb",
"size": 1592920,
"uri": "datum://017ca34a-7ba2-42cb-9d7e-b2db548f52cb"
}
]
}
}
Read the file and get the input information by name using key lookup. The input ID is available
under the input_id
key.
Reading the input request API response
The API response contains valid download URLs for each file for each input requested in the url
field.
Sample response:
[
{
"name": "on_demand_input",
"files": [
{
"filename": "model-92998463-98a8-42c3-b329-a139b69fcec5.pb",
"original_uri": "s3://example-bucket/models/A7rqTQbo.pb",
"url": "https://example-bucket.s3.amazonaws.com/models/A7rqTQbo.pb?AWSAccessKeyId={...}",
"input_id": "f2da3706-8ba9-47d3-a90e-81d39804d776",
"download_intent": "on-demand"
}
]
}
]
You can now select which files you want and download them using the url
response field to the execution.
The url
field typically contains a pre-signed URL that can be used to download the input file right away.
Note
It is possible to configure Valohai to not generate pre-signed URLs for inputs. In this case you will need to download the file with alternate means of authenticating the download, such as an instance role.
Note
The filename
field is not always available. The original_uri
, url
, input_id
and download_intent
are always available.
While Valohai will download input files into standard /valohai/{input-name}/{filename}
paths,
you can save the downloaded files anywhere in the execution’s working directory.
Summary
Here’s a pseudocode summary of how to download an on-demand input with custom code:
input_config = load_json(/valohai/config/inputs.json)
api_config = load_json(/valohai/config/api.json)
input_request_config = api_config[input_request]
input_id_to_download = input_config[on_demand_input][input_id]
input_request = http_request(
method=input_request_config[method]
url=input_request_config[url]
query_parameters={input: input_id_to_download}
headers=input_request_config[headers]
)
for each result in load_json(input_request)
if result[name] == on_demand_input
url_to_download = result[files][0][url]
break loop
end if
end for each
download_input(url_to_download)