Downloading Files to Executions

You frequently have changing files, such as training samples, labels or pretrained models, that you want to use during your executions. After all, what else would your code operate on?

You can either:

  1. download the files manually yourself using the tools you want; we don’t recommend this, but it might be an easy way to get started if you’ve already mastered those tools
  2. use Valohai’s input mechanism to leverage our automatic record keeping, version control, reproducibility and caching features

This section covers Valohai’s input concept in more detail.

For a file to be usable by an execution, you first have to upload it to a data store connected to the project, either manually or by using the upload utility in our web user interface.

Here is how to upload files using the web user interface after the data store has been configured:

During an execution, Valohai inputs are available under /valohai/inputs/<INPUT_NAME>/<INPUT_FILE_NAME>. To see this in action, try running ls -laR /valohai/inputs/ as the main command of an execution which has inputs.
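The same directory layout can be inspected from Python. The following is a minimal sketch, not an official Valohai helper: the function name `list_input_files` and the input name `training-images` are hypothetical, and the only thing taken from the documentation is the `/valohai/inputs/<INPUT_NAME>/` path convention.

```python
import os


def list_input_files(input_name, inputs_root="/valohai/inputs"):
    """Return the sorted paths of all files found under one input's directory."""
    input_dir = os.path.join(inputs_root, input_name)
    paths = []
    # os.walk yields nothing if the directory does not exist,
    # so a missing input simply produces an empty list.
    for dirpath, _dirnames, filenames in os.walk(input_dir):
        for filename in filenames:
            paths.append(os.path.join(dirpath, filename))
    return sorted(paths)


if __name__ == "__main__":
    # "training-images" is a hypothetical input name; use your own.
    for path in list_input_files("training-images"):
        print(path)
```

The `inputs_root` parameter exists only so the function can be tried outside an execution by pointing it at a local directory with the same layout.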

When you specify the actual value of an input, or its default, you have three options:

Option #1: Custom Store URL

You can connect private data stores to Valohai projects.

If you connect a store that contains files that Valohai doesn’t know about, like the files that you have uploaded there yourself, you can use the following syntax to refer to the files.

  • Azure Blob Storage: azure://{account_name}/{container_name}/{blob_name}
  • Google Storage: gs://{bucket}/{key}
  • Amazon S3: s3://{bucket}/{key}
  • OpenStack Swift: swift://{project}/{container}/{key}

This syntax also supports wildcards for downloading multiple files:

  • s3://my-bucket/dataset/images/*.jpg for all .jpg (JPEG) files
  • s3://my-bucket/dataset/image-sets/**.jpg for all .jpg (JPEG) files, recursing into subdirectories
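To make the difference between the two wildcards concrete, here is a small illustrative sketch. It is not Valohai's actual matching code. It assumes that `*` stays within one directory level and that `**` may cross `/` boundaries, which is how the two examples above read. The function name `wildcard_to_regex` and the file names are made up for the illustration.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regex (illustration only)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # assumption: "**" may span subdirectories
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # assumption: "*" never crosses a "/"
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    # \Z anchors the match at the end of the string.
    return re.compile("".join(parts) + r"\Z")


flat = wildcard_to_regex("s3://my-bucket/dataset/images/*.jpg")
deep = wildcard_to_regex("s3://my-bucket/dataset/image-sets/**.jpg")

# Hypothetical object keys, used only to show the difference.
print(bool(flat.match("s3://my-bucket/dataset/images/cat.jpg")))
print(bool(flat.match("s3://my-bucket/dataset/images/pets/cat.jpg")))
print(bool(deep.match("s3://my-bucket/dataset/image-sets/a/b/cat.jpg")))
```

Under these assumptions, the first pattern matches only files directly inside `images/`, while the second also matches files in nested folders below `image-sets/`.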

Tip

If you are using your own data store, we show the exact location of each file in the Data browser (2).

Where to find the file path in your data store.

Option #2: Datum URI

You can use the datum://<identifier> syntax to refer to specific files that the Valohai platform already knows about.

Files will have a datum identifier if the files were uploaded to Valohai either:

  1. by another execution, or
  2. by using the Valohai web interface uploader under the “Data” tab of the project

Tip

Find the datum URL through the “datum://” button under the “Data” tab of your project.

Where to find datum URL with identifier.

Option #3: Public HTTP(S) URL

If your data is available through an HTTP(S) address, use the URL as-is.
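The three options can be told apart by URL scheme alone. The sketch below shows one way to do that with Python's standard library. The function name `classify_input_uri`, the returned labels, and the example identifier and URLs are hypothetical. Only the schemes themselves come from the lists above.

```python
from urllib.parse import urlsplit

# Schemes listed under "Option #1: Custom Store URL".
STORE_SCHEMES = {"azure", "gs", "s3", "swift"}


def classify_input_uri(uri):
    """Return which of the three input options a URI corresponds to."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme in STORE_SCHEMES:
        return "custom-store"
    if scheme == "datum":
        return "datum"
    if scheme in {"http", "https"}:
        return "public-http"
    raise ValueError(f"Unrecognized input URI scheme: {scheme!r}")


if __name__ == "__main__":
    # The identifier and URLs below are placeholders, not real files.
    for uri in (
        "s3://my-bucket/dataset/images/*.jpg",
        "datum://example-identifier",
        "https://example.com/dataset.zip",
    ):
        print(uri, "->", classify_input_uri(uri))
```

A check like this can be handy in your own tooling for validating input values before starting an execution. Valohai itself resolves the URLs, so the function is purely a convenience.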