For the technical specifications, go to valohai.yaml endpoint section.
A deployment is a versioned collection of one or more web endpoints for online inference.
Each deployment has a deploy target, which is a Kubernetes cluster the service will be served on. The default deploy target is a shared Kubernetes cluster managed by Valohai but you can also use your own cluster.
You can have multiple deployments if you want to run your service in different geolocations.
You can create and manage deployments under
Deployment tab on Valohai web user interface.
If you only need non-interactive batch predictions (taking a lot of samples as input and writing predictions into a file), you can just create a step to handle that, take your model/samples as inputs and write your predictions to
Deployments are mainly required if one of the following is true:
- you want to get fast predictions on per-sample basis
- you want to give prediction endpoint access to application that outside of your organization e.g. a customer that doesn’t have a Valohai account
A deployment version is a Docker image that Valohai builds on top of the Docker image you specify in the
endpoint YAML definition. The build image will include 1) your code repository and 2) all files you defined in the YAML file and specified during deployment version creation.
The deployment version is the actual artifact that is served on the target Kubernetes cluster.
An endpoint is a Docker container running a HTTP server in an auto-scaling Kubernetes cluster.
You can have multiple endpoints per deployment version because a single project can have various inference needs for different contexts.
Authentication can be added in various ways but the easiest is to use HTTP Basic Auth.
You define endpoints in the valohai.yaml:
- endpoint: name: greet-me description: Responds with a big hello! image: python:3.6 port: 8000 server-command: python -m wsgiref.simple_server - endpoint: name: predict-digit description: Predict digits from image inputs ("file" parameter). image: gcr.io/tensorflow/tensorflow:1.13.1-py3 wsgi: predict_wsgi:predict_wsgi files: - name: model description: Model output file from training step. path: model.pb
A deployment version alias is a human-readable name that points to a specific deployment version e.g.
Aliases create canonical URLs for development so you can use Valohai to control which version is being served in each context. This allows you to update currently used version or rollback to previous version if something goes wrong.
For example, version alias
https://valohai.cloud/ruksi/mnist/americas/production/predict-digit could be used by applications utilizing your predictions and they don’t need to change the URL when you a release new version.
Testing an endpoint¶
You can easily test your endpoint by sending it a POST/GET/PUT request with a query and/or fields. A field could be e.g. plain text, a JSON file or an image.
Navigate inside a deployment, select the endpoint, define your request and press Send. You’ll get the response from your inference service directly in your browser.