In this how-to guide, you'll walk through an example of batch inference on a CSV file. We'll use TensorFlow 2.5.0 to run inference on the CSV file with a pretrained model.
Data and model
We’ll require two essential files for this tutorial:
- MNIST model: a pre-trained model created with TensorFlow 2.5.0. You can access it here: s3://valohai-public-files/tutorials/batch-inference/csv-batch-inference/model.zip
- CSV: a CSV file with some sample data. You can access it here: s3://valohai-public-files/tutorials/batch-inference/csv-batch-inference/data.csv
You don’t need to download these files separately; they are readily available for your job at the provided locations.
Inference code
Here’s an example of inference code that loads a model from a zip file and then runs inference on the downloaded CSV file.
The predictions are printed out as Valohai metadata and saved in a JSON file in Valohai outputs.
```python
import json
from zipfile import ZipFile

import pandas as pd
import tensorflow as tf

# Extract the pre-trained model from the zip archive
with ZipFile('/valohai/inputs/model/model.zip', 'r') as f:
    f.extractall()

# Load the model
model = tf.keras.models.load_model('model')

# Load the data
csv = pd.read_csv('/valohai/inputs/data/data.csv')
labels = csv.pop('target')
data = tf.data.Dataset.from_tensor_slices((dict(csv), labels))
batch_data = data.batch(batch_size=32)

results = model.predict(batch_data)

# Let's build a dictionary out of the results,
# e.g. {"1": 0.375, "2": 0.76}
flattened_results = results.flatten()
indexed_results = enumerate(flattened_results, start=1)
metadata = dict(indexed_results)

# Print each prediction as a Valohai metadata record
for value in metadata.values():
    print(json.dumps({"result": str(value)}))

with open('/valohai/outputs/results.json', 'w') as f:
    # The JSON library doesn't know how to serialize
    # NumPy float32 values, so we stringify them
    json.dump(metadata, f, default=lambda v: str(v))
```
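To see what the dictionary-building and serialization steps above produce, here is a small standalone sketch that substitutes a toy NumPy array for the real model predictions (the array values are made up purely for illustration):

```python
import json
import numpy as np

# Toy stand-in for model.predict() output: a (4, 1) array of scores
results = np.array([[0.375], [0.76], [0.12], [0.9]], dtype=np.float32)

# Same post-processing as above: flatten, index from 1, build a dict
flattened = results.flatten()
metadata = dict(enumerate(flattened, start=1))

# json can't serialize NumPy float32 values on its own,
# so the `default` hook stringifies them; integer keys
# become JSON object keys like "1", "2", ...
serialized = json.dumps(metadata, default=lambda v: str(v))
print(serialized)
```

Note that `json` converts the integer keys to strings automatically, which is why the resulting file has keys like `"1"` and `"2"` rather than numbers.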
Define a valohai.yaml
A batch inference job is run as a standard Valohai step, so it'll run as a regular execution.
```yaml
- step:
    name: csv-inference
    image: tensorflow/tensorflow:2.5.0
    command:
      - pip install pandas
      - python batch_inference_csv.py
    inputs:
      - name: model
        default: s3://valohai-public-files/tutorials/batch-inference/csv-batch-inference/model.zip
      - name: data
        default: s3://valohai-public-files/tutorials/batch-inference/csv-batch-inference/data.csv
```
Run from the command-line
Now you can execute your inference job:
```shell
vh execution run csv-inference --adhoc --open-browser
```
If everything went according to plan, you can now preview the results in the Outputs tab.