CSV Inference Example
Process CSV data with a pre-trained model using Valohai's execution system. This example uses TensorFlow 2.5.0 to run predictions on tabular data.
Batch inference runs as a standard Valohai execution, which means you can schedule it, trigger it via API, or chain it in pipelines.
What you'll need
Two files are available in public storage:
- Model: `s3://valohai-public-files/tutorials/batch-inference/csv-batch-inference/model.zip`
- Data: `s3://valohai-public-files/tutorials/batch-inference/csv-batch-inference/data.csv`
You don't need to download these manually. Valohai will fetch them when the job runs.
Inference code
This script loads a zipped model, processes CSV data, and outputs predictions as JSON metadata and a results file.
```python
import json
from zipfile import ZipFile

import pandas as pd
import tensorflow as tf

# Extract and load the model from Valohai inputs
with ZipFile("/valohai/inputs/model/model.zip", "r") as f:
    f.extractall()
model = tf.keras.models.load_model("model")

# Load CSV data from Valohai inputs
csv = pd.read_csv("/valohai/inputs/data/data.csv")
labels = csv.pop("target")
data = tf.data.Dataset.from_tensor_slices((dict(csv), labels))
batch_data = data.batch(batch_size=32)

# Run predictions
results = model.predict(batch_data)

# Build results dictionary: {"1": 0.375, "2": 0.76}
flattened_results = results.flatten()
indexed_results = enumerate(flattened_results, start=1)
metadata = dict(indexed_results)

# Print each result as Valohai metadata for tracking
for value in metadata.values():
    print(json.dumps({"result": str(value)}))

# Save results to Valohai outputs
with open("/valohai/outputs/results.json", "w") as f:
    # NumPy float32 values are not JSON-serializable, so stringify them
    json.dump(metadata, f, default=lambda v: str(v))
```

Key Valohai paths:

- `/valohai/inputs/model/` - where input files land
- `/valohai/outputs/` - where output files are saved and versioned
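If you'd rather not hard-code these paths, Valohai's optional `valohai-utils` helper library can resolve them for you. A minimal sketch, assuming `valohai-utils` is installed in the execution environment:

```python
import valohai

# Resolves to /valohai/inputs/model/model.zip when run inside a Valohai execution
model_zip_path = valohai.inputs("model").path()

# Returns a path under /valohai/outputs/ for the named output file
results_path = valohai.outputs().path("results.json")
```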
Define the step
Add a step like this to your valohai.yaml (the step name and script filename below are placeholders; adjust them to match your project):
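```yaml
- step:
    name: csv-batch-inference          # placeholder step name
    image: tensorflow/tensorflow:2.5.0 # matches the TF version used above
    command: python inference.py       # assumes the script is saved as inference.py
    inputs:
      - name: model
        default: s3://valohai-public-files/tutorials/batch-inference/csv-batch-inference/model.zip
      - name: data
        default: s3://valohai-public-files/tutorials/batch-inference/csv-batch-inference/data.csv
```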
Why these inputs work:
- `default` provides a starting point, but you can override it with any S3, GCS, or Azure URL
- Input names (`model`, `data`) map to `/valohai/inputs/{name}/` directories
Run the inference
Execute from your terminal with the Valohai CLI (the step name below follows the YAML sketch above):
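```bash
# --adhoc runs your local, uncommitted code; --watch streams logs
vh execution run csv-batch-inference --adhoc --watch
```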
The `--watch` flag streams logs to your terminal in real time.
Check your results
Once complete, find your outputs in the Outputs tab of the execution. You'll see:
- `results.json` with all predictions
- Execution metadata showing individual prediction values
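Given the stringification in the script above, `results.json` will contain entries like `{"1": "0.375", "2": "0.76", ...}` (illustrative values).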
Next: Learn how to schedule recurring inference runs or trigger them via the API.
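As a preview, one way to trigger this step programmatically is Valohai's REST API. A hedged sketch using the `requests` library, assuming the execution-creation endpoint `POST /api/v0/executions/`; the token, project ID, and commit identifier are placeholders you'd look up in the Valohai UI:

```python
import requests

# Placeholders: create an API token and find your project ID in the Valohai UI
resp = requests.post(
    "https://app.valohai.com/api/v0/executions/",
    headers={"Authorization": "Token YOUR_API_TOKEN"},
    json={
        "project": "YOUR_PROJECT_ID",   # placeholder project ID
        "commit": "main",               # placeholder commit identifier
        "step": "csv-batch-inference",  # step name from valohai.yaml
    },
)
resp.raise_for_status()
print("Created execution:", resp.json()["id"])
```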