# System Environment Variables

Valohai automatically sets environment variables in every execution to provide context about your job and control agent behavior.

## Execution Context Variables

These variables give you information about the current execution:

```shell
# Execution paths
VH_CONFIG_DIR=/valohai/config           # Configuration files
VH_INPUTS_DIR=/valohai/inputs           # Downloaded input files
VH_OUTPUTS_DIR=/valohai/outputs         # Files to upload
VH_REPOSITORY_DIR=/valohai/repository   # Git repository code (working directory)

# Execution identifiers
VH_JOB_ID=exec-016eb6ec-50cb-0031-3f48-d556e47b1c78       # Job UUID
VH_EXECUTION_ID=016eb6ec-50cb-0031-3f48-d556e47b1c78      # Execution ID
VH_PROJECT_ID=04a37c09-dbe1-4c01-b715-0a3223c50188        # Project ID

# Pipeline context (when applicable)
VH_TASK_ID=f9c97759-513e-44a1-9666-97cf198cde80           # Task ID
VH_PIPELINE_ID=f403603b-ad11-4cc4-a90d-3118f51c8dcd       # Pipeline ID
VH_PIPELINE_NODE_ID=972834e2-23b5-429a-9f6d-80b8c4a75c8a  # Pipeline node ID

# TPU
VH_TPU=     # gRPC endpoint(s) of the allocated Cloud TPU(s), separated by spaces (set when TPUs are available)

# Distributed tasks context
VH_DIST_MEMBER_ID=abc123dfg # The name of this execution in the distributed execution group
VH_DIST_MEMBER_INDEX=1 # Zero-based index of this execution in the distributed execution group
VH_DIST_MEMBER_COUNT=3 # Number of executions in the distributed execution group

# Cloud instance IPs
VH_PUBLIC_IP=13.53.65.91    # Public IP (falls back to private if unavailable)
VH_LOCAL_IP=172.16.0.1      # Local/private IP address of cloud instance

```
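
The distributed task variables in the listing above can be used to split a workload between the members of a group. The following is a minimal sketch, assuming a simple round-robin split; `shard_for_member` is a hypothetical helper, and the fallback values exist only so the snippet also runs outside a distributed task.

```python
import os


def shard_for_member(items, env=os.environ):
    """Return the items this member of the distributed group should process.

    Assumes a round-robin split. Outside a distributed task the variables
    are unset, so the defaults describe a group with a single member.
    """
    index = int(env.get("VH_DIST_MEMBER_INDEX", "0"))  # zero-based index
    count = int(env.get("VH_DIST_MEMBER_COUNT", "1"))  # group size
    return items[index::count]
```

With `VH_DIST_MEMBER_INDEX=1` and `VH_DIST_MEMBER_COUNT=3`, a list of seven items yields the second and fifth item for this member.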

> 💡 *Use the directory variables above (such as `VH_INPUTS_DIR` and `VH_OUTPUTS_DIR`) to read inputs and write outputs in your code instead of hardcoding locations.*
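
One way to follow this advice in Python is sketched below. `valohai_dir` is a hypothetical helper, the defaults simply mirror the paths in the listing, and the input name `dataset` and the file names are placeholders. The `<inputs-dir>/<input-name>/<file>` layout is an assumption and should be checked against your own execution.

```python
import os
from pathlib import Path


def valohai_dir(variable, default):
    """Resolve a Valohai directory from its environment variable.

    The default is used only when the variable is unset, for example
    when the script runs on a local machine.
    """
    return Path(os.environ.get(variable, default))


inputs_dir = valohai_dir("VH_INPUTS_DIR", "/valohai/inputs")
outputs_dir = valohai_dir("VH_OUTPUTS_DIR", "/valohai/outputs")

# Hypothetical names: "dataset" is an input defined in your step,
# "data.csv" is a file that input downloads.
dataset_file = inputs_dir / "dataset" / "data.csv"

# Files written under the outputs directory are the ones to upload.
model_file = outputs_dir / "model.bin"
```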

## Agent Behavior Controls

Set these variables to modify how the Valohai agent handles your execution.

> :exclamation: These must be defined in your **step definition** to take effect.
>
> Environment variables set via `export` in the `commands` section only exist inside the container and won't affect agent behavior.

### **Interactive Terminal**

#### VH\_INTERACTIVE

Enables the use of the [Interactive Terminal](/development-and-debugging/interactive-terminal.md).

```shellscript
VH_INTERACTIVE=1     # true values: 1, yes, true
```
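
Boolean variables on this page list their accepted values in the comments: `1`, `yes`, `true` for true and `0`, `no`, `false` for false. To catch typos before a value goes into a step definition, a check along these lines may help. This is a sketch, not the agent's own parser: `parse_flag` is a hypothetical helper, and ignoring letter case is an assumption.

```python
TRUE_VALUES = {"1", "yes", "true"}
FALSE_VALUES = {"0", "no", "false"}


def parse_flag(value):
    """Map a documented flag value to a bool and reject anything else."""
    normalized = value.strip().lower()  # assumption: case is ignored
    if normalized in TRUE_VALUES:
        return True
    if normalized in FALSE_VALUES:
        return False
    raise ValueError(f"not a documented flag value: {value!r}")
```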

### **Data**

#### **VH\_NO\_DATA\_CACHE**

Ignore cached input data and re-download from source.

```bash
VH_NO_DATA_CACHE=1      # true values: 1, yes, true
```

Useful when you've reused URLs but the underlying data has changed.

#### **VH\_NO\_OUTPUT\_CACHE**

Prevent caching of produced output files.

```shellscript
VH_NO_OUTPUT_CACHE=1     # true values: 1, yes, true
```

Useful when your execution produces large outputs that won't be reused.\
Skipping the cache saves disk space for future executions.

#### **VH\_CLEAN**

Remove all Docker images and cached data before and after execution.

```bash
VH_CLEAN=true     # true values: 1, yes, true
```

This increases execution time but ensures a clean environment.

#### VH\_NO\_INPUT\_HASHING

Whether to skip hashing input files. This can speed up initialization of the execution, at the expense of\
not having the data hashed for integrity checking.

```shellscript
VH_NO_INPUT_HASHING=1     # true values: 1, yes, true
```

#### VH\_NO\_OUTPUT\_HASHING

Whether to skip hashing output files. This can speed up finalization of the execution, at the expense of\
not having the data hashed for integrity checking.

```shellscript
VH_NO_OUTPUT_HASHING=1     # true values: 1, yes, true
```

#### VH\_RENAME\_DUPLICATES

Whether to rename input files with duplicate filenames to avoid conflicts. Clashing filenames are suffixed with an underscore and a counter, for example `file.txt`, `file_2.txt`, and so on.

```shellscript
VH_RENAME_DUPLICATES=1     # true values: 1, yes, true
```
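
To make the naming scheme concrete, the sketch below reproduces the `file.txt`, `file_2.txt` pattern for a list of filenames. It is an illustration only: `rename_duplicates` is a hypothetical function, and placing the counter before the extension and continuing with `_3` for a third clash are assumptions extrapolated from the example above. It does not handle a renamed file colliding with another existing name.

```python
from collections import Counter
from pathlib import PurePosixPath


def rename_duplicates(filenames):
    """Suffix repeated filenames with `_<n>` before the extension."""
    seen = Counter()
    renamed = []
    for name in filenames:
        seen[name] += 1
        if seen[name] == 1:
            renamed.append(name)  # the first occurrence keeps its name
        else:
            path = PurePosixPath(name)
            renamed.append(f"{path.stem}_{seen[name]}{path.suffix}")
    return renamed
```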

#### VH\_ENABLE\_DATASET\_VERSION\_PACKAGING

Whether to enable the dataset version packaging feature. Defaults to false.

```shellscript
VH_ENABLE_DATASET_VERSION_PACKAGING=1     # true values: 1, yes, true
```

### **Configuration Files**

Each [configuration file](/executions/system-configuration-files.md) is written in both **.json** and **.yaml** format.\
The .yaml files are slightly easier for humans to read, but code can read the .json files just as well. If you don't need the .yaml versions, you can disable writing them with:

```shellscript
VH_YAML_CONFIG_FILES=0     # false values: 0, no, false
```

> :bulb: `/valohai/config/inputs.yaml` contains details about each requested file. If your execution requests a large number of files (more than 20k), generating and writing the .yaml version of this configuration file can take a while (even a few minutes).\
> If your execution does not rely on the .yaml version of this file, feel free to disable it and speed up the execution startup.
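
If the .yaml files are disabled, code can read the .json versions instead. The sketch below assumes the JSON file sits next to the YAML one with the same base name (for example `inputs.json` beside `inputs.yaml`); `load_config` is a hypothetical helper and makes no assumption about the structure of the loaded data.

```python
import json
import os
from pathlib import Path


def load_config(name, config_dir=None):
    """Load `<name>.json` from the Valohai configuration directory."""
    base = Path(config_dir or os.environ.get("VH_CONFIG_DIR", "/valohai/config"))
    with open(base / f"{name}.json", encoding="utf-8") as handle:
        return json.load(handle)
```

Inside an execution, `load_config("inputs")` would read the JSON counterpart of the file mentioned above.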

### **Logging**

#### VH\_INPUT\_LOGGING

During the input download phase, Valohai logs the status of each requested file (downloaded, found in cache, or on-demand). These messages can clutter the logs and make the rest harder to inspect.\
This environment variable accepts one of three values:

* **`VH_INPUT_LOGGING=enable`**

  The default value. Logs the status of each input file.
* **`VH_INPUT_LOGGING=disable`**

  Suppresses all input processing logs.
* **`VH_INPUT_LOGGING=file`**

  Writes logs to `/tmp/peon/runs/exec-<execution-id>/input_processing.log` and shows only minimal logs in the UI (an occasional count of downloaded files, plus errors).

  :exclamation: This file is written on the host machine, not inside the execution's container, so it is not accessible from within the execution.

### **Shared Cache**

> :exclamation: **Currently applicable only in Kubernetes environments**

When additional (shared) cache layers are used, they are expected to be remote or network file systems.\
When requested files are found in the local cache, Valohai exposes them to the execution container by creating **hard-links**. This is not possible for files found on a remote or network file system.

In that case there are two possible behaviors, controlled by the environment variables below:

* **Copying** each file from the remote file system: the execution accesses actual files, but this may take longer than a regular download (depending on the number of files).
* Creating **sym-links**: the execution starts sooner but has to access data via sym-links. This is generally safe, but some programs may misbehave when reading through sym-links.

#### VH\_ALLOW\_INPUT\_SYMLINK

By setting this environment variable to a truthy value, you instruct the Valohai agent to create sym-links pointing to the files on the remote file system.

```shellscript
VH_ALLOW_INPUT_SYMLINK=true     # true values: 1, yes, true
```

> :bulb: This behavior is **disabled** by default.

#### VH\_ALLOW\_INPUT\_COPY

Copying files is the default behavior (when **VH\_ALLOW\_INPUT\_SYMLINK** is disabled) and is also used as a fallback when sym-linking a file fails (when **VH\_ALLOW\_INPUT\_SYMLINK** is enabled).

```shellscript
VH_ALLOW_INPUT_COPY=1     # true values: 1, yes, true
```

> :bulb: This behavior is **enabled** by default.

The execution is stopped and marked as errored when:

* **VH\_ALLOW\_INPUT\_SYMLINK=0 and VH\_ALLOW\_INPUT\_COPY=0:** Neither sym-linking nor copying is allowed.
* **VH\_ALLOW\_INPUT\_SYMLINK=1 and VH\_ALLOW\_INPUT\_COPY=0, and sym-linking of a file fails:** Copying would normally be used as a fallback, but since it is disabled, the execution is stopped.
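
The interaction between the two variables can be summarized as a small decision function. This is a sketch of the rules as described in this section, not the agent's code; `resolve_shared_cache_input` is a hypothetical name, and `symlink_succeeds` stands in for the real attempt to create a link.

```python
def resolve_shared_cache_input(allow_symlink, allow_copy, symlink_succeeds):
    """Return "symlink", "copy", or "error" for one file in a shared cache."""
    if allow_symlink and symlink_succeeds:
        return "symlink"
    if allow_copy:
        # Default behavior, and the fallback when sym-linking fails.
        return "copy"
    # Nothing is allowed: the execution is stopped and marked as errored.
    return "error"
```

With the defaults (`allow_symlink=False`, `allow_copy=True`) the result is always `"copy"`.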

### **Resource Limits**

#### VH\_CPU\_LIMIT

Limit CPU cores available to the job.

```bash
VH_CPU_LIMIT=2        # Use 2 cores
VH_CPU_LIMIT=0,2,4    # Use specific core indices
```

#### VH\_MEMORY\_LIMIT

Set memory usage limit.

```bash
VH_MEMORY_LIMIT=500M   # 500 megabytes
VH_MEMORY_LIMIT=2G     # 2 gigabytes
```
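
When sizing a workload against the limit, it can help to convert the value to bytes. The sketch below handles only the two suffixes shown above; `memory_limit_bytes` is a hypothetical helper, and treating `M` and `G` as binary multiples (1024-based) is an assumption that this page does not confirm.

```python
UNIT_BYTES = {"M": 1024**2, "G": 1024**3}  # assumption: binary multiples


def memory_limit_bytes(value):
    """Convert a limit such as "500M" or "2G" to a number of bytes."""
    text = value.strip().upper()
    if len(text) < 2 or text[-1] not in UNIT_BYTES or not text[:-1].isdigit():
        raise ValueError(f"unsupported memory limit: {value!r}")
    return int(text[:-1]) * UNIT_BYTES[text[-1]]
```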

#### VH\_SHM\_SIZE

Increase shared memory directory size.

```bash
VH_SHM_SIZE=16G       # 16 gigabytes
```

Useful for applications that need large shared memory (e.g., PyTorch DataLoader).

### **Docker**

#### VH\_DOCKER\_NETWORK

Name of the network in which to create the execution container.

```shellscript
VH_DOCKER_NETWORK="training-net"
```

#### VH\_EXPOSE\_PORTS

Comma-separated list of port mappings (\<host-port>:\<container-port>) to expose on the execution container.

```shellscript
VH_EXPOSE_PORTS='8080:80, 22:22'
```
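
A malformed mapping is easy to miss in a comma-separated string, so it may be worth checking the value before using it. The sketch below parses the `<host-port>:<container-port>` format into integer pairs; `parse_port_mappings` is a hypothetical helper, and tolerating spaces around commas follows the example above rather than a documented rule.

```python
def parse_port_mappings(value):
    """Parse "<host-port>:<container-port>" pairs into (host, container) tuples."""
    mappings = []
    for item in value.split(","):
        item = item.strip()  # tolerate the space after each comma
        if not item:
            continue
        host, container = item.split(":")  # raises ValueError if malformed
        mappings.append((int(host), int(container)))
    return mappings
```

For the value shown above, this returns `[(8080, 80), (22, 22)]`.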

#### VH\_INIT

Whether to use an `init`-like daemon in the Docker container. Defaults to **true**, since most workloads don't properly handle signals. In case a workload does handle signals properly, this can be set to a falsy value.

```shellscript
VH_INIT=0     # false values: 0, no, false
```

#### VH\_ALLOW\_ENTRYPOINT

Whether to use the `ENTRYPOINT` defined in the Docker image.

```shellscript
VH_ALLOW_ENTRYPOINT=1     # true values: 1, yes, true
```

#### **VH\_NO\_IMAGE\_CACHE**

Force Docker image re-pull, ignoring local cache.

```bash
VH_NO_IMAGE_CACHE=true     # true values: 1, yes, true
```

### **Storage Behavior**

#### VH\_TMPFS

Control whether `/tmp` writes to memory or disk.

```bash
VH_TMPFS=0    # false values: 0, no, false
```

By default, `/tmp` is a memory filesystem. Setting this to false writes to disk instead, which is slower but avoids out-of-memory errors for large temporary files.

### **GPU**

#### VH\_RESET\_GPU

Whether to issue an NVIDIA GPU reset command before running the container.

```shellscript
VH_RESET_GPU=1    # true values: 1, yes, true
```

#### VH\_XORG

Whether to start an Xorg server for the execution's duration.

```shellscript
VH_XORG=1    # true values: 1, yes, true
```

### **Code**

#### VH\_CHOWN\_REPOSITORY

Whether to try to `chown` the repository directory to the user running the container.

```shellscript
VH_CHOWN_REPOSITORY=1    # true values: 1, yes, true
```


