vLLM Inference Server with Intel Gaudi

This document provides instructions for setting up and using vLLM with the Intel® Gaudi® AI accelerator. It covers troubleshooting tips, performance tuning guidelines, FP8 calibration and inference, and profiling. Answers to common questions are also provided to help you get started.