GPU to continue (skip-by-id).

Key differences vs ``run_infer.py``:

* Loads model via :class:`vllm.LLM` (bf16 weights, batched scheduler).