GPU selection for LLM inference: A100 vs H100 vs CPU offloading