Validating Low-Confidence LLM Generation