Advanced text-to-image model for improved image fidelity