Is there a bit missing about having to test and iterate on the code to get a working version?
I have been doing this for months to create Google Apps Script scripts, though I was using Bing Chat rather than ChatGPT v4. I would ask Claude.ai v2 to analyse the errors, then feed its suggested solutions back to Bing Chat, and around and around until a working version emerged. If it got into a death spiral I would ask Bard, then get its crazy mixed-code answer fixed, and that jump-started it again.
Sometimes the working version never arrived, though I put that down to my own lack of knowledge, as I was delving into topics in which I had no prior experience.