**21 Sep Update** - I edited this post and pulled the prompts into a separate doc to make this easier to digest.
Is anyone out there really good at generating images? I've invested a few hours now trying to create an image to use as my background for video chats (Zoom, Meet, Teams, etc.) and I keep running into a host of issues, such as:
1) AI not following instructions (putting objects in different places than I asked for, not following color guidance, etc.)
2) AI saying it made a requested change but the image stays exactly the same
3) AI making additional changes beyond what I asked for, despite me specifically saying to keep everything exactly the same except for my specified changes
4) Objects in the image getting cut off when I upload it as my background (and I'm unable to fix this because of issue 2)
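One possible explanation for issue 4 is an aspect-ratio mismatch: if the video app center-crops the background to 16:9, anything near the edges of a squarer image gets trimmed. That cropping behavior is an assumption here, not confirmed for Zoom, Meet, or Teams, and the image size and object box below are hypothetical. A minimal Python sketch for checking which part of an image would survive such a crop:

```python
def center_crop_box(width, height, target_ratio=16 / 9):
    """Return (left, top, right, bottom) of the area that survives a center crop.

    Assumes the app scales the image to fill the frame and trims the overflow
    evenly from both sides (an assumption, not documented app behavior).
    """
    if width / height > target_ratio:
        # Image is wider than the frame: the left and right edges get trimmed.
        visible_w = round(height * target_ratio)
        left = (width - visible_w) // 2
        return (left, 0, left + visible_w, height)
    # Image is taller than the frame (or already matches): top and bottom get trimmed.
    visible_h = round(width / target_ratio)
    top = (height - visible_h) // 2
    return (0, top, width, top + visible_h)


def is_fully_visible(obj_box, visible_box):
    """True if the object's (left, top, right, bottom) box lies inside the visible area."""
    return (
        obj_box[0] >= visible_box[0]
        and obj_box[1] >= visible_box[1]
        and obj_box[2] <= visible_box[2]
        and obj_box[3] <= visible_box[3]
    )


# Hypothetical 1024x1024 square image shown in a 16:9 frame.
visible = center_crop_box(1024, 1024)
print(visible)  # (0, 224, 1024, 800)

# Hypothetical object (say, a shelf) whose top edge sits at y=100.
print(is_fully_visible((50, 100, 300, 400), visible))  # False
```

If this is what's happening, asking for a 16:9 image up front (for example 1920x1080) and keeping key objects away from the edges may help more than asking the AI to move things after the fact.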
A few specific questions about working w/ images:
1) Is it better to only make one change at a time, or are multiple changes at once usually successful?
2) Do you reach a point where you've just been iterating on the same image too many times and the AI just sort of gives up?
3) Is it better to start w/ a more generic prompt and get really specific through modifications? Or should you start w/ a hyper-specific prompt like the one I had from Prompt Cowboy?
4) Is it helpful to do an image search for objects you're thinking of including and upload those for the AI to use as examples, vs. trying to describe the objects w/ words?
Happy to provide more details, but here's a quick recap of what I've done so far:
I started in ChatGPT. I told it what I wanted to do, found some YouTube videos about what makes a great professional Zoom background, used Tactiq to pull the transcripts, and uploaded those to give it additional context. It generated a few prompts to use w/ different tools. See the attached doc for the prompts.
I tried using the one built for ChatGPT, and the starting image wasn't terrible, but it needed some work and I ended up running into the issues mentioned above. I pasted that prompt into Prompt Cowboy and iterated on it a bit to generate a much more detailed prompt. See the attached doc for the prompt.
I've now tried this in both ChatGPT and Gemini, and I keep running into the issues mentioned above. It ends up being kind of frustrating because I feel like I'm super close to what I want, but I just can't quite get it over the finish line.
For example, I found that Gemini was generally better and much faster at following instructions and generating images, but when I tried uploading an image I was pretty happy with, the objects were getting cut off. I asked it multiple times to move the items so they wouldn't get cut off, and it kept saying it had moved them like I asked, but literally nothing changed.
I even took a screenshot of my Google Meet background so it could see how things were cut off, and it was like "Thanks for uploading the screenshot, that makes it really clear what the issue is." It then proceeded to generate about 7 images in a row where literally nothing changed. It just wouldn't move the objects to prevent them from getting cut off, despite profusely apologizing and saying it would definitely fix it this time. Then I hit my daily limit for requests on the free version of Gemini, at which point I dug back to my AF days for a string of expletives and decided to post here for some help LOL.
Sorry, I know this is a long one...and all for what was supposed to be a quick little side project, but that's how the learning process goes!