You can drop an audio file, such as a .wav or .mp3, into Google AI Studio (this is different from the regular Gemini web app, which can't do this yet).
So yeah, I'm working on an entry for the Inspiring Asia film contest, and I was able to work with Gemini in AI Studio to break down my soundtrack.
It can tell me:
- overall mood/vibes
- musical structure, aka harmony and melodic arcs
- volume dynamics
- tempo and rhythm
- types of instruments
- and whatever else you need
This helps my project because the music can guide the visual pacing of the story. For example, I uploaded the soundtrack below to Gemini 2.5 Pro in AI Studio and it gave me this:
- Mood: A psychological fever dream, moving from frantic struggle to a catastrophic breakdown, and finally, a tearful release.
- Story: A musical battle. Hikari's complex talent is attacked by a dark, judgmental theme (the pressure). The music is a tense, minor-key conflict that finally resolves into a soaring, major-key feeling of freedom.
- Sound: Extreme dynamics map her emotional journey: from loud and frantic, to a fragile whisper, to a crushing peak, followed by total silence, and a final, hopeful return.
- Instrument: A single solo piano, which is both percussive and vulnerable, completely exposing the performer with nowhere to hide.
That's just a taste of what it can do. You can ask it for a more detailed analysis that ties into your story, so you can judge whether the music "hits".
Also, make sure to use the Gemini 2.5 Pro model.
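
If you'd rather script this than click through the AI Studio page, here's a rough sketch in Python. To be clear, this is not what I did above (I used the web page); it assumes the `google-genai` package is installed and a `GEMINI_API_KEY` environment variable is set. The names `ASPECTS`, `build_prompt`, and `analyze_audio` are made up for illustration, and the model id string is my assumption for Gemini 2.5 Pro.

```python
# Aspects taken from the list in this post; edit to suit your own project.
ASPECTS = [
    "overall mood and vibe",
    "musical structure (harmony and melodic arcs)",
    "volume dynamics",
    "tempo and rhythm",
    "types of instruments",
]


def build_prompt(aspects, story_note=""):
    """Return a plain-text prompt asking for one numbered section per aspect."""
    lines = ["Analyze the attached soundtrack. Cover each of these:"]
    lines += [f"{i}. {a}" for i, a in enumerate(aspects, start=1)]
    if story_note:
        # Optional: tie the analysis back to your film's story.
        lines.append(f"Relate the analysis to this story: {story_note}")
    return "\n".join(lines)


def analyze_audio(path, prompt, model="gemini-2.5-pro"):
    """Upload an audio file and ask the model about it (needs network + API key).

    Assumption: the google-genai SDK is installed and GEMINI_API_KEY is set.
    """
    from google import genai  # imported here so build_prompt works without the SDK

    client = genai.Client()                  # reads the API key from the environment
    audio = client.files.upload(file=path)   # e.g. a .wav or .mp3
    response = client.models.generate_content(model=model, contents=[prompt, audio])
    return response.text


if __name__ == "__main__":
    # Only prints the prompt; call analyze_audio("your_track.mp3", prompt) yourself.
    print(build_prompt(ASPECTS, "a pianist under pressure"))
```

Keeping the prompt builder separate means you can tweak the list of aspects and preview the prompt without spending any API calls.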