Why do you want the music to sound like it is coming from your phone while you are on a bus? FWIW, my phone, when I am on a bus, is in some random location as I move it between hands, put it in my pocket or attempt to prop it on a ledge, and I'm probably often staring blankly out the window... am I somehow crazy? Why does the iPhone keep wanting me to think the audio is coming from my phone?
I often watch movies with AirPods Max in the living room. Multiple times I've had to take off the headphones to make sure I'm not accidentally blasting whatever I'm watching through the home theater setup instead.
With "normal" headphones the sound follows your head, so if you glance to the side to see where your snacks are, the voice turns with you. With spatial audio the voice still sounds like it's coming from where the TV (or iPad or iPhone) is.
This sounds like a cool engineering project in search of a reason to exist. The fact that they're trying to "encourage" artists to encode with this does not diminish this impression.
It exists because of the products they haven't released yet. Spatial audio is part of the AR experience they want to release in the future, and it just so happens they can ship it before AR is ubiquitous, so now they have a testbed for it with existing devices.
I enjoy it. Mono is fine for getting music to our ears, but then we invented stereo. Stereo is fine for getting music to our ears, but then we invented spatial audio. Neither stereo nor spatial are required to hear and enjoy music. However, I vastly prefer stereo to mono, and I can imagine a near future where I'd prefer spatial to both.
It sounded gimmicky to me until I experienced it. Now I get why Apple (and others!) are pushing it.
I really dislike Spatial Audio mixes for music. It does not make sense: I don't want the sounds coming from every direction, and music is never experienced like that in reality.
I've talked to some musician friends about it, and they don't like the experience either, neither as listeners nor as musicians.
It's a cool gimmick for some art projects, but for general listening it has no improvements over stereo while making the process of mixing much more complicated.
Besides what the other commenter said (Apple adds support for things like this with a long lead time ahead of the use cases they really intend it for), everyone I've talked to about it loves spatial audio. There really is something compelling about music that sounds like it comes from around you in the environment rather than just from inside your head.
I suspect there is some effect where people are primed to have positive impressions of things they spent a lot of money on. That's not to say no one would have enjoyed it in a blind test, but I think some fraction will be predisposed to enjoy it to justify their decision. I've also heard plenty of firsthand accounts from people who didn't like it. Benn Jordan's video on the subject is the most comprehensive treatment I've seen.