I think a neural net would be overkill here. If the mixing is done on-device, the algorithm could be as simple as overlaying the two tracks, i.e. summing their samples.
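To show how little is needed for the overlay case, here is a rough sketch of what I mean. It assumes both tracks are mono, share a sample rate, and are already decoded to floats in [-1.0, 1.0]. The name `overlay` and the gain parameters are made up for illustration; I have no idea what their actual code looks like.

```python
def overlay(track_a, track_b, gain_a=1.0, gain_b=1.0):
    """Mix two mono tracks by summing them sample by sample.

    Assumes float samples in [-1.0, 1.0] at the same sample rate
    (an assumption for this sketch, not something known about the app).
    """
    n = max(len(track_a), len(track_b))
    mixed = []
    for i in range(n):
        # Treat the shorter track as silence once it runs out.
        a = track_a[i] if i < len(track_a) else 0.0
        b = track_b[i] if i < len(track_b) else 0.0
        s = gain_a * a + gain_b * b
        # Hard-clip so the sum stays within the valid sample range.
        mixed.append(max(-1.0, min(1.0, s)))
    return mixed


mixed = overlay([0.5, -0.25], [0.25, 0.25, 0.5])
```

Hard clipping is the crudest way to handle overflow; lowering the gains before summing would avoid the distortion, but the point is that none of this needs a model.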
Splitting tracks on-device would probably use a neural net, but I don't think they're doing that.