There are a couple of GitHub discussions on the Whisper repository with various fixes/hacks to deal with it: https://github.com/openai/whisper/discussions/679 https://github.com/openai/whisper/discussions/813
If you get a chance, I encourage you to try out the other newer models I mentioned. I think you'd be very impressed.
As for the silence, I wonder why the model even receives it. I would think a lot of that would be compressed out of existence to save bandwidth.