The reason this sort of speech pattern is still mostly intelligible is that consonants are, as a general rule, more important cues to the listener than vowels.
For example, you could probably interpret "thuh kwuhck bruhn fuhx juhmps uhvr thuh luhzy dug" quite readily, but not "duh dwid drowd dod dudd oded duh dudy dod".
It is true that vowels are sometimes the distinguishing phoneme between minimal pairs, but the reality is that we are quite adept at calibrating ourselves to cope with vowel variation.
One example is actually head size, of all things. I won't go into all the psychoacoustics involved (not my area of expertise), but your brain adjusts its expectations for how a given vowel should sound based on the speaker's unique "instrumental" qualities. In other words, you can end up with two objectively very different waveforms that your brain is able to match up without hesitation.
A more familiar example is dialects. A Southern American speaker might say "ice", and to a General American speaker, it may sound much more like "ass", but this kind of variation doesn't really lead to the hilarious confusion one might expect. After a brief "scan" of another dialect, we usually adapt fairly quickly, and we retain those "settings" for the next time we hear a similar dialect.
One example that's not so easy to work around might be the pen-pin merger. In general, it leads to few mix-ups. But for those speakers, "pen" and "pin" are homophones, and unlike other such pairs (bin-Ben, lint-lent, mint-meant, tin[t]-ten[t], win-when, etc.) they may not be easily differentiable by context. In response, many speakers refer to pens as "ink pens" and pins as "stick pins", which avoids the ambiguity (animal pens and female swans aren't really an issue, I suppose).
So even when vowels lead to genuine ambiguity, we get around it pretty easily.
Is this really an instance of consonants versus vowels, or an instance of a large group versus a small group? The biggest reason your second example is hard to make out is likely that you are changing a larger percentage of each word than in the first example. If you selected five consonants at random and changed them, I have a feeling you would get results similar to changing the vowels.
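For what it's worth, the "percentage of the word" point can be roughly checked at the letter level. The sketch below is only an illustration under some loud assumptions: it counts spelling rather than phonemes, it treats "y" as a consonant, and the helper name `letter_shares` is made up here.

```python
# Rough letter-level tally; spelling is only a stand-in for phonemes.
VOWELS = set("aeiou")  # assumption: "y" is counted as a consonant


def letter_shares(sentence):
    """Return (vowel_share, consonant_share) over alphabetic characters."""
    letters = [ch for ch in sentence.lower() if ch.isalpha()]
    vowel_count = sum(ch in VOWELS for ch in letters)
    total = len(letters)
    return vowel_count / total, (total - vowel_count) / total


vowel_share, consonant_share = letter_shares(
    "the quick brown fox jumps over the lazy dog"
)
# prints: vowels: 31%, consonants: 69%
print(f"vowels: {vowel_share:.0%}, consonants: {consonant_share:.0%}")
```

On this count, flattening every consonant touches roughly twice as many letters as flattening every vowel. That fits the worry about group size, though it doesn't by itself settle how much information each kind of sound carries.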