Been capturing this exact pattern for about three years now with my Zoom H5 and a couple of external mics - an Audio-Technica AT2020 and a cheap dynamic I keep around specifically because the noise floor behaves differently.
The static you're describing isn't random. I started noticing it clusters after the question, usually in that 2–4 second window before any potential response. My working theory is that whatever process generates an EVP response also produces some kind of electromagnetic precursor - the static is almost like the channel opening up.
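If you want to put a number on that clustering, here's a rough sketch of how I'd check it. It assumes you've already logged, by hand or otherwise, the time of each question and the start time of each static burst in seconds from the start of the recording. The function name and the example timestamps are made up for illustration, not real session data.

```python
def bursts_in_window(question_times, burst_times, lo=2.0, hi=4.0):
    """Return the bursts that start between lo and hi seconds after any question."""
    hits = []
    for b in burst_times:
        # A burst counts if it falls in the window after at least one question.
        if any(lo <= b - q <= hi for q in question_times):
            hits.append(b)
    return hits


# Hypothetical timestamps in seconds, purely illustrative.
questions = [10.0, 45.0, 80.0]
bursts = [12.5, 30.0, 48.9, 86.0]

hits = bursts_in_window(questions, bursts)
fraction = len(hits) / len(bursts)
print(hits, fraction)
```

The fraction only means something when you compare it with how much of the session those windows cover. If the 2–4 second windows make up a tenth of the recording and a tenth of the bursts land in them, that's what chance would give you.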
What I do now is run a secondary recorder simultaneously, a little Olympus WS-853 sitting a metre away. If the static appears on both units independently, I flag it as genuinely interesting. If it's only on one, it's probably interference from something mundane - phone signal, wiring in the building, that sort of thing.
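For anyone who wants to try the comparison without doing it all by ear, this is a minimal sketch of the cross-check. It assumes both recordings have been lined up to a common start (a hand clap at the beginning of the session works) and that you've noted burst start times for each unit in seconds. The function name, the 0.25 second tolerance and the timestamps are my own placeholders, so adjust to taste.

```python
def match_bursts(bursts_a, bursts_b, tolerance=0.25):
    """Split recorder A's bursts into (on_both, only_on_a).

    A burst on A counts as 'on both' if recorder B has a burst
    starting within `tolerance` seconds of it.
    """
    on_both, only_on_a = [], []
    for a in bursts_a:
        if any(abs(a - b) <= tolerance for b in bursts_b):
            on_both.append(a)
        else:
            only_on_a.append(a)
    return on_both, only_on_a


# Hypothetical burst times in seconds, purely illustrative.
h5 = [12.5, 48.9, 86.0]
ws853 = [12.6, 86.1, 120.0]

both, only_h5 = match_bursts(h5, ws853)
print("flag as interesting:", both)
print("probably interference on one unit:", only_h5)
```

Run it both ways round (swap the arguments) to catch bursts that only the second recorder picked up.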
The pattern is more pronounced in certain locations too. A farmhouse I visited near Brecon last autumn was producing this almost every single session. It was consistent enough that I started mapping where in the room the static spiked, using a basic EMF meter alongside the audio.
Worth asking - what are you recording on, and are you monitoring in real time or reviewing afterwards? That changes how you interpret what you're hearing quite a bit. A lot of people miss the static entirely because they're only skimming for voice-like sounds.
Would be curious whether anyone else has tried the dual-recorder method and compared results. Feels like the kind of thing we could build a proper methodology around if enough people are seeing the same thing.