The FMV (GCC 4.8) and target_clones (GCC 6.0) features are intended to be easier-to-use wrappers around ifunc. FMV-4.8 (C++) generates the ifunc dispatching code for the programmer, and FMV-6.0 (C/C++) goes on to do the code duplication for the programmer too. The latter was added on the assumption that GCC's optimization / vectorization magic is good enough to get significant speedups by just changing -march options on the exact same piece of code.
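For a feel of what the 6.0-style version looks like, here's a minimal sketch. The function name, the MULTIVERSION macro and the target list are made up for illustration, and the guard is my assumption about where the attribute is safe to use (GCC on x86-64 with glibc); anywhere else it compiles as a plain function.

```c
#include <stddef.h>
#include <stdlib.h> /* on glibc this makes __GLIBC__ visible */

/* Hypothetical macro: only request clones where GCC + ifunc support is
 * expected (an assumption, not an exhaustive portability check). */
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && defined(__GLIBC__)
#define MULTIVERSION __attribute__((target_clones("avx2", "sse4.2", "default")))
#else
#define MULTIVERSION
#endif

/* One source body. With the attribute active, GCC compiles it once per
 * listed target and emits the ifunc resolver that picks a clone. */
MULTIVERSION
float sum_squares(const float *v, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += v[i] * v[i];
    return acc;
}
```

Callers just call sum_squares(); nothing at the call site knows there are several copies.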
> the question is how many exceptions would arise, and what the speed penalty would be to fall back to the next slower implementation.
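This can be done essentially for free with ifuncs: the resolver runs once, when the dynamic linker binds the symbol, so there is no feature check on each call afterwards.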
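A hand-rolled sketch of that fallback chain, assuming GCC (or a compatible compiler) on x86-64 with glibc. All the function names here are hypothetical, and the three variants share one body purely to keep the example short; on other platforms the guard falls back to a plain call.

```c
#include <stddef.h>
#include <stdlib.h> /* on glibc this makes __GLIBC__ visible */

/* Baseline implementation, always available. */
static long sum_generic(const int *v, size_t n)
{
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

#if defined(__GNUC__) && defined(__x86_64__) && defined(__GLIBC__)
/* Same loop, but the compiler is allowed to use SSE4.2 / AVX2 here. */
__attribute__((target("sse4.2")))
static long sum_sse42(const int *v, size_t n)
{
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

__attribute__((target("avx2")))
static long sum_avx2(const int *v, size_t n)
{
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Resolver: called once by the dynamic linker. It walks the tiers from
 * fastest to slowest and returns the first one the CPU supports. */
static long (*resolve_sum(void))(const int *, size_t)
{
    __builtin_cpu_init(); /* needed before __builtin_cpu_supports in a resolver */
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2;
    if (__builtin_cpu_supports("sse4.2"))
        return sum_sse42;
    return sum_generic;
}

long sum_i32(const int *v, size_t n) __attribute__((ifunc("resolve_sum")));
#else
/* No ifunc support assumed: call the baseline directly. */
long sum_i32(const int *v, size_t n)
{
    return sum_generic(v, n);
}
#endif
```

Falling back to the next slower implementation is just the next branch in the resolver, and that cost is paid once at bind time rather than per call.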
Here's a crummy example from the GCC docs: https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.ht...
Here's a more thorough and (IMO) intelligible treatise on the subject: https://lwn.net/Articles/691932/