The fit, imho, is that if you stick to numerics you'd have to try pretty hard not to write code in a way that could be easily translated into something runnable on a GPU (or Larrabee).
...(and GPUs / Larrabee etc. aren't solely vector processors, but the idea carries over).
Most of the bulk numeric operations in an array language map pretty nicely onto the data-parallel model you need to use to take advantage of a GPU or Larrabee (if it ever shows up); in particular, take a look through this:
...and see how much more straightforward it'd be to take advantage of (compared to SSE and so on). Your interpreter has to be a little more sophisticated (work has to be kept in units of 512 bytes), but it seems much more tractable than it used to be.
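The chunked, data-parallel execution described above can be sketched in Python, with NumPy standing in for the array-language runtime. (`chunked_map` and the 512-byte chunk size are illustrative assumptions, not any real interpreter's API.)

```python
import numpy as np

CHUNK_BYTES = 512  # illustrative batch size, echoing the units above


def chunked_map(op, a, b):
    """Apply a binary element-wise op over fixed-size chunks.

    Each chunk is independent of the others -- exactly the property a
    GPU (or a wide-vector core like Larrabee) needs in order to run
    them in parallel. A scalar loop with loop-carried dependences
    wouldn't decompose this way.
    """
    elems = CHUNK_BYTES // a.itemsize  # elements per 512-byte chunk
    out = np.empty_like(a)
    for start in range(0, a.size, elems):
        sl = slice(start, start + elems)
        out[sl] = op(a[sl], b[sl])  # no dependence between chunks
    return out


a = np.arange(1000, dtype=np.float32)
b = np.arange(1000, dtype=np.float32)
assert np.array_equal(chunked_map(np.add, a, b), a + b)
```

The point is only that an array primitive like `a + b` is already a pure per-element map, so the interpreter is free to batch it however the hardware likes.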
Since this isn't a new idea, there's history to learn from: offloading work to vector units used to give you a speedup, but not a cost-proportionate one. Now, given the performance differential between CPUs and GPUs and their relative costs, it starts making sense again.
What specifically do you see as the fit between GPUs and array languages?