Cool project; I'm working on something similar. But I'm using BFSK, which means two frequencies instead of 16.
You don't really need an FFT, which is expensive, to decode these 16 frequencies. Instead, you can shape the signal into a square wave with a software Schmitt trigger, then measure the time between transitions. When n successive transition intervals indicate the same frequency, you emit the corresponding nibble.
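Here is a rough sketch of that idea in Python, just to make it concrete. Everything specific in it is my own assumption, not something from your project: the thresholds (±0.2), the 5% tolerance, the helper names, and the tone table you pass in. It also emits only once per run of matching intervals, so two identical nibbles back to back would need real symbol timing on top of this.

```python
import math

def make_tone(freq, sample_rate, num_samples):
    """Hypothetical test helper: a unit-amplitude sine starting at phase 0."""
    return [math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(num_samples)]

def schmitt_transitions(samples, high=0.2, low=-0.2):
    """Return the sample indices where a hysteresis comparator flips state.

    The output only goes high above `high` and only goes low below `low`,
    so noise between the two thresholds does not cause extra transitions.
    """
    state = None  # unknown until the signal first leaves the dead band
    transitions = []
    for i, x in enumerate(samples):
        if x > high and state is not True:
            if state is not None:
                transitions.append(i)
            state = True
        elif x < low and state is not False:
            if state is not None:
                transitions.append(i)
            state = False
    return transitions

def decode(samples, sample_rate, freqs, n=4, tolerance=0.05):
    """Emit the index into `freqs` once n successive intervals agree."""
    transitions = schmitt_transitions(samples)
    symbols = []
    run_symbol, run_len = None, 0
    for prev, cur in zip(transitions, transitions[1:]):
        # One interval between transitions is half a period.
        freq = sample_rate / (2.0 * (cur - prev))
        nearest = min(range(len(freqs)), key=lambda k: abs(freqs[k] - freq))
        if abs(freqs[nearest] - freq) > tolerance * freqs[nearest]:
            run_symbol, run_len = None, 0  # not close to any known tone
            continue
        if nearest == run_symbol:
            run_len += 1
        else:
            run_symbol, run_len = nearest, 1
        if run_len == n:  # emit exactly once per run
            symbols.append(nearest)
    return symbols
```

Note that the interval is measured in whole samples, so the frequency resolution gets coarse as the tones approach the sample rate. With 16 closely spaced tones you would probably want to interpolate the threshold crossing or run at a higher sample rate.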
The more frequencies you use in your encoding, the closer together they sit for a given bandwidth, so the receiver needs to observe more cycles before it can tell them apart. I think this is why real-world FSK systems are usually BFSK.