link: https://github.com/tcrensink/chat_term
I took a brief look at your code, and it looks like run-of-the-mill Python and Bash with an integration locked to tmux.
Have you run benchmarks comparing how long it takes to show the full response to a prompt with the daemon and without it?
I'm a bit skeptical that loading that amount of code into memory takes that long, but I'm coming from a Node.js background.
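For the benchmark, something like this rough sketch is what I have in mind. It runs two commands several times each and compares the median wall-clock time. The two commands at the bottom are placeholders I made up to stand in for the "with daemon" and "without daemon" cases; they are not the project's actual CLI, so swap in the real invocations. It also only measures until the process exits, so it reflects "full response" time only if the command blocks until the response has been printed.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Return wall-clock durations (seconds) for `runs` executions of cmd."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal rendering does not skew the timing.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def compare(cmd_a, cmd_b, runs=5):
    """Return the median duration of each command and their difference."""
    median_a = statistics.median(time_command(cmd_a, runs))
    median_b = statistics.median(time_command(cmd_b, runs))
    return {"a": median_a, "b": median_b, "diff": median_a - median_b}


if __name__ == "__main__":
    # Placeholder commands (assumptions, not the real tool):
    # "cold" imports a few modules on startup, "warm" does nothing.
    cold = [sys.executable, "-c", "import json, http.client, ssl"]
    warm = [sys.executable, "-c", "pass"]
    print(compare(cold, warm))
```

The median is used rather than the mean so one slow run (disk cache miss, network hiccup) doesn't dominate. If the difference is small next to the network round trip to the model, the daemon probably isn't buying much.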