The paper mentioned was submitted in May. I think encoder-decoder architectures still make a lot of sense for multimodal models.
There is still a lot of low-hanging fruit. I have probably seen dozens of variations of chain-of-thought, tree-of-thoughts, graph-of-thoughts, self-ask, self-critique, self-plan, self-reflect, etc.
Is there a paper, or some other source, that synthesizes this grab-bag of techniques? As an interested outsider, it's hard to keep up with all the action. Would appreciate any pointers that try to bring some of the pieces together.
Thank you for that. I wonder if you'd be able to just post a link to the paper, or give its title? I didn't have a Meetup account, tried to create one, spent 4 minutes dicking with it, and still can't get it to let me in.