I’ve been building a semantic search over my own podcast transcripts using embeddings tonight. I’d love to have a content-surfacing mechanism like this for my own material!
Extending the search to support exact matches would also be super useful in this flow. For example, I've been on several podcasts talking about Notebook.ai, but searching for the name also matches "notebook", which makes the signal-to-noise ratio unusable (I see every podcast that says the word "notebook"). Likewise, it'd be great to quote-search for exact matches of "Andrew Brown" instead of seeing every podcast that mentions "Andrew" or "Brown".
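To make the exact-match idea concrete, here's a rough sketch of what I mean. It's not how any particular search product does it, just plain regex matching over a dict of transcripts. The `exact_matches` name and the `transcripts` shape (episode title mapped to transcript text) are made up for illustration.

```python
import re

def exact_matches(transcripts, phrase):
    """Return titles whose transcript contains `phrase` as a whole phrase.

    `transcripts` is a hypothetical dict of {episode_title: transcript_text}.
    Matching is case-insensitive, and the lookarounds stop the phrase from
    matching inside a longer word.
    """
    pattern = re.compile(
        r"(?<!\w)" + re.escape(phrase) + r"(?!\w)",
        flags=re.IGNORECASE,
    )
    return [title for title, text in transcripts.items() if pattern.search(text)]


# Toy data, invented for this example.
transcripts = {
    "ep1": "We talked about Notebook.ai today.",
    "ep2": "I wrote it down in my notebook.",
    "ep3": "Andrew Brown joined us for this one.",
    "ep4": "Andrew likes brown shoes.",
}

notebook_hits = exact_matches(transcripts, "Notebook.ai")
name_hits = exact_matches(transcripts, "Andrew Brown")
```

With the toy data, only the episode that literally says "Notebook.ai" comes back for the first query, and only the one with the adjacent words "Andrew Brown" for the second. In practice you'd probably run this as a filter on top of the embedding results rather than as a replacement for them.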
If I were a sponsor looking for a podcast, I would want my search process to look something like this:
- Search for a term relevant to my line of business
- See a list of podcasts ordered by the % of utterances containing my key phrase across their last N episodes
- See an annotation of how many listeners each podcast had over those last N episodes
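A minimal sketch of that ranking, assuming a data shape I made up: each podcast is a list of episodes (oldest first), and each episode has a list of `utterances` plus a `listeners` count. The `rank_podcasts` name is hypothetical, and summing per-episode listeners is just one way to read "how many listeners"; it would double-count repeat listeners.

```python
def rank_podcasts(podcasts, phrase, n):
    """Rank podcasts by % of utterances containing `phrase` in the last n episodes.

    `podcasts` is a hypothetical dict of
    {name: [{"utterances": [str, ...], "listeners": int}, ...]}.
    Returns a list of (name, percent_of_utterances, total_listeners) tuples,
    highest percentage first.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of episodes")
    needle = phrase.lower()
    rows = []
    for name, episodes in podcasts.items():
        recent = episodes[-n:]  # last n episodes (or fewer if the show is new)
        utterances = [u for ep in recent for u in ep["utterances"]]
        if not utterances:
            continue  # nothing to score
        hits = sum(needle in u.lower() for u in utterances)
        pct = 100.0 * hits / len(utterances)
        listeners = sum(ep["listeners"] for ep in recent)
        rows.append((name, pct, listeners))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Toy data, invented for this example.
podcasts = {
    "Show A": [
        {"utterances": ["we love databases", "hello"], "listeners": 100},
        {"utterances": ["databases again", "databases rock"], "listeners": 150},
    ],
    "Show B": [
        {"utterances": ["cooking show", "databases once", "bye", "ok"], "listeners": 900},
    ],
}

ranking = rank_podcasts(podcasts, "databases", n=2)
```

The substring check here is deliberately naive; swapping in an exact-phrase or embedding-based match per utterance would slot into the `hits` line without changing the rest.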