Your Daily AI Research tl;dr: 2022-08-31 🧠
Turn-Taking Prediction for Natural Conversational Speech, Feature Pyramid Diffusion for Complex Scene Image Synthesis and the tenth iteration of “This AI newsletter is all you need”!
Welcome to your official daily AI research tl;dr (often with code and news) for AI professionals, where I share the most exciting papers I find each day, along with a one-liner summary to help you quickly decide whether the paper (and code) is worth investigating.
1️⃣ Turn-Taking Prediction for Natural Conversational Speech
Streaming voice assistants are used in many applications, but they only handle one-way commands and basic, unnatural question/answer interactions well. As you know, they work pretty badly if you pause to think or accidentally repeat words. The authors present a turn-taking predictor built on top of an end-to-end (E2E) speech recognizer to help with fluent, “real” conversations.
Link to the paper: https://arxiv.org/pdf/2208.13321.pdf