This talk covers the application of a hybrid cascaded architecture for optimized wakeword detection, focusing on its implementation in Roku's Voice Remote Pro. It first introduces why accurate wakeword detection matters for hands-free operation, then discusses how the hybrid cascaded architecture addresses the main challenges: accuracy, low latency, low power consumption, noisy environments, and varied pronunciations. The hybrid approach, which combines edge and cloud models, is presented as a way to manage these challenges together. The cascaded architecture is a two-stage process in which a keyword spotter on the remote is followed by a cloud-based validation model, and the talk explains how this reduces false rejects while managing false accepts. The talk concludes by discussing the effectiveness of this approach and its successful application in Roku's Voice Remote Pro. A Q&A session follows for further discussion.
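The two-stage cascade described above can be illustrated with a minimal sketch. It is not Roku's implementation: the function names, the toy scoring functions, and both threshold values are hypothetical, chosen only to show how a permissive edge stage followed by a stricter cloud stage might trade off false rejects against false accepts.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A scorer maps an audio clip to a wakeword confidence in [0, 1].
Scorer = Callable[[Sequence[float]], float]


@dataclass
class CascadeResult:
    accepted: bool
    decided_by: str               # "edge" or "cloud"
    edge_score: float
    cloud_score: Optional[float]  # None when the cloud stage never ran


def cascaded_detect(
    audio: Sequence[float],
    edge_model: Scorer,
    cloud_model: Scorer,
    edge_threshold: float = 0.3,   # hypothetical: permissive, to limit false rejects
    cloud_threshold: float = 0.8,  # hypothetical: strict, to limit false accepts
) -> CascadeResult:
    """Run the cheap edge stage first; call the cloud stage only on candidates."""
    edge_score = edge_model(audio)
    if edge_score < edge_threshold:
        # Rejected on-device: no audio is sent and no cloud latency is paid.
        return CascadeResult(False, "edge", edge_score, None)
    cloud_score = cloud_model(audio)
    return CascadeResult(cloud_score >= cloud_threshold, "cloud", edge_score, cloud_score)


# Stand-in scorers (assumptions for illustration, not real wakeword models).
def toy_edge_model(audio: Sequence[float]) -> float:
    return min(1.0, max(abs(x) for x in audio))  # peak amplitude


def toy_cloud_model(audio: Sequence[float]) -> float:
    return min(1.0, sum(abs(x) for x in audio) / len(audio))  # mean amplitude


quiet = cascaded_detect([0.1, 0.05, 0.1], toy_edge_model, toy_cloud_model)
burst = cascaded_detect([0.9, 0.0, 0.0, 0.0], toy_edge_model, toy_cloud_model)
spoken = cascaded_detect([0.9, 0.85, 0.95, 0.9], toy_edge_model, toy_cloud_model)
print(quiet.decided_by, burst.accepted, spoken.accepted)
```

In this sketch the quiet clip is dropped on the device, the short noise burst passes the edge stage but is rejected by the cloud stage, and only the sustained clip is accepted by both stages.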
Frank Maker
Frank Maker is Director of Software at Roku, where his team is responsible for Voice, EdgeML, and Remote software. He is the engineering owner for remotes and develops innovative EdgeML models for Roku's new products.
In his role, Frank is responsible for:
* Embedded machine learning (TinyML / EdgeML)
* Microcontroller firmware development
* Embedded Linux firmware development (RokuOS)
* Embedded machine learning model development and deployment
* Automated EdgeML testing
* Wi-Fi remote development