A couple of weeks ago I published a tutorial explaining how to build an Arduino robot controlled by Amazon Echo (Alexa). It got thousands of views and dozens of respects, so thank you! For those who haven’t seen it yet, here’s the link: https://www.hackster.io/lukaszbudnik/voice-controlled-robot-ebbacf
The Echo is not as widespread as, say, an Android phone, which made it a little difficult for others to build replicas. Also, since I was already using Android for Bluetooth and cloud-to-phone communication, I thought about switching completely to Google so that it would be easier to build a complete solution or replica.

The Control Flow
Here is how all the pieces work together:
1. Launch the Android application.
2. Sign in to your Google account.
3. The device stores the phone’s settings in Firebase Database.
4. You talk to the Android phone.
5. The Android application:
5.1 Records your voice
5.2 Uploads the audio file to Firebase Storage
5.3 Invokes the Node.js application (hosted on Google App Engine), which streams the recorded audio directly to the Google Cloud Speech API
6. The Node.js application parses the transcripts returned by the Google Cloud Speech API and translates them into the proper digital messages. The messages are sent through the Google Cloud Messaging service.
7. The Android application receives an incoming remote message and relays it over Bluetooth to the Arduino.
8. The Arduino reads the messages through an HC-06 Bluetooth serial module and moves the robot accordingly.
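The step where the Node.js application turns a transcript into a message can be sketched roughly as follows. This is only an illustration: the command names, the keyword lists, and the payload shape (`to` and `data.command`) are assumptions, not the values used by the actual application.

```javascript
// Hypothetical command vocabulary: both the command names and the spoken
// keywords are assumptions made for this sketch.
const COMMANDS = {
  forward: ['forward', 'ahead'],
  backward: ['back', 'backward'],
  left: ['left'],
  right: ['right'],
  stop: ['stop', 'halt'],
};

// Splits the transcript into lowercase words (punctuation is dropped) and
// returns the first command, in table order, that has a keyword among them.
// Returns null when nothing matches.
function transcriptToCommand(transcript) {
  const words = transcript.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean);
  for (const [command, keywords] of Object.entries(COMMANDS)) {
    if (keywords.some((keyword) => words.includes(keyword))) {
      return command;
    }
  }
  return null;
}

// Wraps the command in a payload addressed to the phone's registration token.
// The payload shape is hypothetical; returns null for unrecognized speech.
function buildMessage(transcript, registrationToken) {
  const command = transcriptToCommand(transcript);
  return command === null ? null : { to: registrationToken, data: { command } };
}
```

Calling `buildMessage('turn left', token)` would yield a payload carrying the `left` command, which the phone could then relay over Bluetooth.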
Simple :)

Google Cloud Services
In this section I will show you how I extended my original application and integrated it with several additional Google services. The changes include:
- Google Authentication
- Google Firebase Database - for storing the phone’s settings. For example, in the previous Alexa Skill we had to copy the phone’s Firebase registration token and provide it to the Robot skill. The token is now stored in Firebase Database and handled automatically by the application.
- Google Cloud Speech API - for voice recognition
- Google Firebase Storage - for storing audio recorded by Android
- Google App Engine - hosts the new Node.js application, which fetches the phone’s settings from Firebase Database and then streams audio from Firebase Storage to the Cloud Speech API.
The application is complete. No changes are required. You only have to download google-services.json from the Google web console and you’re done (just like in the Alexa Skill version). Watch the video where I talk a little bit more about Google Services.

Google Cloud Speech API
In this section you will learn how to:
- use the Google Cloud Speech API Node.js client
- set it up for streaming
- use different languages and improve the accuracy of voice recognition
- stream audio and respond to voice recognition events
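As a rough sketch of the points above, the snippet below builds a streaming recognition request and wires up the recognition events. The field names follow the v1 Cloud Speech API (`languageCode`, `sampleRateHertz`, `speechContexts`); the client version used in this project may name them differently. The encoding, sample rate, and phrase hints are assumptions about how the phone records audio, and both function names are made up for this example.

```javascript
// Builds the request object for a streaming recognition call.
// phraseHints are words the recognizer should favour, which can improve
// accuracy for a small command vocabulary.
function buildStreamingRequest(languageCode, phraseHints = []) {
  return {
    config: {
      encoding: 'LINEAR16',        // assumed audio encoding
      sampleRateHertz: 16000,      // assumed sample rate
      languageCode,                // e.g. 'en-US' or 'pl-PL'
      speechContexts: phraseHints.length > 0 ? [{ phrases: phraseHints }] : [],
    },
    interimResults: false,         // only report final transcripts
  };
}

// Attaches handlers to a recognition stream (any EventEmitter-like object).
// Calls onTranscript with the top alternative of every result and forwards
// stream errors to onError.
function collectTranscripts(recognizeStream, onTranscript, onError) {
  recognizeStream.on('error', onError);
  recognizeStream.on('data', (response) => {
    for (const result of response.results || []) {
      const best = result.alternatives && result.alternatives[0];
      if (best) {
        onTranscript(best.transcript);
      }
    }
  });
  return recognizeStream;
}
```

With the current `@google-cloud/speech` client, a request like this is typically passed to `client.streamingRecognize(request)` and the audio stream is piped into the returned stream; treat that wiring as an assumption and check it against the client version you install.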
The Google Cloud Speech API service is in Beta and its Node.js client is in the Alpha stage, so keep in mind that some things may change.
Even in Beta, its completeness (68 languages supported out of the box) and quality are pretty impressive. While playing around with it I hit only one error: when streaming from Firebase Storage to the Cloud Speech API, the service complained that I was streaming too fast and asked me to slow down. Watch the video for more details.

Implementing new language support
The Google Cloud Speech API (while still in Beta) already provides support for 68 languages. Watch this video to learn how to add support for a new language to the Node.js application.

Arduino Robot
The Arduino part is exactly the same as in the Alexa Robot Skill version. No changes at all. Below is exactly the same video I used in my previous tutorial.

Show time
Here is my robot in action:

Summary
My robot understands both English and Polish. By now you should be able to add support for any other language (as long as the Google Cloud Speech API supports it, of course).
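One way to picture what adding a language involves is a vocabulary table keyed by language code, as in the minimal sketch below. The table layout, the command names, and the English and Polish keywords are illustrative assumptions, not the application’s actual data.

```javascript
// Hypothetical per-language vocabularies keyed by language code.
// Supporting another language means adding one more entry here.
const VOCABULARIES = {
  'en-US': {
    forward: ['forward'],
    backward: ['back', 'backward'],
    left: ['left'],
    right: ['right'],
    stop: ['stop'],
  },
  'pl-PL': {
    forward: ['naprzód', 'przodu'],
    backward: ['tyłu', 'wstecz'],
    left: ['lewo'],
    right: ['prawo'],
    stop: ['stop', 'stój'],
  },
};

// Looks up the vocabulary for the given language and returns the first
// command (in table order) with a keyword present in the transcript.
// Returns null when no keyword matches; throws for an unknown language.
function recognizeCommand(languageCode, transcript) {
  const vocabulary = VOCABULARIES[languageCode];
  if (!vocabulary) {
    throw new Error(`Unsupported language: ${languageCode}`);
  }
  const words = transcript
    .normalize('NFC')
    .toLowerCase()
    .split(/[^\p{L}]+/u) // split on anything that is not a letter
    .filter(Boolean);
  const match = Object.entries(vocabulary).find(([, keywords]) =>
    keywords.some((keyword) => words.includes(keyword))
  );
  return match ? match[0] : null;
}
```

The same language code would also have to be passed to the speech recognition request, so that the transcript arrives in the language the table expects.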
The whole solution is open-source and available on GitHub (see the code section). I’m open to all pull requests, so if you want to contribute, fire one my way.