My experience setting up the Google AIY Voice Kit
I received the kit as a Christmas present, and while the official instructions look very simple, there is a lot to figure out. Many things are incompatible, unnecessarily confusing, or don't work out of the box as you might expect. I'll try to summarize the problems I encountered here.
Voice recognition with the Voice Kit can be done in one of two ways:
- Using the local google-assistant-library for Python (can't be done with a Pi Zero W). This only offers English as the input language.
- Using the Google cloud service (requires a Google billing account; in the EU this cannot be done without a commercial account).
As a German citizen, this left me with only the first option, and I have no idea whether German will be supported by the library.
Many resources on this topic describe an earlier state of the project. Six months ago the folders were named differently, the standard image did work with a Pi Zero W, and more configuration was needed. All of this made googling for solutions harder than I had hoped.
The hardware
In addition to the voice kit, you will need the following things:
- a microSD card that will contain the operating system, and an SD card adapter to write data onto it
- a power supply that reliably outputs 2+ amps (some PSUs are labelled 2.5 A but don't deliver that much; I ended up buying this), otherwise you'll be stuck in a boot loop
- A Raspberry Pi 3B. Save yourself the time and DON'T pick a Zero W. Why?
  - You'll need a mini-HDMI adapter and a USB OTG cable for mouse/keyboard
  - The image that Google currently recommends comes with Python binaries that were compiled for ARMv7. The Zero W has an ARMv6 CPU, which cannot execute those binaries. You will get an Illegal Instruction error.
  - You can work around this issue by using an older version of the aiyprojects-raspbian image that contains binaries compatible with ARMv6. But there is no official way to download older versions. I used an image downloaded from some dude's Dropbox (from this tutorial), which I do not recommend (no offense, LukeK1990). That older version has a funny bug that makes the Pi not find SSIDs on channels 12 and 13. Guess what channel my wifi was on.
  - The Google Assistant library for Python is not available for ARMv6. Taken from one of the setup scripts:
    # The google-assistant-library is only available on ARMv7.
    if [[ "$(uname -m)" == "armv7l" ]] ; then
      env/bin/pip install google-assistant-library==0.0.3
    fi
  - ...just use a Raspberry Pi 3B.
- If you do the setup correctly, your device should show up on your wifi or ethernet network with ssh running. If not, you will need a mouse, keyboard and monitor (with HDMI or an adapter) to see what your Pi is doing.
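The ARMv7 check from the setup script can be mirrored in Python before you start installing anything. This is only a minimal sketch: the function name `supports_assistant_library` is made up for illustration, and it looks at nothing but the machine string, using the same `armv7l` value the setup script tests for.

```python
import platform


def supports_assistant_library(machine):
    """Return True if the machine string matches the one the setup script checks.

    Mirrors the shell test `[[ "$(uname -m)" == "armv7l" ]]`.
    """
    return machine == "armv7l"


if __name__ == "__main__":
    machine = platform.machine()  # same value as `uname -m` on Linux
    if supports_assistant_library(machine):
        print("{}: google-assistant-library can be installed".format(machine))
    else:
        print("{}: google-assistant-library is not available here".format(machine))
```

A Pi Zero W reports `armv6l`, so the check fails there.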
The Setup
Assemble the physical parts as described in the official instructions. Before the first boot, modify the image on your SD card for headless mode so you can access the Pi using ssh over the network.
Wifi
If you choose to use wifi to get your Pi onto your network, place a file called wpa_supplicant.conf in the root of the SD card's boot partition (the one that shows up when you plug the card into your computer). Put the following text inside it:
network={
ssid="your-wifi-ssid"
psk="your-wifi-password"
}
On boot, Raspbian will copy that file to /etc/wpa_supplicant/wpa_supplicant.conf, making the wlan0 interface connect to your wifi.
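If you prepare SD cards more than once, the file content can be generated instead of typed. The following is a rough sketch that produces the single network block described here; `render_wpa_supplicant` is a hypothetical helper name, and the length check is based on the 8 to 63 character range of WPA passphrases.

```python
def render_wpa_supplicant(ssid, psk):
    """Build the text of a wpa_supplicant.conf with a single network block."""
    # Keep the sketch simple: refuse values that could confuse the quoted strings.
    for value in (ssid, psk):
        if '"' in value or "\n" in value:
            raise ValueError("double quotes and newlines are not supported in this sketch")
    # WPA passphrases are between 8 and 63 characters long.
    if not 8 <= len(psk) <= 63:
        raise ValueError("the wifi password must be 8 to 63 characters long")
    return (
        "network={\n"
        '    ssid="' + ssid + '"\n'
        '    psk="' + psk + '"\n'
        "}\n"
    )
```

Write the returned text to a file named wpa_supplicant.conf on the card.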
ssh
Place an empty file called ssh (without a file extension) in the root of the SD card's boot partition. That makes sshd start on boot, which it doesn't do by default.
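Creating the marker file can be scripted as well. A small sketch, assuming you pass in the directory where your computer mounted the card; `enable_ssh` is an illustrative name, not part of any tool.

```python
from pathlib import Path


def enable_ssh(boot_dir):
    """Create the empty `ssh` marker file in the given directory."""
    marker = Path(boot_dir) / "ssh"
    marker.touch()  # the content is irrelevant; only the file's existence matters
    return marker
```

For example `enable_ssh("/media/you/boot")`, where the mount path is a placeholder for wherever the card shows up on your machine.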
Start it!
Put the microSD card in the slot, then plug in the power. The device should show up on your network after a minute. You can then ssh into it using the credentials pi/raspberry. It is recommended to change the password using passwd.
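To find out whether the Pi has come up without staring at your router's device list, you can probe the ssh port. This sketch only checks that a TCP connection can be opened; `ssh_reachable` is a made-up helper, and the host name or IP address you pass in depends on your network.

```python
import socket


def ssh_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name lookups.
        return False
```

Call it in a loop with the Pi's address until it returns True, then connect with ssh.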
If you want to access the desktop remotely, the easiest way is to enable the VNC service via Raspberry Pi Configuration in the Raspbian desktop start menu.
The confusing part
The Python project on the image is a clone of the git repo at https://github.com/google/aiyprojects-raspbian. That repo has several branches, each of which seems to serve some purpose. The different images of aiyprojects-raspbian come with clones of different branches, and the official instructions switch the version of the image without a changelog or any notice.
In the forums, user sheridat tries to clear up why the different branches exist. I am fine with just having a branch with a script that recognizes voice inputs (master). After all the shit I had to go through I literally forgot that this was the goal all along.
- The master branch
  - is deprecated
  - "contains code that was released with the initial release of the Voice Kit, implementing a voice recognizer with various different voice commands implemented in src/actions.py" (from the README.md)
  - contains a main.py that can be run as a service without any modifications to get the assistant up and running. Custom commands can easily be added by editing the action.py file.
  - is part of the May 3rd image in ~/voice-recognizer-raspi. This path name shows up hardcoded in script files in the other branches, but it makes no sense there.
- The voicekit branch
  - is also deprecated
  - the API for the assistant received some further development, so this branch is more up-to-date than master
  - contains example files that can be customized
  - is part of the November 9th image in ~/AIY-voice-kit-python. As mentioned before, the files in the scripts folder reference the (by default) nonexistent folder ~/voice-recognizer-raspi, making this hard to understand without having looked at the other branches.
- The aiyprojects branch
  - is the only non-deprecated one
  - came into existence because, in addition to the Voice Kit, there will also be a Vision Kit
  - contains the API and examples for both the Vision Kit and the Voice Kit
  - "The code for all AIY kits is in the aiyprojects branch, and is included in images starting with aiyprojects-2017-12-18.img" (from README.md)
  - aiyprojects-2017-12-18.img was never linked by the official instruction page
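The branch overview above can be condensed into a small lookup table, which is handy when you find an unfamiliar folder in your home directory. This is only a restatement of the list as data; the names `Branch`, `BRANCHES` and `branch_for_folder` are made up, and the folder for the aiyprojects branch is left empty because it is not named here.

```python
from collections import namedtuple

# image: the image that ships the branch; folder: the checkout folder, if known
Branch = namedtuple("Branch", ["name", "deprecated", "image", "folder"])

BRANCHES = [
    Branch("master", True, "May 3rd image", "~/voice-recognizer-raspi"),
    Branch("voicekit", True, "November 9th image", "~/AIY-voice-kit-python"),
    Branch("aiyprojects", False, "aiyprojects-2017-12-18.img and later", None),
]


def branch_for_folder(folder):
    """Look up which branch a checkout folder belongs to, or None if unknown."""
    for branch in BRANCHES:
        if branch.folder == folder:
            return branch
    return None
```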
In early January 2018 the aiyprojects team made the instructions point to an image tagged aiyprojects-2018-01-03. I have no idea what funny stuff we can expect from future images.
Some things I would have hoped for from the Google team:
- A changelog of what the different images do
- A way of downloading older versions of the image
- Anything that would make this "The confusing part" chapter obsolete
Using the correct branch
My suggestion is that you use the master branch. If it's not in your home folder already (depending on the image), clone it using
cd ~
git clone -b master https://github.com/google/aiyprojects-raspbian.git voice-recognizer-raspi
The project uses a virtualenv for execution, meaning that it has its own copy of the Python binaries. If you just cloned the project, the dependencies needed to run it have to be installed into that virtualenv. To do that, run:
cd ~/voice-recognizer-raspi/scripts
sudo ./install-deps.sh
That might take a while. You will then have a folder in your project called env. You should also install the services, so that the LED can be accessed:
cd ~/voice-recognizer-raspi/scripts
sudo ./install-services.sh
That script tries to install services to /etc/systemd/system so that you can control them with systemctl. If your image already has (some of) these services pre-installed, the script fails. In that case, read through the script and perform the remaining steps individually as needed.
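Before running the script, you can check which unit files are already present. This is a sketch under assumptions: `already_installed` is an illustrative helper, and the unit names you pass in have to come from the script itself, since they are not listed here.

```python
from pathlib import Path


def already_installed(unit_names, unit_dir="/etc/systemd/system"):
    """Return the unit names that already have a file in unit_dir."""
    directory = Path(unit_dir)
    return [name for name in unit_names if (directory / name).exists()]
```

For example `already_installed(["example-a.service", "example-b.service"])`, where both names are placeholders.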
You will need your /home/pi/assistant.json file, just like in the official instructions, so that the library has the necessary credentials.
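A truncated or empty credentials file is an easy mistake to make when copying it over. The sketch below only verifies that the file exists and parses as a JSON object; it does not know which keys the library expects, and `credentials_look_valid` is a hypothetical name.

```python
import json
from pathlib import Path


def credentials_look_valid(path):
    """Return True if the file exists and contains a JSON object."""
    try:
        with Path(path).open(encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Missing or unreadable file, or content that is not valid JSON.
        return False
    return isinstance(data, dict)
```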
React to "ok, google"
The install-deps.sh script copied a config file to ~/.config/. Edit that file so that the script listens for "ok google" once it runs:
nano ~/.config/voice-recognizer.ini
then change line 5 to
trigger = ok-google
then save the file.
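The same change can be made without opening an editor. The following sketch assumes the config file consists of plain `key = value` lines, as the line above suggests; `set_trigger` is a made-up helper that works on the file's text and leaves reading and writing the file to you.

```python
import re


def set_trigger(config_text, trigger="ok-google"):
    """Return config_text with its trigger line set to the given value.

    An existing `trigger = ...` line is replaced even if it is commented out
    with a leading `#`; if there is none, a new line is appended.
    """
    pattern = re.compile(r"^[ \t]*#?[ \t]*trigger[ \t]*=.*$", re.MULTILINE)
    new_line = "trigger = " + trigger
    if pattern.search(config_text):
        # count=1 so that only the first matching line is rewritten
        return pattern.sub(lambda _match: new_line, config_text, count=1)
    separator = "" if config_text.endswith("\n") or not config_text else "\n"
    return config_text + separator + new_line + "\n"
```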
Run the script
Tell bash to use the virtualenv with the source command, then run src/main.py:
cd ~/voice-recognizer-raspi/
source env/bin/activate
src/main.py
You should now have a looping script that reacts to the ok-google trigger. You can add custom commands in the action.py file.
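If main.py fails with import errors, one thing worth ruling out is that the virtualenv was not activated. A small sketch you can run with the same interpreter; `in_virtualenv` is an illustrative name.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; the venv module (and newer
    # virtualenv releases) make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
```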
This took an exhaustingly long time to set up. There are forum posts and GitHub issues all over the place, which is a strong contrast to how easy the official instructions look.
I hope this helps anyone.
My next plan is to add a custom hotword instead of "ok google", which has been proven to be doable. Then I will be able to start my computer using Wake-on-LAN with a custom trigger and custom command.
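The Wake-on-LAN part of that plan is simple to sketch in Python: a magic packet is six 0xFF bytes followed by the target's MAC address repeated sixteen times, sent as a UDP broadcast. The function names, the broadcast address and port 9 (a common choice) are assumptions of this sketch, and hooking it up to a voice command is not shown.

```python
import socket


def build_magic_packet(mac):
    """Build a Wake-on-LAN magic packet for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    # bytes.fromhex raises ValueError on non-hex input
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("not a valid MAC address: {!r}".format(mac))
    # Six 0xFF bytes followed by the MAC address repeated sixteen times
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```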
Troubleshooting
- If your Pi is restarting all the time, the power supply might be too weak. If you have a power bank, use it for tests, since power banks tend to deliver plenty of current.
- To scan for wifi networks manually, take a look here. The following command lists all networks the Pi can see:
sudo iwlist wlan0 scan | grep ESSID
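If you want the network names in a script rather than on the terminal, the scan output can be parsed. This sketch assumes the `ESSID:"name"` lines that the grep above filters for; `parse_essids` and `scan_essids` are hypothetical helper names.

```python
import re
import subprocess


def parse_essids(scan_output):
    """Extract network names from the text printed by `iwlist wlan0 scan`."""
    return re.findall(r'ESSID:"([^"]*)"', scan_output)


def scan_essids(interface="wlan0"):
    """Run the scan on the Pi and return the visible network names."""
    output = subprocess.check_output(
        ["sudo", "iwlist", interface, "scan"], universal_newlines=True
    )
    return parse_essids(output)
```

If your own network is missing from the result, the Pi cannot see it at all, which points to a scanning problem rather than a wrong password.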
Further resources
- https://www.androidauthority.com/google-voice-kit-review-774411/
- https://www.youtube.com/watch?v=ztVDaq4oPqA
- https://www.youtube.com/watch?v=ELorGnc9aSM
- https://www.raspberrypi.org/forums/viewforum.php?f=114