Turn your Raspberry Pi into a Translator with Speech Recognition and Playback (60+ languages)

Make Magazine and RadioShack picked up this project and turned it into a great step-by-step guide for their Weekend Project campaign. Check out the guide here, and the amazingly awesome video below:

I get many requests from people who are still looking for cheap, easy, and fun project ideas for their Raspberry Pis, so I wanted to share this translator project I’ve been working on. With very little effort, we can turn this $35 mini-computer into a feature-rich language translator that not only supports voice recognition and native speaker playback, but can also translate dynamically between thousands of language pairs, for free! Even if you are not interested in building this exact translation tool, many parts of this tutorial might still interest you (speech recognition, text to speech, Microsoft/Google translation APIs). Just like the rest of my posts, this one starts with our shopping list. Most of my readers will probably already have most of these items around the house:

Shopping List

QTY Required Items Price(USD)*
1 Raspberry Pi $35.00
1 Micro USB cable $5.49
1 Logitech USB Headset $28.53
1 SD Card (class 4 and 4 GB minimum) $13.10
Total: $82.12
Optional Items
1 Power Supply $9.95
1 HDMI Cable $2.28
1 Case $12.75

*There are definitely cheaper options available for USB headsets. I chose the Logitech because it is plug and play. For alternatives, check this list of verified Raspberry Pi supported sound cards.


This tutorial assumes your Raspberry Pi has:
-the latest version of Raspbian installed
-an internet connection
-the correct sound card drivers for your headset

Configuring and Testing Your Headset

Before we start writing any code, let’s make sure that we can record and play back sound using our USB headset. The easiest way to do this is with the built-in Linux commands ‘arecord’ and ‘aplay’. But first, let’s make sure our installed packages are up to date.

sudo apt-get update
sudo apt-get upgrade

Now, plug in your USB Headset and run the following commands

cat /proc/asound/cards
cat /proc/asound/modules

You should see that the Logitech headset is listed as card 1. Additionally, the second command should show that the driver for card 0 (the default Raspberry Pi output) is snd_bcm2835 and the driver for card 1 (our Logitech headset) is snd_usb_audio.

[Image: ALSA cards and modules listing, with the USB headset as card 1]

This is a problem because it shows that the Raspberry Pi defaults to sending sound through its built-in hardware and has no audio input device configured. To solve this, we need to update ALSA (Advanced Linux Sound Architecture) to use our headset as the default for audio input and output. This can be done with a quick change to the ALSA config file located at /etc/modprobe.d/alsa-base.conf:

sudo nano /etc/modprobe.d/alsa-base.conf

Near the end of this file, change the line that says

options snd-usb-audio index=-2

to

options snd-usb-audio index=0

Save and close the file and reboot the Raspberry Pi using the following command:

sudo reboot

After the system comes back online, the sound system should be reloaded so that when we rerun the above commands…

cat /proc/asound/cards
cat /proc/asound/modules

…we should see the USB Headset is now the default input/output device (card 0) as shown below.

[Image: ALSA cards and modules listing after the update, with the USB headset as card 0]
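To check the card order from a script rather than by eye, the listing can be parsed. This is only a minimal sketch: the function name `parse_asound_cards` and the sample text are illustrative, and the sample merely approximates what /proc/asound/cards prints on a Pi with a USB headset attached.

```python
import re

# Matches the first line of each card entry, for example:
#  0 [Headset        ]: USB-Audio - Logitech USB Headset
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[([^\]]*)\]:\s+(\S+)\s+-\s+(.*)$")

def parse_asound_cards(text):
    """Return a list of (index, id, driver, name) tuples, one per sound card."""
    cards = []
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            index, card_id, driver, name = match.groups()
            cards.append((int(index), card_id.strip(), driver, name.strip()))
    return cards

# Illustrative sample, not captured from a real device.
SAMPLE = """\
 0 [Headset        ]: USB-Audio - Logitech USB Headset
                      Logitech USB Headset at usb-bcm2708_usb-1.2, full speed
 1 [ALSA           ]: bcm2835 - bcm2835 ALSA
                      bcm2835 ALSA
"""

if __name__ == "__main__":
    for card in parse_asound_cards(SAMPLE):
        print(card)
```

On the Pi itself, the same function could be fed the real listing with `parse_asound_cards(open("/proc/asound/cards").read())`; the headset should then appear with index 0.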

We can now test this out by recording a 5 second clip from the microphone:

arecord -d 5 -r 48000 daveconroy.wav

and play it back through the headphone speakers:

aplay daveconroy.wav
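If you would rather drive this test from Python, the two commands can be wrapped with the standard subprocess module. This is a rough sketch; the helper names `record_cmd`, `play_cmd`, and `record_and_play` are made up for illustration, and it assumes arecord and aplay are on the PATH.

```python
import subprocess

def record_cmd(path, seconds=5, rate=48000):
    """Build the arecord command used above: fixed duration and sample rate."""
    return ["arecord", "-d", str(seconds), "-r", str(rate), path]

def play_cmd(path):
    """Build the matching aplay command."""
    return ["aplay", path]

def record_and_play(path="test.wav", seconds=5):
    """Record a clip, then play it back; returns True if both commands succeed."""
    if subprocess.call(record_cmd(path, seconds)) != 0:
        return False
    return subprocess.call(play_cmd(path)) == 0
```

Calling `record_and_play("daveconroy.wav")` should behave like the two commands above run back to back.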

To adjust the levels, you can use the built-in utility alsamixer. This tool handles both audio input and output levels.

sudo alsamixer

Now that our headset is configured, we can move on to the next step: converting speech to text.

Speech to Text or Speech Recognition with a Raspberry Pi

There are a few options for speech recognition on the Raspberry Pi, but I thought the best solution for this tutorial was to use Google’s Speech to Text service. This service allows us to upload the file we just recorded and convert it to text (which we will translate later).

Let’s create a shell script to handle this process for us.

sudo nano stt.sh

with the following contents

#!/bin/bash
# Record from the headset until Ctrl+C, encoding to FLAC on the fly
echo "Recording your Speech (Ctrl+C to Transcribe)"
arecord -D plughw:0,0 -q -f cd -t wav -d 0 -r 16000 | flac - -f --best --sample-rate 16000 -s -o daveconroy.flac;
# Upload the FLAC file and keep the 12th quote-delimited field of the response (the transcript)
echo "Converting Speech to Text..."
wget -q -U "Mozilla/5.0" --post-file daveconroy.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12  > stt.txt
echo "You Said:"
value=`cat stt.txt`
echo "$value"

Make it executable

sudo chmod +x stt.sh

The last step before we can run the script is to install the FLAC codec, which is not included in the standard Raspbian image.

sudo apt-get install flac

Now we can run the script:

./stt.sh

This will automatically start recording your voice, just press Ctrl+C when you are done speaking. At that point the script uploads the sound file to Google, they transcribe it and return it so it can be displayed on our screen. Pretty impressive for only a few lines of code! Sample output below:
[Image: example speech recognition output on the Raspberry Pi]
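The `cut -d\" -f12` step in the script works by splitting the response on double quotes and keeping the 12th piece. The sketch below mimics that in Python and compares it with parsing the response as JSON. The sample response is an assumption about the shape the service returns, not captured output, and both function names are made up for illustration.

```python
import json

# Assumed response shape; the real service output may differ.
SAMPLE_RESPONSE = (
    '{"status":0,"id":"abc123","hypotheses":'
    '[{"utterance":"hello world","confidence":0.92}]}'
)

def utterance_by_cut(response):
    """Mimic the shell's cut step: split on double quotes, take the 12th field."""
    fields = response.split('"')
    return fields[11] if len(fields) > 11 else ""

def utterance_by_json(response):
    """Parse the response as JSON and return the first hypothesis, or ''."""
    try:
        hypotheses = json.loads(response).get("hypotheses", [])
    except ValueError:
        return ""
    return hypotheses[0].get("utterance", "") if hypotheses else ""

if __name__ == "__main__":
    print(utterance_by_cut(SAMPLE_RESPONSE))
    print(utterance_by_json(SAMPLE_RESPONSE))
```

The field-counting approach is short but depends on the exact order of keys in the response; the JSON version keeps working if that order changes and returns an empty string when nothing was recognized.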

Microsoft Translation and Google Text to Speech

Now that we can record our voice and convert it into text, we need to translate it into our desired foreign language. I would love to be able to use Google’s Translate tool for this, but unfortunately there is a $20 sign-up fee for use of this API. I plan on purchasing this for myself, but I wanted to make this project free so everyone had an opportunity to try it.

As an alternative, we will be using Microsoft’s translation service, which is currently still free for public use. The list of supported languages and their corresponding codes can be found here. In our previous example we used a simple shell script, but for the translation and playback process, I’ve written a more powerful Python script.

All of this code can be found in my GitHub repository (contributions welcome!).

Let’s first create the file:

sudo nano PiTranslate.py

and add the following contents

import json
import requests
import urllib
import subprocess
import argparse

parser = argparse.ArgumentParser(description='This is a demo script by DaveConroy.com.')
parser.add_argument('-o', '--origin_language', help='Origin Language', required=True)
parser.add_argument('-d', '--destination_language', help='Destination Language', required=True)
parser.add_argument('-t', '--text_to_translate', help='Text to Translate', required=True)
args = parser.parse_args()

## show values ##
print("Origin: %s" % args.origin_language)
print("Destination: %s" % args.destination_language)
print("Text: %s" % args.text_to_translate)

# Keep the parsed values in plain names so the functions below can use them
origin_language = args.origin_language
destination_language = args.destination_language
text = args.text_to_translate

def speakOriginText(phrase):
    # Play the phrase in the origin language through Google's text-to-speech URL
    googleSpeechURL = "http://translate.google.com/translate_tts?tl=" + origin_language + "&q=" + phrase
    subprocess.call(["mplayer", googleSpeechURL], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

def speakDestinationText(phrase):
    # Play the phrase in the destination language
    googleSpeechURL = "http://translate.google.com/translate_tts?tl=" + destination_language + "&q=" + phrase
    print(googleSpeechURL)
    subprocess.call(["mplayer", googleSpeechURL], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Request an access token (named oauth_args so it does not overwrite the parsed args)
oauth_args = {
        'client_id': '',#your client id here
        'client_secret': '',#your azure secret here
        'scope': 'http://api.microsofttranslator.com',
        'grant_type': 'client_credentials'
}
oauth_url = 'https://datamarket.accesscontrol.windows.net/v2/OAuth2-13'
oauth_junk = json.loads(requests.post(oauth_url, data=urllib.urlencode(oauth_args)).content)

translation_args = {
        'text': text,
        'to': destination_language,
        'from': origin_language
}
headers = {'Authorization': 'Bearer ' + oauth_junk['access_token']}
translation_url = 'http://api.microsofttranslator.com/V2/Ajax.svc/Translate?'
translation_result = requests.get(translation_url + urllib.urlencode(translation_args), headers=headers)

# Assumption: the service returns the translation as a quoted string, possibly
# preceded by a byte-order mark, so strip both before speaking it
translation = translation_result.text.lstrip(u'\ufeff').strip('"')

speakOriginText('Translating ' + translation_args["text"])
speakDestinationText(translation)

For the script to run, we need to install a Python library and a media player.

sudo apt-get install python-pip mplayer
sudo pip install requests

The last thing we need to do before we can run the script is sign up for a Microsoft Azure Marketplace API key. To do so, simply visit the marketplace, register an application, and then enter your client id and secret passcode into the script above.
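If the client id or secret is left blank or mistyped, the token request will not return an access token, and the script stops with a bare KeyError. A small check like the one below gives a clearer message. This is a sketch only: `extract_access_token` is a hypothetical helper, and the `error_description` field follows the general OAuth 2.0 error format rather than anything confirmed for this particular service.

```python
import json

def extract_access_token(response_body):
    """Return the access token from an OAuth response body, or raise ValueError."""
    try:
        payload = json.loads(response_body)
    except ValueError:
        raise ValueError("OAuth response was not valid JSON")
    if not isinstance(payload, dict) or "access_token" not in payload:
        # Assumed error shape: {"error": "...", "error_description": "..."}
        detail = payload
        if isinstance(payload, dict):
            detail = payload.get("error_description", payload)
        raise ValueError("No access token returned: %s" % (detail,))
    return payload["access_token"]
```

In the script, this could stand in for the direct `oauth_junk['access_token']` lookup by passing it the body of the token response.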

Now we can run the script:

sudo python PiTranslate.py -o en -d es -t "hello my name is david conroy"

The script has 3 required inputs:
-o origin language
-d destination language
-t “text to translate”

hola me nombre david conroy

The above command starts in English and translates to Spanish. My favorite part about the whole tutorial is how quickly you can change between languages you are translating, and how the returned voice changes according to the destination language.

Putting it all Together

It is actually very easy to combine the two scripts we created in this tutorial. In fact, it only takes one line of code added to the bottom of the stt.sh shell script we created earlier (assuming PiTranslate.py and stt.sh are in the same directory).

sudo nano stt.sh

python PiTranslate.py -o en -d es -t "$value"

For those of you who skipped around in this tutorial, here is the entire script again with that line added:

#!/bin/bash
echo "Recording your Speech (Ctrl+C to Transcribe)"
arecord -D plughw:0,0 -f cd -t wav -d 0 -q -r 16000 | flac - -s -f --best --sample-rate 16000 -o daveconroy.flac;
echo "Converting Speech to Text..."
wget -q -U "Mozilla/5.0" --post-file daveconroy.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12  > stt.txt
echo "You Said:"
value=`cat stt.txt`
echo "$value"
#translate from English to Spanish and play over speakers
python PiTranslate.py -o en -d es -t "$value"

Now run the speech-to-text script again, and it will translate what you say from English to Spanish by default.

./stt.sh

Change your origin and destination languages in the last line as desired, and the PiTranslate.py script will do the rest! There are literally thousands of language pairs supported here.
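Switching pairs only means changing the -o and -d values, so it is easy to script. The sketch below builds the same command line for any pair and can run it for several destination languages in a row. The helper names are hypothetical, and it assumes PiTranslate.py sits in the current directory.

```python
import subprocess

def translate_cmd(text, origin="en", destination="es", script="PiTranslate.py"):
    """Build the PiTranslate.py command line for one language pair."""
    return ["python", script, "-o", origin, "-d", destination, "-t", text]

def translate_to_many(text, origin, destinations):
    """Run the translator once per destination language code."""
    for destination in destinations:
        subprocess.call(translate_cmd(text, origin, destination))
```

For example, `translate_to_many("good morning", "en", ["es", "fr", "de"])` would speak the phrase in each of the three languages, provided the services support those codes.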


Video Demo

I apologize that this video is a little shaky; it was difficult holding the headset to the phone while running the scripts.

Known Limitations and Additional Resources

Both the origin and destination languages have to be supported by Microsoft Translate and Google Translate in order for this script to work.

Language Codes:

Some special characters in certain languages will also cause trouble with the translation services, but I am working on a fix for that.
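One thing worth checking in the meantime is that PiTranslate.py pastes the phrase straight into the text-to-speech URL without encoding it. Whether that is the cause of the trouble is not established here, but percent-encoding the query string is a reasonable precaution. A minimal sketch, with `build_tts_url` as a hypothetical helper:

```python
try:
    from urllib import urlencode        # Python 2, which the tutorial targets
except ImportError:
    from urllib.parse import urlencode  # Python 3

TTS_ENDPOINT = "http://translate.google.com/translate_tts"

def build_tts_url(language, phrase):
    """Return the text-to-speech URL with the query string percent-encoded."""
    # A list of pairs keeps the parameters in a fixed order.
    return TTS_ENDPOINT + "?" + urlencode([("tl", language), ("q", phrase)])

if __name__ == "__main__":
    print(build_tts_url("es", "hola mundo"))
```

Under Python 2, a unicode phrase containing non-ASCII characters would likely need `phrase.encode("utf-8")` before being passed in.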


I really enjoyed working on this project, as it incorporates a wide range of technology and tools to create something immediately useful and fun to play with. Plus, it’s all free. If you have any questions at all regarding this project, just leave a comment below or on GitHub and I’d be happy to help you!


How to turn your Raspberry Pi into an FM Transmitter

For those who don’t know, the Raspberry Pi can transmit an FM signal directly. It’s a surprisingly powerful signal, too, and it’s very easy to do.

Following the guide on the Imperial College Robotics Society (ICRS) wiki, it took me less than 5 minutes to get the entire thing operational.

Step 1 – Download/Extract the Sample Code (GPL)

I am hosting a copy of their code here. (This archive contains the source and binary.)

wget http://www.daveconroy.com/SampleCode/Pifm.tar.gz
tar -zxvf Pifm.tar.gz

Step 2 – Attach the Antenna

Find an 8 inch piece of plain wire and attach it to the GPIO4 pin on your Pi. Technically this step is optional, but without the wire my transmission range dropped from 200 ft to 8 inches. Use the picture below as a reference.
[Image: Raspberry Pi FM transmitter with the antenna wire attached]

Step 3 – Run the Code

Usage: sudo ./pifm wavfile.wav [freq] [sample rate]

The second command line argument is the frequency to transmit on, as a number in MHz. For example, this will transmit on 100.1 FM:

sudo ./pifm sound.wav 100.1

You can use whatever frequency you’d like between 88 and 108 MHz.
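If you launch the transmitter from a script, it helps to check the frequency before calling the binary. The sketch below follows the usage line above; `pifm_cmd` is a hypothetical helper, and the 88 to 108 limits are simply the range quoted in this post.

```python
def pifm_cmd(wav_path, freq_mhz=100.1, binary="./pifm"):
    """Build the pifm command line, rejecting frequencies outside 88-108 MHz."""
    if not 88.0 <= freq_mhz <= 108.0:
        raise ValueError("frequency must be between 88 and 108 MHz, got %s" % freq_mhz)
    return ["sudo", binary, wav_path, str(freq_mhz)]

if __name__ == "__main__":
    print(" ".join(pifm_cmd("sound.wav", 100.1)))
```

The resulting list can be handed to `subprocess.call` to start the broadcast.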

That’s it! Here is a video of mine working.

How It Works
According to the ICRS, it uses the hardware on the raspberry pi that is actually meant to generate spread-spectrum clock signals on the GPIO pins to output FM Radio energy.

For more information, and a link to the actual C code, visit the ICRS wiki. I’m also happy to answer any questions you have regarding my setup. Thanks!
