Using Amazon Polly with Node-Red

Amazon Polly and Node-Red. This was far easier to get going than any Alexa skill I tried to do. I just came across this the other day on Scargill’s Tech blog.

“Amazon Polly is a service that turns text into lifelike speech. … With Amazon Polly, you only pay for the number of characters you convert to speech, and you can save and replay Amazon Polly’s generated speech.”

Basically you feed Polly text (plain or SSML) and it returns it in spoken form as an MP3 (or other format). Sweet, I can give my server/Node-Red a voice! Not sure my wife will like this but it will actually serve a valid purpose. I was admitted to the hospital (again) for a short three-day stay recently and while I was admitted (and loaded up on drugs) my server closet reached my maximum set temperature… and issued the text alert 187 times. Now I can have the server announce that the closet needs to be opened. I could add my wife’s number to the SMS list but that will fail, lol. Simply saving the MP3 on the server and having it played back works just great.

I am sure you can tap into the API many other ways, and use Polly many other ways, but the simplest way was my path. I just logged into the Amazon Polly Developer console and slapped some text in the input box. It spit out an MP3 and I saved it. The catch here is that the free tier of the Polly program only lasts 12 months, then you get charged. It’s super cheap though, $4 per 1 million characters. The free tier is 4 million per month (for 12 months). From what I have read, Polly will search the cloud to see if the file already exists. If it does, it pulls it down; if it doesn’t, it creates it (which counts against the character limit) and pulls it down. Is this for “the cloud” or your account? Dunno. Still pretty cheap.
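To put the pricing in perspective, here is a rough back-of-the-napkin calculation in JavaScript. It only uses the two figures quoted above ($4 per million characters, 4 million free characters per month); the function name and the 60-character alert phrase are made up for illustration, and Amazon's current pricing may differ.

```javascript
// Rates as quoted in this post; check Amazon's pricing page before relying on them.
const PRICE_PER_MILLION_CHARS = 4.0;   // USD
const FREE_CHARS_PER_MONTH = 4000000;  // free-tier allowance as quoted above

// Estimate one month's bill for a given number of converted characters.
function estimateMonthlyCost(chars, inFreeTier) {
  const billable = inFreeTier ? Math.max(0, chars - FREE_CHARS_PER_MONTH) : chars;
  return (billable / 1000000) * PRICE_PER_MILLION_CHARS;
}

// A hypothetical 60-character alert phrase converted 187 times, after the free tier ends.
const alertChars = 60 * 187;
console.log('USD:', estimateMonthlyCost(alertChars, false).toFixed(4));
```

Even if every one of those 187 alerts had been converted from scratch, the bill would come in under five cents.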

I have circumvented the whole thing (for now). I don’t really plan on (or have the need to) generating Polly files on the fly. What works for me is as I said, just slapping a ton of phrases I think I’ll need into Polly and saving the files for later. I’m cool with that.

But there is a node for NR that will generate Polly files for you on the fly, and/or pull down the ones already generated. I mentioned I found Polly on Scargill’s blog; he also used NR to play with Polly. He had to do it the hard way, since the node was only made three days ago. Definitely check out his blog even if you just use the node. It’s all good info.

Install node-red-contrib-polly-tts

https://flows.nodered.org/node/node-red-contrib-polly-tts
https://www.npmjs.com/package/node-red-contrib-polly-tts

npm install node-red-contrib-polly-tts

The node looks pretty simple to use (I have not used it). You need to add your Amazon Polly API credentials to the node. The input gets TTS’d and the file saved. If the text has already been converted, no Polly call is made and the saved file is served instead. So it looks like it stores all the Polly MP3s for you locally. Sweet.
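The "only call Polly if the file isn't already there" idea is easy to sketch in plain Node.js. To be clear, this is not the node's actual code, just my guess at the general pattern: hash the text into a file name, check the disk, and only synthesize on a miss. The `synthesize` function is a hypothetical stand-in for a real Polly call, and the voice name is only an example.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a stable file name from the voice and text, so the same phrase
// always maps to the same MP3.
function cachePathFor(cacheDir, voice, text) {
  const hash = crypto.createHash('sha1').update(voice + '|' + text).digest('hex');
  return path.join(cacheDir, hash + '.mp3');
}

// Return the path to the audio for `text`, calling `synthesize` only on a cache miss.
// `synthesize(text, voice)` is a hypothetical stand-in that must return a Buffer;
// in real use it would wrap a call to Polly.
function getSpeechFile(cacheDir, voice, text, synthesize) {
  const file = cachePathFor(cacheDir, voice, text);
  if (!fs.existsSync(file)) {
    fs.mkdirSync(cacheDir, { recursive: true });
    fs.writeFileSync(file, synthesize(text, voice));
  }
  return file;
}

// Demo with a fake synthesizer that counts how often it gets called.
let calls = 0;
const fakeSynthesize = (text) => { calls += 1; return Buffer.from('fake audio: ' + text); };
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'polly-cache-'));
getSpeechFile(demoDir, 'Joanna', 'Open the server closet', fakeSynthesize);
getSpeechFile(demoDir, 'Joanna', 'Open the server closet', fakeSynthesize);
console.log('synthesize calls:', calls);
```

Asking for the same phrase twice only hits the (fake) synthesizer once, which is the whole point: repeated alerts cost nothing after the first conversion.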

 

Custom Alexa Node-Red Skill

Finally! I am surprised it didn’t even take as long as it usually does for me to figure shit out. Took a few weeks. As usual, I did not come up with the solution on my own but found it out on the web and slapped it all together. I wanted a custom Alexa Node-Red skill, to be able to take a command given to Alexa and read back data from one of my sensors. Things like temperature sensors, water level, etc. I wanted to be able to ask Alexa what the values are. What I got: exactly what I wanted. It all works. There are two parts to this: the Node-Red flow and the Alexa skill.

Alexa Node-Red

First off, to get any of this working you must have your Node-Red server accessible from the outside world. That means port-forwarding, DNS, domains, SSL, all that. It’s fun getting it all working. Not. Just like my previous post, I happened to have it already set up. Once your Node-Red install is available from the web you are good to go. You don’t need the entire NR setup opened up either. I just allowed a few NR-served pages to be available, not the entire NR itself.

Update: I made a new post about Node-Red behind a reverse-proxy/SSL

Let’s Begin

It starts with a regular HTTP node feeding a switch node. That switch node splits up Alexa’s requests to NR: LaunchRequest, IntentRequest, SessionEndedRequest. LaunchRequest gets invoked when the skill starts. You could have Alexa say “Hello what do you want?” for example. IntentRequest is the goods. Then there’s SessionEndedRequest; I’m assuming this gets called at the end. Haven’t toyed with it. Then you pass those requests off to do other stuff, like the DoCommand where it grabs your intent. Then a function node extracts the commands, which get passed off to another switch node that splits up the possible commands you can give Alexa. Give her as many commands as you want; then there is a “device doesn’t exist” output at the bottom. This is used if she didn’t hear you right or the device doesn’t exist. All that data gets passed to a template that formats what Alexa will say and sticks the data in JSON. Bam! That wasn’t so hard, right?
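If the switch/function/template chain is hard to picture, the same routing can be sketched as one chunk of plain JavaScript, the sort of thing that could live in a single function node. This is not the flow from this post, just a minimal sketch of the idea. The request types and the response envelope follow Alexa's custom skill JSON format; the slot name `command` and the `readings` object are assumptions I made up for the example.

```javascript
// Wrap plain text in the JSON envelope an Alexa custom skill sends back.
function speak(text, endSession) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: endSession
    }
  };
}

// Route one Alexa request body, like the switch nodes in the flow.
// `readings` maps a spoken device name to its latest value (hypothetical data).
function handleAlexaRequest(body, readings) {
  const request = body.request || {};
  switch (request.type) {
    case 'LaunchRequest':
      return speak('Hello, what do you want?', false);
    case 'SessionEndedRequest':
      // The session is over, so no speech is sent back for this request type.
      return { version: '1.0', response: {} };
    case 'IntentRequest': {
      // Assumed: one intent with a slot named "command" holding the spoken phrase.
      const slots = (request.intent && request.intent.slots) || {};
      const command = ((slots.command && slots.command.value) || '').toLowerCase();
      if (Object.prototype.hasOwnProperty.call(readings, command)) {
        return speak('The ' + command + ' is ' + readings[command], true);
      }
      // The "device doesn't exist" branch at the bottom of the switch.
      return speak('Sorry, that device does not exist.', true);
    }
    default:
      return speak('Sorry, I did not understand that.', true);
  }
}

const demo = handleAlexaRequest(
  { request: { type: 'IntentRequest', intent: { name: 'DoCommand', slots: { command: { value: 'temperature' } } } } },
  { temperature: '72 degrees' }
);
console.log(JSON.stringify(demo));
```

In Node-Red the POST body from the HTTP node arrives in msg.payload, so a function node would presumably pass msg.payload in and put the result back on msg.payload before the HTTP response node.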

Here’s the flow (all standard nodes):

That’s the Node-Red half. You are not done yet. On to the Alexa skills half. This part is easy, don’t worry. Log in to your Amazon Dashboard and click Alexa. Choose “Get Started” with the Alexa Skills Kit, then click add a new skill. Under Skill Information give it a name and choose the invocation word, which is what you will say to Alexa to start your skill. I chose “Node Red”, so I have to say “Alexa, ask Node Red….”. These can be changed at any time, it seems. You won’t be publishing this skill; it stays beta for only you to use. For the Global Fields section, no, you will not be using an audio player. Well, maybe you will but I didn’t, and it will probably change things for you.

Note about the flow: The NR flow works (for me) just fine; however, I noticed it throws an error in the debug tab whenever a recognized command is called. The unrecognized-command response doesn’t throw the error. It complains about headers already being sent. I will update the flow if I find a fix for it.

Interaction Model

Intent Schema

This is the part of the Alexa skill where you tell it what to do. It is pretty straightforward. Just copy this to your “Intent Schema”. There are no custom slot types and no values to enter.

Sample Utterances

This is where you list the invocation phrases that will activate Alexa. Normally (and in other online tutorials for Alexa skills) this is where you add a ton of different phrases. But we are not. Node-Red is going to handle that side for us. This box just gets one line of text.

Configuration

Global Fields/Endpoint

For a service endpoint you are going to pick “HTTPS”. In a lot of other tutorials you will usually choose AWS Lambda but we are doing all of our own heavy lifting with NR. We don’t need no stinking Lambda. Choose your closest location and enter the URL that your Node-Red is accessible from (via the web remember). Say no to account linking and you can also leave Permissions alone.

SSL Certificate

Certificate for Endpoint

Choose the option that best describes you. Most likely it will be the first option. I am using a subdomain that is already SSL’d with Let’s Encrypt, so I chose the second option.

Test

Basically just leave the toggle flipped to enable the skill for you to use. You don’t need to do anything else on this page.

Publishing Information

Nothing to do here, you won’t be publishing this skill. Why? Because it requires too much setup on the user’s behalf. I don’t think Amazon would approve a half-functioning skill that requires advanced user setup to get working. You could always try. Good luck.

Privacy & Compliance

Three no’s and one box to check, I mean as long as it all applies to you right 😉

Done

That should be it. With Node-Red available to the web, the flow implemented, and the new Alexa skill you just made, you should be good to go. I hope you found this useful; I sure wish I had found a blog post like this. Now go test it out with your Amazon Echo/Dot!

At the time of this writing a beta product called the “Skill Builder” appeared in the Amazon Dashboard. It looks to be a new UI for building Alexa skills. If this gets rolled out to everyone in the future, things may be different than they are described in this blog post.

This is where I found the goodness, buried deep in comments on an (awesome) blog.

Alexa Skills

I purchased an Amazon Echo Dot a couple weeks back. In the past I had stated that I would never put one in my house, and generally had a disdain for voice-activated things. I had done a little bit of reading on the Dot and I have been getting further into home automation. I finally had a light bulb moment where I now understand all the coding. I am now dangerous. Haha.

Logically the next step was integrating Alexa with my custom-built devices. All of my devices work off an ESP or Raspberry Pi (or will). I found Fauxmo Wemo emulation, which is fantastic. That covers about 60% of my use, a simple off and on. But for things like my temperature sensors, hygrometers and water level sensors I need to be able to pull data and have Alexa read it back. Like “Alexa, ask server for the temperature.” Or “Alexa, ask the garden for an update.” All of my devices communicate via MQTT. I started with Node-Red without Alexa so this is the direction I went. I still think it was the best route to go. All I need is an Alexa skill to read back different inbound MQTT topics (for different devices). I can’t believe there isn’t a skill already out there to read back MQTT data or homebrew temperature sensors. Or is there and I haven’t found it yet? The majority of the available Alexa skills fucking suck. There are very few actually useful skills. Most of the ones I have seen are garbage: one-off skills you would use once or twice and then forget about. More Alexa joke skills than there need to be.
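The "read back the latest value from an MQTT topic" part is simpler than it sounds. Here is a minimal sketch in plain JavaScript of what I have in mind: remember the last payload seen on each topic, then turn a spoken device name into a sentence. The topic names and the spoken-name mapping are made up for illustration, and nothing here talks to a real broker; `remember` would be called from whatever receives the MQTT messages.

```javascript
// Latest payload seen on each MQTT topic.
const latest = new Map();

// Call this for every inbound MQTT message.
function remember(topic, payload) {
  latest.set(topic, { value: String(payload), at: Date.now() });
}

// Hypothetical mapping from what you say to Alexa to the topic that answers it.
const spokenToTopic = new Map([
  ['server temperature', 'sensors/server/temperature'],
  ['garden water level', 'sensors/garden/waterlevel']
]);

// Build the sentence Alexa should read back for a spoken device name.
function describe(spokenName) {
  const topic = spokenToTopic.get(spokenName);
  if (!topic) return 'I do not know a device called ' + spokenName + '.';
  const reading = latest.get(topic);
  if (!reading) return 'I have not heard from ' + spokenName + ' yet.';
  return 'The ' + spokenName + ' is ' + reading.value + '.';
}

remember('sensors/server/temperature', 81.5);
console.log(describe('server temperature'));
console.log(describe('garden water level'));
```

Keeping only the latest value per topic means Alexa answers instantly from memory instead of waiting for a sensor to publish again.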

So I started looking into making my own skill. This is what I have found to get started with creating a skill.

  • You need an Amazon AWS Developer account
  • You need access to Lambda
  • You must subscribe to basic free tier of service
  • You must subscribe to Amazon EC2, which is only free for one year.

So is this a catch with using custom skills and Alexa? Will my skills then start costing me money after 12 months? If I cancel after 12 months do I lose all my skills? I’m thinking yes; I will bet you Amazon is planning on cashing in on this later. Or is it already so late in the Echo game that I am just being needlessly paranoid?

I have not yet created a skill. JavaScript is not my forte and I haven’t sat down to dig into this yet.

 

Some info that I found hard to find:
https://forums.developer.amazon.com/articles/45945/where-do-i-find-the-alexa-skills-kit-trigger-event.html

Great walk-through on creating a new skill:
https://www.youtube.com/watch?v=zt9WdE5kR6g