Creating a Mycroft Skill I.: First Steps

I’ve been intrigued by voice-activated assistants like Amazon Alexa for quite some time. Unfortunately, the closed nature of these tools, along with accessibility issues in their associated development environments, has made me disinclined to write skills for them. I’ve recently begun hacking on Mycroft, however, and the experience has been both amazingly pleasant and incredibly quick.

What is Mycroft?

Mycroft is the world’s first open source assistant.

Mycroft runs anywhere – on a desktop computer, inside an automobile, or on a Raspberry Pi. This is open source software which can be freely remixed, extended, and improved. Mycroft may be used in anything from a science project to an enterprise software application. – The Mycroft Website

It took some work, some of it unintuitive, but I managed to get Mycroft running on my Raspberry Pi. With that accomplished, I began building my first skill. This series of posts documents that process.

What Will We Build?

I use public transit heavily, but dislike checking bus schedules. I wish I could simply ask a voice-operated assistant for stop details, departure times, etc. Likewise, I waste lots of time refreshing real-time transit feeds to track upcoming arrivals and departures.

What we’ll build, provided things go as I hope, is an assistant that can answer these questions and perform the following tasks:

  • “Give me information on stop 1068.”
  • “What is the next departure from stop 1068?”
  • “Which stop is on the south side of Woodrow and Anderson?”
  • “When does the next westbound 323 depart from stop 1068?”
  • “Notify me 10 minutes before the next 323 departs.”

I should also note that, until recently, I hadn’t worked with Python in any significant capacity for nearly two decades. If I’m doing something non-idiomatic, please let me know, and do be gentle. This project is as much about relearning Python as it is about building a Mycroft skill.

With that in mind, let’s get started!

Laying the Foundations

I began with Mycroft’s great guide to writing your first skill. I also checked out several existing skills, did some research into Python libraries, and came up with this project skeleton:

from datetime import datetime
from os.path import dirname, exists, join

from adapt.intent import IntentBuilder
from fuzzywuzzy import fuzz, process
from mycroft.skills.core import MycroftSkill, intent_handler
from mycroft.util.log import getLogger
import pygtfs
import requests

__author__ = 'Nolan Darilek'

# Logger: used for debug lines, like "LOGGER.debug(xyz)". These
# statements will show up in the command line when running Mycroft.
LOGGER = getLogger(__name__)

class GtfsSkill(MycroftSkill):

    def __init__(self):
        super(GtfsSkill, self).__init__(name="GtfsSkill")

    def initialize(self):
        super(GtfsSkill, self).initialize()

    def stop(self):
        pass

def create_skill():
    return GtfsSkill()

Loading GTFS Data

There is, unfortunately, not a good general-purpose transit API. Let’s start by loading a static GTFS feed. For bonus points, we’ll also do the following:

  • Make the GTFS URL configurable, both in local files and via the home.mycroft.ai web interface.
  • Add an intent that refreshes the transit feed. Saying something like “Hey Mycroft, refresh my transit data” should pull in the latest GTFS data.
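For anyone who hasn't met GTFS before: a static feed is just a zip archive of CSV text files, one of which is stops.txt. The skill leans on pygtfs for the real import, but a rough standard-library sketch shows the shape of the data. The sample feed and the "Example Stop" name below are made up so the snippet runs offline, and read_stops is a hypothetical helper, not something the skill uses.

```python
import csv
import io
import zipfile


def read_stops(feed):
    """Return {stop_id: stop_name} from the stops.txt inside a GTFS zip.

    `feed` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(feed) as archive:
        with archive.open("stops.txt") as raw:
            # utf-8-sig tolerates the byte-order mark some feeds include.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return {row["stop_id"]: row["stop_name"] for row in csv.DictReader(text)}


def make_sample_feed():
    """Build a tiny, made-up feed in memory (illustrative data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\n"
            "1068,Example Stop,0.0,0.0\n",
        )
    buffer.seek(0)
    return buffer


stops = read_stops(make_sample_feed())
print(stops["1068"])
```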

First, add this to settingsmeta.json:

{
  "identifier": "GtfsSkill",
  "name": "GTFS Transit",
  "skillMetadata": {
    "sections": [
      {
        "name": "Options",
        "fields": [
          {
            "type": "label",
            "label": "URL to the GTFS feed you wish to track"
          },
          {
            "name": "gtfsURL",
            "type": "text",
            "label": "GTFS feed URL",
            "value": ""
          }
        ]
      }
    ]
  }
}

Then, modify the class created above as follows:

    def __init__(self):
        super(GtfsSkill, self).__init__(name="GtfsSkill")
        self.gtfs_path = join(self.file_system.path, "gtfs.zip")

    @property
    def schedule(self):
        return pygtfs.Schedule(join(self.file_system.path, "gtfs.db"))

    def initialize(self):
        super(GtfsSkill, self).initialize()
        if not exists(self.gtfs_path) and self.config.get("gtfsURL"):
            self.refresh_gtfs(speak_messages=True)

    def refresh_gtfs(self, speak_messages=False):
        if speak_messages:
            self.speak_dialog("refreshing")
            self.speak_dialog("wait")
        response = requests.get(self.config.get("gtfsURL"), allow_redirects=True)
        # Don't save an HTTP error page as if it were a feed.
        response.raise_for_status()
        # The feed is a zip archive, so write it in binary mode.
        with self.file_system.open("gtfs.zip", "wb") as gtfs:
            gtfs.write(response.content)
        if speak_messages:
            self.speak_dialog("importing")
            self.speak_dialog("wait")
        pygtfs.overwrite_feed(self.schedule, self.gtfs_path)
        if speak_messages:
            self.speak_dialog("done")

    @intent_handler(IntentBuilder("RefreshIntent").require("RefreshKeyword"))
    def handle_refresh_intent(self, message):
        self.refresh_gtfs(speak_messages=True)

Create another file called vocab/en-us/RefreshKeyword.voc with the following content:

refresh
reload
download
retrieve

In dialog/en-us/done.dialog:

Import of transit data complete.
Import of transit feed complete.
Import of transit data finished.
Import of transit feed finished.
Import of transit data done.
Import of transit feed done.
Done importing transit data.

In dialog/en-us/importing.dialog:

Importing transit data. This may take a while.

In dialog/en-us/refreshing.dialog:

Refreshing transit data.
Updating transit data.
Refreshing transit details.
Updating transit details.
Refreshing transit feed.
Updating transit feed.

And, finally, dialog/en-us/wait.dialog:

Please wait.
Please be patient.
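Each .dialog file lists alternative phrasings, one per line. As far as I can tell, Mycroft speaks one of them chosen at random. The snippet below is only a sketch of that idea with wait.dialog inlined; pick_phrase is a hypothetical helper of mine, not Mycroft's API.

```python
import random

# Contents of dialog/en-us/wait.dialog, inlined so the sketch runs standalone.
WAIT_DIALOG = """Please wait.
Please be patient.
"""


def pick_phrase(dialog_text, rng=random):
    """Pick one non-empty line at random (an assumption about the behaviour)."""
    phrases = [line.strip() for line in dialog_text.splitlines() if line.strip()]
    return rng.choice(phrases)


print(pick_phrase(WAIT_DIALOG))
```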

There’s quite a bit going on here:

  • We store the feed’s location in self.gtfs_path, inside the skill’s local storage on the filesystem, with a little help from Mycroft’s FileSystemAccess module.
  • We declare the pygtfs schedule as a Python property, so every access builds a fresh Schedule, since it will be accessed from several threads.
  • A helper function downloads GTFS data, writes it to a local file, then imports it into a SQLite database.
  • Spoken feedback is provided at various stages in the process. The refresh function can also run silently if desired. I’ve added an assortment of phrases so the dialog is a bit more interesting and dynamic.

You’ll also notice code to handle our first intent. Specifically:

    @intent_handler(IntentBuilder("RefreshIntent").require("RefreshKeyword"))
    def handle_refresh_intent(self, message):
        self.refresh_gtfs(speak_messages=True)

This uses a slightly different format than is documented in Mycroft’s tutorial. The @intent_handler decorator makes associating intent handlers with keywords significantly more pleasant.

With this code in place, Mycroft can respond to commands like:

  • “Hey Mycroft, retrieve the latest GTFS data.”
  • “Hey Mycroft, I want you to download the latest transit feed updates.”
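To give a feel for why those sentences trigger the refresh, here is a toy model of decorator-based registration and keyword matching. It is emphatically not how Mycroft or Adapt are implemented; toy_intent_handler, HANDLERS, VOCAB and dispatch are all names I made up for illustration.

```python
HANDLERS = []  # (required vocabulary names, handler function) pairs

# Stand-in for vocab/en-us/RefreshKeyword.voc
VOCAB = {"RefreshKeyword": {"refresh", "reload", "download", "retrieve"}}


def toy_intent_handler(*required):
    """Register the decorated function for utterances that contain
    at least one word from every required vocabulary."""
    def decorator(func):
        HANDLERS.append((required, func))
        return func
    return decorator


@toy_intent_handler("RefreshKeyword")
def handle_refresh(utterance):
    return "refreshing"


def dispatch(utterance):
    """Call the first handler whose required keywords all appear."""
    words = {word.strip(".,?!").lower() for word in utterance.split()}
    for required, handler in HANDLERS:
        if all(words & VOCAB[name] for name in required):
            return handler(utterance)
    return None


print(dispatch("Retrieve the latest GTFS data."))
print(dispatch("I want you to download the latest transit feed updates."))
print(dispatch("What time is it?"))
```

Only the keyword matters in this model: the rest of the sentence is ignored, which is why such different phrasings land on the same handler.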

Running the Skill

On Linux, my desktop platform of choice, this was fairly straightforward. I can’t speak to how well it works elsewhere, though.

First, follow the git clone installation instructions. Once you’ve built the initial development environment, run the following from the repository clone:

$ ./start-mycroft.sh all
# ...
$ ./start-mycroft.sh skill_container <path/to/skill>

The skill is then launched in the foreground. To interact, run the following from another terminal:

$ ./start-mycroft.sh cli

This places you in a command line interface, where anything you type is treated like a spoken command to Mycroft.

What’s Next?

In the next post, we’ll:

  • Explore how matching works in more detail.
  • Match against regular expressions to build more complex queries.
  • Use fuzzy string matching to pair spoken input against written text.

If you’re impatient and ready to skip ahead, check out this repository, where I’m already providing stop descriptions in response to information requests. Stay tuned!

TMA, the tmux Automator

I’ve recently shifted my development workflow toward Neovim and tmux. This is a gradual process (I generally work at it for longer and longer each day), but I’m slowly noticing productivity gains that simply weren’t available with my previous tools. One pain point that I haven’t found a good solution for is automating tmux sessions, spinning up my editor and shells in repeatable ways.

I’ve looked at various projects, but none of them quite match what I do and don’t want from a tmux automator:

  • I don’t want to pull in a separate interpreted language. Ideally, I want a single binary I can scp anywhere I might need it.
  • I don’t want something that does more than automate tmux sessions. Don’t spin up an editor, don’t manage a directory of project files. Just spin up and shut down tmux sessions.
  • I don’t want project files managed in a separate place. Ideally I should have a single configuration file that, say, specifies my ideal development environment for a Rust/Elixir/Node project, then simply copy that file into a new project directory.
  • I don’t want a full programming language or templating system in my configuration files. If I don’t have a configuration file for a specific project type at hand, I should be able to bash one out from memory.
  • I want to specify as little as possible. An automation solution can probably infer the root directory for a project from where it is launched. Likewise, most of my projects are in uniquely-named directories, so it can infer the session name as well.

So over the course of a few days, I whipped up this little tmux automator. Written in Rust, it’s a single binary that runs anywhere (as long as the architectures match, of course). Configuration files are based on TOML and are incredibly easy to create from memory once you get the hang of it. As an example, here is the configuration I use when developing tma itself:

[[window]]
name = "code"

[[window.pane]]
command = "vim"

[[window.pane]]
command = "cargo watch"

This gives me a code window with two vertically-stacked panes, one running my editor while the other recompiles the project on code change. I almost don’t need the latter pane with Neovim’s Language Server integration, but I’ll keep it for now.
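To make the idea concrete, here is a rough Python sketch of the kind of tmux invocations a file like that could translate to. It is illustrative only and is not how tma (which is Rust) is actually written: tmux_commands is a hypothetical function, CONFIG is simply what a TOML parser would return for the file above, and the session name is inferred from the directory name as described earlier.

```python
import os
import shlex

# What a TOML parser would return for the configuration shown above.
CONFIG = {
    "window": [
        {"name": "code", "pane": [{"command": "vim"}, {"command": "cargo watch"}]},
    ]
}


def tmux_commands(config, project_dir):
    """Translate a config into tmux command lines (an assumed mapping)."""
    session = os.path.basename(os.path.abspath(project_dir))
    target = session + ":"  # current window / active pane of the session
    commands = []
    for w, window in enumerate(config["window"]):
        if w == 0:
            # The first window is created together with the detached session.
            commands.append(["tmux", "new-session", "-d", "-s", session,
                             "-n", window["name"], "-c", project_dir])
        else:
            commands.append(["tmux", "new-window", "-t", target,
                             "-n", window["name"], "-c", project_dir])
        for p, pane in enumerate(window.get("pane", [])):
            if p > 0:
                # -v stacks the new pane below the current one.
                commands.append(["tmux", "split-window", "-v",
                                 "-t", target, "-c", project_dir])
            # Type the pane's command into the active pane and press Enter.
            commands.append(["tmux", "send-keys", "-t", target,
                             pane["command"], "Enter"])
    return commands


for command in tmux_commands(CONFIG, "/home/me/tma"):
    print(" ".join(shlex.quote(part) for part in command))
```

The sketch only prints the commands; running them would need subprocess and a tmux installation.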

Most of my projects have been somewhat large in scope. This one, I’m glad to say, seems to be just about done. I need a few extra command line flags for better tmux integration, the ability to set session names from the command line, and configuration variables for running commands before new windows/panes are opened. Once those are in, I can probably focus exclusively on maintenance and move on to other things. Declaring something finished is a nice feeling. Sure, it may not work with every tmux version out there, but it works well enough for me, and I hope others find it useful as well.