Have you ever thought it would be neat to be able to control your browser from another application? Me too. In fact, I think we should be able to control all our applications from other applications.
In this post I’m going to go over a hodgepodge of pieces I put together to be able to play and pause YouTube videos in Firefox by just making an HTTP POST request.
So here is a rundown of the plan to make this happen. First, create a Firefox plugin. Have the plugin look at all the tabs being opened. If a tab is a YouTube tab, add it to a list.
Next, we’ll create a page-mod for our YouTube pages. A page-mod is a script that gets added to a page from your plugin. We’re going to use it to fire functions that find the video tag on the page and play or pause it.
Now we need a way to control it. To accomplish this we’ll need two components: a client and a server. Our server will be a Ruby Sinatra application. Its sole purpose in life is to be a queue.
The server will just take HTTP POST requests with a cmd parameter in the form data and put them in a queue. When an HTTP GET is requested on that queue, the server will pop the cmd and return it.
If there is nothing in the queue, the response from the server will just be null. If there is a command, we’ll send it from our page-worker to our main plugin which will in turn loop through all our YouTube tabs, and call the appropriate function we added to it with our page-mod.
Clear as mud, right? Ok, good! Let’s look at the parts individually.
The Firefox Plugin
Making a Firefox plugin is an interesting concept that definitely warrants its own blog post. Here we’re just going to go over the quick and dirty getting started concepts.
First, make sure you have Node.js installed and crack open a command prompt. In the prompt run npm install -g jpm. Use sudo if you’re a Mac user or Unix wizard. This installs the Firefox plugin toolkit.
From there, check out Mozilla’s great Getting Started page regarding the creation of a plugin. Use it to start your own.
Once you’ve created your plugin skeleton, open up the index.js file. This file’s task is to peek at the tabs as they are opened, check their URL for YouTube, add some magic script to each matching tab and push that tab into a list.
Below is the code for getting a sneak peek at those tabs and injecting some scripts into them.
In the first few lines of code you can see us pulling in some plugin SDK elements. The self var lets us get access to other files in our plugin. The tabs var lets us control the tabs within Firefox, and the URL var lets us do some work with URLs that would otherwise be tedious and annoying.
Next we set up an array to hold our YouTube tabs. Then, using the tabs var, we watch for new tabs that are ready to rock. Once they are, we use URL to check if they are YouTube tabs. If it’s not a YouTube tab, we don’t really care and move on.
If it is a YouTube tab, we attach a script to it, make a small object that keeps track of the tab and the script, then stuff it into our array for later use.
You can see the self var in use when we attach the script file to the tab. self.data.url lets us find the “url” of a file in our data directory.
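As a rough sketch of the tab-watching logic described above (not the original listing), something like the following would do the job. The names isYouTubeUrl and watchYouTubeTabs are made up for illustration, and tabs and scriptUrl are passed in as stand-ins for require("sdk/tabs") and self.data.url("youtube.js") so the logic can be tried outside Firefox. It also uses the standard URL class rather than the SDK's own URL helper.

```javascript
// Sketch only. In the real add-on, `tabs` would be require("sdk/tabs") and
// `scriptUrl` would be self.data.url("youtube.js"). Both are assumptions
// passed in as arguments so this runs outside Firefox too.

// Hypothetical helper: true when the host is youtube.com or a subdomain of it.
function isYouTubeUrl(rawUrl) {
  try {
    const host = new URL(rawUrl).hostname;
    return host === "youtube.com" || host.endsWith(".youtube.com");
  } catch (err) {
    return false; // not a parseable URL
  }
}

// Hypothetical helper: watches for ready tabs and keeps the YouTube ones.
function watchYouTubeTabs(tabs, scriptUrl) {
  const youTubeTabs = [];
  tabs.on("ready", function (tab) {
    if (!isYouTubeUrl(tab.url)) {
      return; // not a YouTube tab, move on
    }
    // Attach the content script and remember the tab with its worker.
    const worker = tab.attach({ contentScriptFile: scriptUrl });
    youTubeTabs.push({ tab: tab, worker: worker });
  });
  return youTubeTabs;
}
```

One thing this sketch does not handle is a tab firing "ready" more than once (for example after navigating to another video), which would add the same tab to the list again.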
So, what exactly is that youtube.js file that we’re attaching to our YouTube tabs? Well, first off it’s in a directory named data in our plugin skeleton. This folder might not exist and you may have to create it.
This file is code that will be injected into our tabs for immediate or later use. We’re going to set up a port between our index.js file and our YouTube tabs. Here is the code.
We have access to the self object in our injected file, and the port property lets us respond to and emit messages between files. So, any time we emit a message on the port in our main file, it’ll fire the matching function in this file.
I’ve created three messages the YouTube tab can respond to. In the toggle function you can see that we’re just finding the video tag and using normal HTML5 functions to play or pause the video depending on its current state.
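Here is a minimal sketch of what such a content script could look like. The message names "play" and "pause" are my assumption (only toggle is named above), and toggleVideo and registerHandlers are hypothetical helpers. The port and the video lookup are passed in so the logic can run outside a page.

```javascript
// Sketch only. In the injected youtube.js this would be wired up roughly as:
//   registerHandlers(self.port, function () {
//     return document.querySelector("video");
//   });

// Hypothetical helper: play a paused video, pause a playing one.
function toggleVideo(video) {
  if (!video) {
    return false; // no video tag on this page
  }
  if (video.paused) {
    video.play();
  } else {
    video.pause();
  }
  return true;
}

// Hypothetical helper: hook three message names up to the video.
// "play" and "pause" are assumed names; "toggle" comes from the post.
function registerHandlers(port, findVideo) {
  port.on("play", function () {
    const video = findVideo();
    if (video) { video.play(); }
  });
  port.on("pause", function () {
    const video = findVideo();
    if (video) { video.pause(); }
  });
  port.on("toggle", function () {
    toggleVideo(findVideo());
  });
}
```

The paused property and the play() and pause() methods are the standard HTML5 media element API, so nothing YouTube-specific is needed here.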
Now onto our server, which is just an over-engineered queue. This part is built with Ruby using Sinatra. It’s also set up to handle multiple queues, but we’re just taking advantage of one for now. I have bigger plans for this beast eventually.
Let’s just jump right into the code.
This looks like a lot, but it’s pretty straightforward. First we bring in our Sinatra and JSON gems. Then we set a Sinatra setting :queues to be a hash. This will house our queues. Everything is done in memory, as it’s all meant to be fast paced and short lived.
Next you’ll see the before call. This fires before every single request. In it we set up CORS, which stands for Cross-Origin Resource Sharing. If you want to get into the specifics of it, ask Ryan. Here is the web dev rundown: you just need to add a bunch of headers and return 200 on the OPTIONS method for requests.
If you don’t do this, your plugin, which doesn’t share an origin with the server, can’t connect to it, because by default the browser only allows requests to the same origin. The port, in this case, is considered part of that origin.
Now, on to the post '/queue/:qName' block. Here, we check if the key “cmd” was passed in the data. If it wasn’t, we toss a BAD REQUEST error and halt everything.
If we have a command, we get our queue by whatever name was passed and stuff the command string right in there. The ensure_queue functions are just there to reduce some clutter. They make sure the queue we’re looking for exists, and if it doesn’t, they create it.
The get '/queue/:qName' block is even simpler. Get the requested queue from our hash and if it’s empty return null as JSON. If it’s not, pop the command off the queue and return it as JSON. The pop function is an alias for deq so don’t worry, this is still a first-in, first-out queue.
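To make the queue behaviour concrete, here is a small in-memory model of the two routes. It is written in JavaScript to match the plugin code rather than in Ruby, so it is a model of the behaviour described above and not the Sinatra app itself. CommandQueues and its method names are hypothetical.

```javascript
// Sketch only: models the POST and GET routes as plain methods.
// This is NOT the Sinatra server, just the same rules in JavaScript.
class CommandQueues {
  constructor() {
    this.queues = new Map(); // queue name -> array of command strings
  }

  // Make sure the named queue exists, creating it if needed.
  ensureQueue(name) {
    if (!this.queues.has(name)) {
      this.queues.set(name, []);
    }
    return this.queues.get(name);
  }

  // Models POST /queue/:qName. Returns an HTTP-style status code.
  post(name, params) {
    if (params === null || typeof params !== "object" || !("cmd" in params)) {
      return 400; // BAD REQUEST: no cmd key in the form data
    }
    this.ensureQueue(name).push(params.cmd);
    return 200;
  }

  // Models GET /queue/:qName. Returns the JSON body as a string.
  get(name) {
    const queue = this.ensureQueue(name);
    // shift() takes from the front, so commands come out in arrival order.
    return JSON.stringify(queue.length === 0 ? null : queue.shift());
  }
}
```

For example, posting "toggle" and then "pause" to the firefox queue and reading three times would give back "toggle", then "pause", then null.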
The final connection will be made using a page worker. The page worker is a hidden page you can create using your plugin to do work for you. Here is the code used to create a worker page.
First, we need to pull in the page worker SDK, then we create a page by calling pageWorkers.Page and pass it some options. contentURL is an HTML page to go along with your script. I just put a tag in my data/worker.html file for this. With the contentScriptFile option, we provide an array of scripts to inject. The first file is a copy of jQuery, so we can use its $.ajax() function. The second is the script that does the actual work, which we’ll get to in a moment. The contentScriptWhen option tells the system when to fire the scripts you’ve injected. Here we’ve used the “ready” option, so when the page worker is ready, it’ll run.
Next we see the pageWorker.port.on('command') function being set up. Once our main script receives a “command” from our worker, it’ll loop through the YouTube tabs we stored earlier and pass the appropriate command along to them.
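The page worker setup and the command fan-out described above could be sketched roughly like this. The function names are hypothetical, jquery.js is an assumed filename for the jQuery copy, and pageWorkers and dataUrl are passed in as stand-ins for require("sdk/page-worker") and self.data.url.

```javascript
// Sketch only. In the real add-on, `pageWorkers` would be
// require("sdk/page-worker") and `dataUrl` would be self.data.url.

// Hypothetical helper: forward one command to every stored YouTube tab.
function dispatchCommand(youTubeTabs, cmd) {
  youTubeTabs.forEach(function (entry) {
    entry.worker.port.emit(cmd);
  });
  return youTubeTabs.length; // how many tabs were told
}

// Hypothetical helper: create the hidden page and listen for its commands.
function createPageWorker(pageWorkers, dataUrl, youTubeTabs) {
  const pageWorker = pageWorkers.Page({
    contentURL: dataUrl("worker.html"),
    // "jquery.js" is an assumed filename for the bundled jQuery copy.
    contentScriptFile: [dataUrl("jquery.js"), dataUrl("worker.js")],
    contentScriptWhen: "ready"
  });
  pageWorker.port.on("command", function (cmd) {
    dispatchCommand(youTubeTabs, cmd);
  });
  return pageWorker;
}
```

Each entry in youTubeTabs is assumed to be the small object stored earlier, with the worker returned when the script was attached to the tab.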
Let’s look at the worker.js file to see how we’re connecting to the server.
This is actually a very good use case for web sockets, but since this was just a proof of concept for me, I decided to keep it simple and just poll the server every 500ms.
In each poll, we look for commands on the firefox queue name and emit a command back to our main file. It doesn’t get much easier than that.
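A minimal sketch of that polling loop might look like the following. The helper names are hypothetical, and the server URL in the comment is an assumption (4567 is Sinatra's default port). The fetch step and the emit function are passed in so the loop itself can be tried without jQuery or Firefox.

```javascript
// Sketch only. In worker.js this could be started roughly as:
//   startPolling(function (done) {
//     // URL is an assumption: Sinatra's default port on localhost.
//     $.ajax({ url: "http://localhost:4567/queue/firefox",
//              dataType: "json", success: done });
//   }, self.port.emit.bind(self.port), 500);

// Hypothetical helper: ask for one command and pass it on if there is one.
function pollOnce(fetchCommand, emit) {
  fetchCommand(function (cmd) {
    // The server answers null when the queue is empty.
    if (cmd !== null && cmd !== undefined) {
      emit("command", cmd);
    }
  });
}

// Hypothetical helper: poll on a fixed interval; returns a stop function.
function startPolling(fetchCommand, emit, intervalMs) {
  const timer = setInterval(function () {
    pollOnce(fetchCommand, emit);
  }, intervalMs);
  return function stop() {
    clearInterval(timer);
  };
}
```

Because an empty queue comes back as null, most polls simply do nothing, which keeps the main script quiet until a real command shows up.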
How to make sure it works?
This is all fine and dandy, but how do you make sure any of it actually works? Well, first make sure your Sinatra app is running. Then in your command prompt under your plugin directory, kick off jpm run. This will open up a new instance of Firefox with your plugin installed and isolated for testing.
Next, you can open up your favorite motivational YouTube clip and fire a POST command to your server! Remember, nothing is impossible. Don’t let your dreams be dreams.
There are about a billion ways you can POST data to your server, but I prefer to use another plugin called REST Easy. You can set it up to fire any HTTP method you need with any data you wish. It’s very useful.
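If you would rather fire the request from a script, here is one hedged way to build it. buildCommandRequest is a hypothetical helper, the base URL assumes Sinatra's default port on localhost, and "toggle" is assumed to be the value the plugin expects in cmd.

```javascript
// Sketch only: builds a form-encoded POST for the queue server.
// Hypothetical helper; nothing here is sent until you call fetch yourself.
function buildCommandRequest(baseUrl, queueName, cmd) {
  return {
    url: baseUrl + "/queue/" + encodeURIComponent(queueName),
    options: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      // The server looks for a cmd key in the form data.
      body: new URLSearchParams({ cmd: cmd }).toString()
    }
  };
}

// Usage, assuming the server runs on Sinatra's default port:
//   const req = buildCommandRequest("http://localhost:4567", "firefox", "toggle");
//   fetch(req.url, req.options);
```

The fetch call in the usage comment works in a browser or in Node 18 and later.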
There are a lot of pieces to this puzzle, but they all come together quite simply into a very useful bit of functionality. I plan to eventually flesh this out with other commands and set it up to use web sockets instead of polling.
A tool like this could be used to allow other applications to manipulate Firefox from a distance dynamically. It’s not like Selenium or other web drivers, where you script what you want to do ahead of time.
I hope this idea helps to inspire other things that can be remote controlled, or fun ways to use this functionality to increase productivity.