Since we’ve had quite a few requests for the possibility to use Unified Remote with an Apple TV, I started looking into how this could be done. My first instinct was to look into the possibility of compiling the server for Apple TV. However, this proved to be too much of a hassle and would require users to jailbreak their Apple TV. All in all, not really an option.
This left me stuck without ideas until a colleague pointed out that Apple has its own remote control app, so maybe that could control the Apple TV. True enough, it had the option for remote controlling Apple TVs, and I started digging into how that worked.
What I found, using Wireshark for packet sniffing and a whole lot of Google Fu, was that it uses the same protocol as for remote controlling iTunes: a version of DAAP (Digital Audio Access Protocol) called DACP (Digital Audio Control Protocol). Apparently this protocol is far from well documented, so using Wireshark as well as some 5-10 year old Google Code Wiki articles I set out to investigate how the protocol worked.
A project called daap-client for Android had a very helpful article explaining some of the inner workings of the protocol. Basically it’s a very simple binary protocol over HTTP that uses tags to determine the data type of each packet. The packets look like this:
| Tag (4 bytes) | Length (4 bytes) | Data |
Now this leads to some interesting findings. First of all, there is the fun fact that Length is always present. A 32-bit integer, for instance, has Length 4 (just in case you happen to forget the length of an int), and sending a single byte costs 8 bytes of overhead in the form of a Tag and Length = 1.
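To make the layout concrete, here is a minimal Python sketch of packing and unpacking such packets. It assumes the Length field is an unsigned big-endian integer (the usual network byte order, which the description above does not spell out), and the tag "mstt" is only an illustrative example. The function names are my own and not taken from any existing library.

```python
import struct


def encode_item(tag: bytes, data: bytes) -> bytes:
    """Pack one item as Tag (4 bytes) + Length (4 bytes) + Data."""
    if len(tag) != 4:
        raise ValueError("tag must be exactly 4 bytes")
    # Assumption: Length is an unsigned 32-bit big-endian integer.
    return tag + struct.pack(">I", len(data)) + data


def decode_items(buf: bytes):
    """Yield (tag, data) pairs from a buffer of back-to-back items."""
    offset = 0
    while offset < len(buf):
        if offset + 8 > len(buf):
            raise ValueError("truncated header")
        tag = buf[offset:offset + 4]
        (length,) = struct.unpack_from(">I", buf, offset + 4)
        start, end = offset + 8, offset + 8 + length
        if end > len(buf):
            raise ValueError("truncated data")
        yield tag, buf[start:end]
        offset = end


# A 32-bit integer takes 12 bytes on the wire, a single byte takes 9.
int_item = encode_item(b"mstt", struct.pack(">I", 200))
byte_item = encode_item(b"mstt", b"\x01")
```

Note that the decoder hands back raw bytes only. Nothing in the packet itself says whether those bytes are a number, a string or something else.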
The other interesting thing is the question this raises: how do we interpret the data? Obviously it has something to do with the tag, but if you take a closer look at the tags listed in the article mentioned earlier, they aren’t direct definitions of the data type contained in the packet. Instead, there is a table in which one can look up the data type connected to a certain tag. However, the table in the article lacked quite a lot of the types used by more recent versions of iTunes.
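The table lookup could be sketched roughly like this in Python. The three entries in TAG_TYPES and the type labels ("int", "string", "container") are placeholders I picked for illustration, not the real table. The sketch also assumes big-endian integers, UTF-8 strings, and that some tags wrap further tagged items, none of which is confirmed above.

```python
# Hypothetical lookup table: tag -> type label (illustrative entries only).
TAG_TYPES = {
    b"mlog": "container",
    b"mstt": "int",
    b"minm": "string",
}


def decode_value(tag, data, tag_types=TAG_TYPES):
    """Interpret raw data according to the type registered for its tag."""
    kind = tag_types.get(tag)
    if kind == "int":
        return int.from_bytes(data, "big")  # assumption: big-endian
    if kind == "string":
        return data.decode("utf-8")  # assumption: UTF-8 text
    if kind == "container":
        return parse(data, tag_types)  # nested Tag/Length/Data items
    return data  # unknown tag: keep the raw bytes


def parse(buf, tag_types=TAG_TYPES):
    """Split a buffer into (tag, decoded value) pairs."""
    items = []
    offset = 0
    while offset + 8 <= len(buf):
        tag = buf[offset:offset + 4]
        length = int.from_bytes(buf[offset + 4:offset + 8], "big")
        data = buf[offset + 8:offset + 8 + length]
        items.append((tag.decode("ascii"), decode_value(tag, data, tag_types)))
        offset += 8 + length
    return items
```

Unknown tags fall through as raw bytes, which is exactly the situation described above: without a complete table, a decoder can split the stream into packets but cannot say what most of them mean.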
To figure out which tags correspond to which types I wrote a simple decoder based on the info I had. By pointing it at a computer running iTunes and fetching the URL /content-codes I managed to print out all content codes in a neat and orderly fashion. Based on this, a dictionary of each tag and its corresponding data type could be constructed.
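A rough Python sketch of that step follows. It is not the decoder I actually wrote, and it leans on several assumptions: that the response is an "mccr" container holding one "mdcl" entry per content code, that each entry carries the fields "mcnm" (the tag), "mcna" (a readable name) and "mcty" (a numeric type), and that iTunes listens on port 3689. Those names come from publicly circulated DAAP notes, so treat them as unverified. A real server may also expect extra headers or a pairing step that this sketch leaves out.

```python
import struct
from urllib.request import urlopen


def iter_items(buf):
    """Yield (tag, data) pairs from back-to-back Tag/Length/Data items."""
    offset = 0
    while offset + 8 <= len(buf):
        tag = buf[offset:offset + 4]
        (length,) = struct.unpack_from(">I", buf, offset + 4)
        yield tag, buf[offset + 8:offset + 8 + length]
        offset += 8 + length


def build_type_table(payload):
    """Map each tag to (name, numeric type) from a /content-codes payload."""
    table = {}
    for tag, body in iter_items(payload):
        if tag != b"mccr":  # assumed top-level response container
            continue
        for entry_tag, entry in iter_items(body):
            if entry_tag != b"mdcl":  # assumed per-code dictionary entry
                continue
            fields = dict(iter_items(entry))
            if b"mcnm" in fields and b"mcty" in fields:
                code = fields[b"mcnm"].decode("ascii", "replace")
                name = fields.get(b"mcna", b"").decode("utf-8", "replace")
                table[code] = (name, int.from_bytes(fields[b"mcty"], "big"))
    return table


def fetch_content_codes(host, port=3689):
    """Fetch the raw /content-codes response (port 3689 is an assumption)."""
    with urlopen(f"http://{host}:{port}/content-codes", timeout=5) as resp:
        return resp.read()
```

With a reachable iTunes instance, something like build_type_table(fetch_content_codes("192.168.1.10")) would then produce the dictionary, where the address is just a placeholder for your own machine.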
To conclude, we have made some progress towards integrating Apple TV support into the server, but now comes the really challenging part: actually sending data, keeping a session and finding out how the protocol actually works. Stay tuned for more!