
Toying with iChat's AV over XMPP, part one

I added fake caps and features to Z-XMPP to simulate iChat's AV support. Here is what I added:

this.features.push("apple:profile:bundle-transfer");
this.features.push("apple:profile:efh-transfer");
this.features.push("apple:profile:transfer-extensions:rsrcfork");
this.features.push("http://www.apple.com/xmpp/message-attachments");
this.featuresExt["ice"] = ["apple:iq:vc:ice"];
this.featuresExt["recauth"] = ["apple:iq:vc:recauth"];
this.featuresExt["rdserver"] = ["apple:iq:rd:server"];
this.featuresExt["maudio"] = ["apple:iq:vc:multiaudio"];
this.featuresExt["audio"] = ["apple:iq:vc:audio"];
this.featuresExt["rdclient"] = ["apple:iq:rd:client"];
this.featuresExt["mvideo"] = ["apple:iq:vc:multivideo"];
this.featuresExt["auxvideo"] = ["apple:iq:vc:auxvideo"];
this.featuresExt["rdmuxing"] = ["apple:iq:rd:muxing"];
this.featuresExt["avcap"] = ["apple:iq:vc:capable"];
this.featuresExt["avavail"] = ["apple:iq:vc:available"];
this.featuresExt["video"] = ["apple:iq:vc:video"];
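
For context, here is a rough sketch of how lists like these could be exposed through the legacy form of entity capabilities (the older XEP-0115 style with an ext attribute), assuming that is what iChat relies on. The caps object, the extAttribute and featuresForNode helpers, and the example.org node URI are all made up for illustration; this is not Z-XMPP's actual code.

```javascript
// Hypothetical capability store shaped like the snippet above (not Z-XMPP's real internals).
const caps = {
  features: ["apple:profile:bundle-transfer"],
  featuresExt: {
    ice: ["apple:iq:vc:ice"],
    audio: ["apple:iq:vc:audio"],
    avcap: ["apple:iq:vc:capable"],
  },
};

// Value for the legacy caps "ext" attribute: space-separated bundle names.
function extAttribute(store) {
  return Object.keys(store.featuresExt).join(" ");
}

// Features to report for a disco#info query.
// - no node (or no "#"): everything we claim to support
// - "<node>#<bundle>": only that bundle's features
// - any other suffix: assumed to be the "ver" string, so the base features
function featuresForNode(store, queriedNode) {
  const all = Object.values(store.featuresExt).reduce(
    (acc, list) => acc.concat(list),
    store.features.slice()
  );
  if (!queriedNode || !queriedNode.includes("#")) return all;
  const suffix = queriedNode.slice(queriedNode.indexOf("#") + 1);
  if (Object.prototype.hasOwnProperty.call(store.featuresExt, suffix)) {
    return store.featuresExt[suffix].slice();
  }
  return store.features.slice();
}

console.log(extAttribute(caps));
console.log(featuresForNode(caps, "http://example.org/client#ice"));
```

The point of the split is that the bundle names travel in presence, while the feature lists behind them are only fetched on demand.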

And here's what I get when I press call:

0
62706c6973743030d501020304050607081a1a5b564353657373696f6e4944595643494345446174615e5643496e7669746565734c6973745f101156435365637572697479456e61626c65645e56434f72646572497346696e616c121c3b4d784f11020b040b73747265616d747970656481e803840140848484134e534d757461626c6544696374696f6e6172790084840c4e5344696374696f6e617279008484084e534f626a65637400858401690392848484084e53537472696e67019584012b11636f6e6e656374696f6e2068656c7065728692849396019284979807436f6d6d4e41548692848484084e534e756d626572008484074e5356616c7565009584012a849696008686928497980776657273696f6e8692849b9b9d96028692849798075349502d4943458692848484064e534461746100959681280184065b323936635d00000000000000010000000115f9ddce00000000002d000000000000656e3100000000000000000000000000c0a8018400000000d69a20fffe63f7424012000000000000656e3100000000000000000000000000c0a8018400000000d69a20fffe63f7424012000000000000656e317e0000000000000000000000003f57fe7bffffffff2965df00019c08bd40120000000000000000000000000003000000012b8105b7005d2440004600000000000065787465726e616c00030000000000005d8b25d300000000bea594be2d8d3240c326000000000000656e3100000000000000000000000000c0a8018400000000d69a20fffe63f7424012000000000000656e317e0000000000000000000000003f57fe7bffffffff2965df00019c08bd4012000080681f408686a2091cdb0a0b0c0d0e0f101112131415161718161518151a1b1556726561736f6e59696e76697465644279597663506172747949445c73656e64696e67566964656f5c70726573656e74697479494457617264526f6c655c73656e64696e67417564696f557374617465587573696e674943455e70726573656e746974794e616d65556572726f7210005f101e69767563696361407468652d6576696c2d6d6163626f6f6b2e6c6f63616c5f101f69767563696361407468652d6576696c2d6d6163626f6f6b2e6c6f63616c310909086b004900760061006e002000560075010d006900630061db0a0b0c0d0e0f101112131415161d181f1518211823155f101e706572696361407468652d6576696c2d6d6163626f6f6b2e6c6f63616c32095f101d706572696361407468652d6576696c2d6d6163626f6f6b2e6c6f63616c0910020956706572696361080800080013001f00290038004c005b0060026f027202890290029a02a402b102be02c602d302d902e202f102f7
02f9031a033c033d033e033f0356036d038e038f03af03b003b203b303ba03bb00000000000002010000000000000026000000000000000000000000000003bc
4

The long thing is a hexadecimal (yes, hex -- strangely not base64) string which, when decoded, is pretty much binary looking. There are a few things that seem to be strings. Let's try to find something readable in the string:

(removed since the binary text included some characters that confused various XML decoders; use a hex to string converter such as this one)

This mentions the caller and the receiver of the call. We can also see several interesting strings such as NSMutableDictionary, NSDictionary and NSObject; that is the inheritance path for NSMutableDictionary in Objective-C. This invitation looks like something serialized in Objective-C into some binary format, then converted into hex, inserted in XMPP XML, and sent over the wire. In fact... "bplist"... that sounds oddly similar to b-plist... binary plist? That could probably be easily decoded with an Objective-C based XMPP client!
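
The readable fragments can be pulled out of the hex string with a few lines of JavaScript. This is only a sketch: hexToBytes and printableRuns are names I made up, and the sample is just the first 31 bytes of the payload quoted above.

```javascript
// Convert a hex string to bytes; assumes an even number of hex digits.
function hexToBytes(hex) {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Collect runs of printable ASCII that are at least minLength characters long.
function printableRuns(bytes, minLength = 4) {
  const runs = [];
  let current = "";
  for (const b of bytes) {
    if (b >= 0x20 && b <= 0x7e) {
      current += String.fromCharCode(b);
    } else {
      if (current.length >= minLength) runs.push(current);
      current = "";
    }
  }
  if (current.length >= minLength) runs.push(current);
  return runs;
}

// First bytes of the payload quoted above.
const sample = "62706c6973743030d501020304050607081a1a5b564353657373696f6e4944";
console.log(printableRuns(hexToBytes(sample)));
// prints [ 'bplist00', '[VCSessionID' ]
```

The stray "[" in front of VCSessionID is not noise: in a binary plist, byte 0x5b marks an ASCII string of length 11, and it happens to be printable.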

Not only that: decoding it with, for example, Python (import binascii; binascii.unhexlify(data)) and writing the result to a file with a .plist extension lets us open it in Apple's Property List Editor. That uncovers VCICEData (a large binary blob - "data"), VCInviteesList (an array), VCOrderIsFinal (a boolean set to false), VCSecurityEnabled (a boolean set to false), and VCSessionID (a "number").
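
Before handing the decoded bytes to an editor, it is cheap to sanity-check that they really are a binary plist. The sketch below checks the "bplist00" magic and reads the fixed 32-byte trailer; hexToBytes and inspectBplist are hypothetical helper names, and the sample keeps only the header and trailer of the payload above (objects and offset table left out).

```javascript
// Convert a hex string to bytes; assumes an even number of hex digits.
function hexToBytes(hex) {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Check the magic and decode the trailer (the last 32 bytes, big-endian).
function inspectBplist(bytes) {
  if (bytes.length < 40) throw new Error("too short for a binary plist");
  const magic = String.fromCharCode(...bytes.subarray(0, 8));
  if (magic !== "bplist00") throw new Error("not a binary plist: " + magic);
  const view = new DataView(bytes.buffer, bytes.byteOffset + bytes.length - 32, 32);
  return {
    offsetIntSize: view.getUint8(6),   // bytes per offset table entry
    objectRefSize: view.getUint8(7),   // bytes per object reference
    numObjects: Number(view.getBigUint64(8)),
    topObject: Number(view.getBigUint64(16)),
    offsetTableOffset: Number(view.getBigUint64(24)),
  };
}

// Header plus trailer taken from the payload above.
const headerAndTrailer =
  "62706c6973743030" + // "bplist00"
  "000000000000" +     // 6 unused bytes
  "02" + "01" +        // offsetIntSize, objectRefSize
  "0000000000000026" + // numObjects
  "0000000000000000" + // topObject
  "00000000000003bc";  // offsetTableOffset

console.log(inspectBplist(hexToBytes(headerAndTrailer)));
```

For this invitation the trailer says 38 objects, with the root object at index 0 and the offset table at byte 956. In Node, passing the full byte array to fs.writeFileSync with a .plist file name would produce the file to open in Property List Editor.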

VCInviteesList members are two dictionaries. In the dictionaries, we have these items: ardRole (number 0), error (number 0), invitedBy (string with caller jid), presentityID (string with caller or callee jid), presentityName (string with caller's or callee's name), sendingAudio (boolean set to true), sendingVideo (boolean set to false), state (0 with caller, 2 with callee), usingICE (boolean set to false with caller, true with callee), and finally vcPartyID (string with entity's jid suffixed by 1 or 2, depending on place in array).

Surprisingly, iChat completely ignores an iq-error stanza sent to signal that the feature needed to parse the iq-query is not supported.

Googling for AVChatConferenceData surprisingly uncovers nothing. Could it really be that no one has yet tried to discover how iChat's AV works in combination with XMPP?

Oh, and canceling the call goes like this: 1

I'm not sure I'll keep pursuing this, especially since I didn't write a client to play with. I'm not keen on studying Telepathy, Gabble, Sofia-SIP and Farsight, the closest thing one could find to an open source stack that could be used to implement iChat's SIP-based AV. I don't exactly have the time required to write a full XMPP client for desktops or iOS just to BEGIN trying to figure out iChat's AV -- maybe I'll be crazy enough some time in the future :-)

Have fun!



Which XMPP technologies to use?

I've been writing a piece of software in order to perform a major upgrade of ZATEMAS, a specialized web app suite I'm a co-author of. Since it has a small community of users who might interact better if they had an opportunity to do so, I came up with the idea of providing them with an instant messaging client built right into the browser, similar to Facebook Chat, Meebo Bar and Gmail Chat. I've picked XMPP as the protocol; better known as Jabber, this is the same protocol Google Talk uses.

Examining various options took a while. I've looked at Meebo Bar, and concluded that it doesn't fit my use case, since I want users to be automatically logged in on a local XMPP server.

I've examined several clients. In the process, I've learned a bit about BOSH (the standardized method for using XMPP over HTTP), about Apache's mod_redirect, and a bit about the rules for doing cross-domain XMLHttpRequests in JavaScript.

The only serious contender for the throne of the best open easily deployable web-based XMPP client is JWChat. This is a venerable old client which creates a popup and behaves much like a typical desktop IM client. This means it was unsuitable; there is no easy way to embed it in a page, yet preserve its state upon page switching.

This is actually the biggest problem I came upon. There seems to be no XMPP client or library written in JavaScript that can trivially handle page switching (which means destroying the entire JavaScript context); you cannot trivially serialize their state. Which is why I'm proud to announce a custom-made client, Z-XMPP. It's already available for preview on https://bitbucket.org/ivucica/zxmpp, and will soon be available for preview on ZATEMAS. Its license is currently a custom, hand-written one, and it most definitely does not satisfy any open source/free software definition. That will change as soon as I pick the right license.
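
To make the page-switching problem concrete, here is a minimal sketch of the kind of serialization I mean, not Z-XMPP's actual code. The field names (jid, sid, rid, roster), the "xmpp-state" key and all three helper functions are assumptions for illustration; sid and rid are the BOSH session ID and request ID that the next page would need in order to continue the same session.

```javascript
// Write the client state as JSON into anything with a setItem method.
function saveState(storage, state) {
  storage.setItem("xmpp-state", JSON.stringify(state));
}

// Read it back on the next page; returns null when nothing was saved.
function restoreState(storage) {
  const raw = storage.getItem("xmpp-state");
  return raw === null ? null : JSON.parse(raw);
}

// In-memory stand-in with the same getItem/setItem shape as sessionStorage,
// so the sketch also runs outside a browser.
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => { items.set(key, String(value)); },
  };
}

const storage = createMemoryStorage();
saveState(storage, {
  jid: "perica@example.org",       // hypothetical account
  sid: "abc123",                   // hypothetical BOSH session ID
  rid: 1573741821,                 // next request ID to use
  roster: ["ivucica@example.org"],
});
const restored = restoreState(storage);
console.log(restored.sid, restored.rid);
```

In a browser, passing window.sessionStorage instead of the in-memory object would keep the state across page loads within the same tab. The hard part is that everything the client holds has to be plain data for this to work.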

That brings us to the other part of the XMPP stack. Which server to pick? Which BOSH connection manager to pick?

There are really two contestants in my arena when it comes to a server. The first one is a veteran of XMPP, ejabberd. ejabberd is written in Erlang, and is massively scalable. It's trivial to install and configure in Debian, and it supports a lot of cool features out of the box. It supports something called "shared rosters", which basically means you can create groups in people's contact lists that you, as the admin, can force to contain whomever you want. You can, for example, force people to see everyone else working in the company. This is a critical feature for ZATEMAS, just as so-called "external authentication" is. What I'm missing here: external auth does not support fetching any attribute apart from basic operations with passwords (is that correct? If so, please change it!), and a vCard cannot be created from the command line, only updated. This means I cannot trivially set people's real names during an update run.

Obviously, ejabberd has flaws, and I cannot easily modify it since Erlang is fundamentally different from any other language I have worked with.

So the second contender is a server I discovered only tonight. It's Prosody, an extremely lightweight XMPP server written in Lua by a bunch of very friendly folks. I really like the attitude and personal approach the principal author of Prosody has, but that's not all. The server's source code looks extremely well organized, the server is quite featureful, and most importantly, it's written in a language that mere mortals can understand. I'm not a big fan of Lua, but I can read it, and I can modify it, especially when it's as well written as Prosody seems to be.

I'm currently not very familiar with Prosody, but given that I managed to set it up very quickly and that it starts up almost instantly... well, I think that we can hack a ZATEMAS-based external authentication module into it! Also, I think I might be able to add my own debug functions more easily, to see what I did wrong while developing my client.

Both ejabberd and Prosody come with a BOSH connection manager (the thing that translates HTTP requests into a continuous XMPP stream; a continuous XMPP TCP stream is something you cannot achieve from the web). So why another one?
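
To give a feel for what a connection manager receives, here is a sketch of the first request a BOSH client POSTs to it. The attribute names follow my reading of XEP-0124 and XEP-0206; buildSessionRequest and escapeAttr are hypothetical helpers, and example.org and the rid value are placeholders.

```javascript
// Minimal escaping for values placed inside double-quoted XML attributes.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Build the session creation request body.
// to:   the XMPP domain we want to reach
// rid:  initial request ID (should be a large random number in practice)
// wait: longest time in seconds the manager may hold a request open
// hold: how many requests the manager may keep waiting
function buildSessionRequest({ to, rid, wait = 60, hold = 1 }) {
  const attrs = {
    xmlns: "http://jabber.org/protocol/httpbind",
    content: "text/xml; charset=utf-8",
    to,
    rid,
    wait,
    hold,
    ver: "1.6",
    "xmpp:version": "1.0",
    "xmlns:xmpp": "urn:xmpp:xbosh",
  };
  const rendered = Object.entries(attrs)
    .map(([name, value]) => `${name}="${escapeAttr(value)}"`)
    .join(" ");
  return `<body ${rendered}/>`;
}

const request = buildSessionRequest({ to: "example.org", rid: 1573741820 });
console.log(request);
```

The manager answers with a body element carrying a sid, and every later request repeats that sid with rid increased by one. That is how a series of short HTTP requests stands in for one continuous stream.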

Well, perhaps you want to log into Google Talk!

Yep, folks, that's what PunJab allows: have your BOSH-based client log into any XMPP server. I must say I like PunJab; it's written in Python. Despite that, its internals seem a bit less clear than Prosody's, yet still manageable. PunJab does its job and does it extremely well.

So there you have it. Perhaps we'll soon have an opportunity to talk about how to install Z-XMPP instead of just talking about why and how I'm working on an IM service :)

Until next time!

