JSONSelect Grows Up

JSONSelect is a query language for JSON. With JSONSelect you can write small patterns that match against JSON documents. The language is mostly an adaptation of CSS to JSON, motivated by the belief that CSS does a fine job and is widely understood.

I announced JSONSelect about two weeks ago, and there have been some significant additions since. This blog post will provide a quick overview of the main changes to the language with examples.

If you’re interested in the really short version, go to the online demo and check out the last five selectors.

nth and objects

An oversight in the first release of JSONSelect allowed pseudo functions which rely on ordering to apply to objects. Specifically given a JSON object:

{
    "foo": 1,
    "bar": 2
}

You could write a pattern which would attempt to match one of its children based on its position, such as object:first-child. The problem with this (first pointed out by Robert Sayre) is that by definition, a JSON object is unordered. The simplest way to solve this problem is to change the nth family of pseudo-classes so that they never apply to object children. No interesting use cases are broken, and this whole issue is avoided.
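To make the rule concrete, here is a minimal sketch in Python (not the reference implementation) of a first-child style test that only ever looks at array elements. The function name first_children is made up for illustration.

```python
def first_children(node):
    """Yield the values a ':first-child'-style test could match under `node`.

    Only array elements have a position; object members are never
    candidates, because JSON objects are unordered.
    """
    if isinstance(node, list):
        if node:
            yield node[0]
        for item in node:
            yield from first_children(item)
    elif isinstance(node, dict):
        # Object members are traversed, but never treated as "first".
        for value in node.values():
            yield from first_children(value)


doc = {"foo": 1, "bar": 2, "list": [10, 20, 30]}
print(list(first_children(doc)))  # [10]
```

Neither "foo" nor "bar" can be selected by position. Only the array contributes a first child.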

If you’re interested in reading more, you can review the bug report, and check out Dave Herman’s deeper exploration of JSON semantics.

checking values with :expr()

The first release of JSONSelect had no way to conditionally match values based on their contents. It turns out there’s a whole lot you simply cannot do without basic checks on object values. Given that there’s so much less information in the structure of a JSON document vs. an HTML document, I felt the ability to test values was important for JSONSelect.

To figure out how to build in this feature I reviewed CSS3, sizzle, and sly. The only real tool I discovered was :contains(), a sizzle selector which lets you search for substrings inside a node’s contents.

JSONPath is a similar query language for JSON, and supports moderately complex expressions. A tiny expression language ultimately seemed like the most flexible solution without being overly difficult to learn or implement.

The expression language which ultimately made it into JSONSelect is influenced by CSS in that several of the available operators are inspired by those present in CSS for testing attribute values. Another area where CSS influenced JSONSelect’s expression support is in the representation of the current node’s value in an expression, which can be specified with the placeholder x. Let’s see it in action, given the following document:

{ "books": [
  { "price": 16.31,
    "name": "The Unbearable Lightness of Being",
    "author": "Milan Kundera"
  },
  { "price": 23.09,
    "name": "JavaScript Web Applications",
    "author": "Alex MacCaw"
  },
  { "price": 106.00,
    "name": "Compilers: Principles, Techniques, and Tools",
    "author": "Alfred V. Aho"
  }
] }

Here’s how you could use expressions to find all the books you could buy with the forty US dollars in your pocket:

.books .price:expr(x <= 40.00)

Now if you really strongly believed that nothing worth reading could possibly be less than a twenty spot, you could augment the expression to match your arrogance:

.books .price:expr(20 <= x && 40 >= x)

Now what was the real title of the Dragon Book? I can’t remember, but I do know that it started with Compilers:

.books .name:expr(x ^= "Compilers")

The full set of operators supported in JSONSelect expressions is visible in the reference implementation (but you might want to refer to the latest version) and real soon now it will make it into the formal documentation.

Now, :expr() is a bit more complex than other features of JSONSelect, and for this reason two syntactic shortcuts have been added to the language to handle common tasks: :val(V) is a shorthand for :expr(x = V), and :contains(V) is a shorthand for :expr(x *= V). Perhaps a couple more shortcuts will be added as they prove themselves sufficiently common.
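If it helps to see what these expressions mean, here is a rough Python sketch of the same tests run against the book document. It is only an approximation of the semantics, not how JSONSelect is implemented. The helpers values_for_key and select are invented for this example, and each lambda stands in for the expression on x.

```python
BOOKS = {"books": [
    {"price": 16.31, "name": "The Unbearable Lightness of Being",
     "author": "Milan Kundera"},
    {"price": 23.09, "name": "JavaScript Web Applications",
     "author": "Alex MacCaw"},
    {"price": 106.00, "name": "Compilers: Principles, Techniques, and Tools",
     "author": "Alfred V. Aho"},
]}


def values_for_key(node, key):
    """Yield every value stored under `key` anywhere below `node`."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from values_for_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from values_for_key(item, key)


def select(doc, key, test):
    """Rough stand-in for `.books .<key>:expr(...)`."""
    return [x for x in values_for_key(doc["books"], key) if test(x)]


# x <= 40.00
affordable = select(BOOKS, "price", lambda x: x <= 40.00)
# 20 <= x && 40 >= x
mid_range = select(BOOKS, "price", lambda x: 20 <= x and 40 >= x)
# x ^= "Compilers"  (prefix test, applied to the book names)
dragon = select(BOOKS, "name", lambda x: x.startswith("Compilers"))
# :val("Alex MacCaw"), shorthand for x = "Alex MacCaw"
exact = select(BOOKS, "author", lambda x: x == "Alex MacCaw")
# :contains("Web"), shorthand for x *= "Web"
partial = select(BOOKS, "name", lambda x: "Web" in x)

print(affordable)  # [16.31, 23.09]
print(mid_range)   # [23.09]
```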

One thing you might have noticed, however, is that :expr() is rather useless right now. While it allows you to express constraints on values, it would only return the matching values themselves. In the above examples, what we really want are nodes related to the matched value, concretely the records of the books which match our criteria… This leads us to the next language addition:

testing descendants with :has()

Inspired by jQuery, the :has() selector allows you to perform a check against a descendant of a node. :has() in JSONSelect may contain a full selector and gives us a simple way to solve the problems discussed in the previous section. If we wanted to get the full records for all books under $20:

.books object:has(.price:expr(x<20))

This will specifically find an object underneath the .books node that has a price property less than 20 bucks. It’s useful to point out that the object type here is redundant, as objects are the only types of nodes to which :has() can apply, so we can simplify a touch:

.books :has(.price:expr(x<20))

A naive attempt to simplify further might yield:

:has(.price:expr(x<20))

In this case we omit the .books class that rooted the expression. This expression would match both the root document, and the object that we want, because :has() by default holds a selector that matches any descendant. A solution to this issue gives us a way to (re)introduce the :root selector:

:has(:root > .price:expr(x<20))

In case you’re not familiar with :root, it’s a CSS standard pseudo class that always matches the root of the document. With HTML this is rather useless (as resig points out), because the root of the document is more intuitively matched with html. In JSONSelect, :root is a much more important idea given there is no other way to refer to the root document.

In the above example, it’s demonstrated that when selectors inside :has() are matched against a target node, the node itself is the new document root. This feature can be used to restrict the behavior of :has() to match an object’s children (as opposed to any descendant). It may be useful at some point to add a custom has function to do this, but you be the judge, is this clearer?

:has-child(.price:expr(x<20))

:has-child is not part of JSONSelect at the moment, but we might just add it…
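The difference between the descendant form and the child form of :has() can be sketched in Python. This is an illustration of the behavior described above, not the reference implementation. All function names are hypothetical, and following the text only objects are treated as candidates for :has().

```python
DOC = {"books": [
    {"price": 16.31, "name": "The Unbearable Lightness of Being"},
    {"price": 23.09, "name": "JavaScript Web Applications"},
]}


def children(node):
    """Return the direct children of a JSON node."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


def walk(node):
    """Yield `node` and every node below it, depth first."""
    yield node
    for child in children(node):
        yield from walk(child)


def has_cheap_price(obj, direct_only):
    """Does `obj` hold a 'price' below 20, directly or in any descendant?"""
    if direct_only:
        candidates = [obj]
    else:
        candidates = [n for n in walk(obj) if isinstance(n, dict)]
    return any(
        isinstance(n.get("price"), (int, float)) and n["price"] < 20
        for n in candidates
    )


def select(doc, direct_only):
    """Collect every object in `doc` that passes the price test."""
    return [
        n for n in walk(doc)
        if isinstance(n, dict) and has_cheap_price(n, direct_only)
    ]


anywhere = select(DOC, direct_only=False)  # like :has(.price:expr(x<20))
direct = select(DOC, direct_only=True)     # like :has(:root > .price:expr(x<20))

print(len(anywhere))  # 2: the root document and the first book
print(len(direct))    # 1: only the first book
```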

Now, given all the new tools we have, we can easily express more complex ideas, like “I want all of the titles of books that are written by authors whose names are ‘Alex MacCaw’”:

:has(:root > .author:val("Alex MacCaw")) > .name

If you share my sense of aesthetics, this construct is just too complex. And that’s part of the motivation for our final new language operator:

matching siblings with ~

The CSS3 spec defines the sibling combinator. This combinator is almost totally useless in CSS given its directional restriction. That is, A ~ B will only match B if it occurs after A.

So a shorthand for sibling matching is interesting, and ~ is already meant for this job in CSS, but it’s going to take some adaptation, given that ordering restrictions for object values are meaningless with JSON anyway (because objects are unordered) and they also kill the utility of the operator.

So in JSONSelect, the sibling combinator (~) is simpler: it has no ordering restrictions and behaves as you might expect:

.author:val("Alex MacCaw") ~ .name
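In plain code, the unordered sibling match amounts to "find an object where one member passes the test, then hand back another member of the same object". Here is a small Python sketch of that idea. The function siblings and its parameters are invented for illustration.

```python
def siblings(doc, match_key, match_value, want_key):
    """Unordered sibling match, like `.match_key:val(match_value) ~ .want_key`."""
    found = []
    stack = [doc]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            # Member order inside the object plays no role here.
            if node.get(match_key) == match_value and want_key in node:
                found.append(node[want_key])
            stack.extend(node.values())
        elif isinstance(node, list):
            stack.extend(node)
    return found


books = {"books": [
    {"name": "The Unbearable Lightness of Being", "author": "Milan Kundera"},
    {"author": "Alex MacCaw", "name": "JavaScript Web Applications"},
]}

print(siblings(books, "author", "Alex MacCaw", "name"))
# ['JavaScript Web Applications']
```

Note that "name" comes before "author" in one record and after it in the other, and the result does not care.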

Now JSONSelect seems so complicated!

Not really. Similar to CSS, there’s some complexity and power lurking here, but you only rarely need to engage it. Ultimately to get the same level of power that other selector languages offer we’ve only had to add two pseudo functions and a new combinator (to use CSS speak). It’s worth noting that aside from the :nth-XXX family of functions there were no other changes made to the core JSONSelect language.

next up…

I’m curious to get feedback on these additions to JSONSelect, and you can kick the tires using the online demo and the reference implementation. Depending on your response, I’ll probably be slowing way down with language additions, though there may be several convenience pseudo functions added that I alluded to above.

Provided folks are still as passionately supportive of JSONSelect, I hope to continue to incrementally refine the language while exploring some of its more interesting applications (simplifying APIs, stream filtering, and document databases).

I look forward to hearing your thoughts here, on github, and on the mailing list.

What's New In YAJL Two?

YAJL is a little SAX-style JSON parser written in C (conforming to C99). The first iteration was put together in a couple of evening/weekend hacking sessions, and YAJL sat in version zero for about two years (2007-2009), quietly delighting a small number of folks with extreme JSON parsing needs. On April 1st 2009 YAJL was tagged 1.0.0 (apparently that was a joke, because the same day it hit version 1.0.2).

Given that 2 years seems to be YAJL’s natural period for major version bumps, I’m happy to announce YAJL 2, which is available now. This post will cover the changes and features in the new version.

First, Thanks!

Over the course of the last two years YAJL has had contributions from at least 24 different individuals and provided the heavy lifting for at least 8 different high level language bindings. I’ve also received questions and comments from all over the planet from folks working in companies with market caps in the hundreds of billions, to university students in places like Cairo, Bangkok, and Bangladesh. YAJL has been particularly useful to the rails and iPhone communities, given its performance and low level streaming api. And anyone remember the twitpocalypse and YAJL Error 3? Dude, that’s not my bag!

At the risk of abusing my time on stage, I just wanted to say that there’s something very nice about a world where one can go and build something as modest as YAJL and have it touch billions of people. So again, I’m pleased YAJL pleases you.

What’s new?

This post is intended both to tour new features and to serve as a quick and dirty porter’s guide for users of YAJL 1.x. If you want to skip all the language and examples, you’re more than welcome to head over to the ChangeLog and go from there.

License Changes

YAJL was three clause BSD, now it’s ISC. The functional difference between these two is that YAJL no longer includes a ‘non-endorsement’ clause, and really, I don’t care so much if you use YAJL in your product and decide to apply a “Lloyd Hilaiel inside” logo. But, YMMV with that.

So the new license is a bit more permissive but preserves everything I care about (you can’t say you wrote it), and is quite a bit smaller. That’s all.

Faster

YAJL 2 is somewhere between 20%-35% faster in its raw parsing performance. This is specifically due to lexer optimizations in string scanning. My above-average colleague Michael Hanson suggested these changes, and you’re encouraged to review the commit where the change was landed.

New “Tree” API

Given how prevalent YAJL is, it’s quite natural that lots of people get pointed to it when they ask about a good JSON parser for the C language. Unfortunately, many folks with simple parsing needs (they don’t need streaming, they have no data representation of their own) who just want to parse a little JSON file and extract some stuff have been repeatedly disappointed with YAJL. These folks want to pass in a buffer, get back a memory representation, plunk out a value or two, and go on their way.

Florian Forster decided to fill this need by writing a tiny little high level tree parser implementation on top of YAJL. The implementation is less than 10k of compiled object code, and if you statically link it’ll be stripped right out when not used. At that small cost, why not include an alternate high level API if so many people seem to want it?

To get an idea of how it works see the configuration parser example, and you can give the header file a read for more details.

Final note, this code is young, so expect several fixes and improvements in the coming months as people start sending in patches and bugs.

Changed YAJL Configuration

The 1.x YAJL implementation used C structures for configuration. So client code would allocate, and populate a structure, then pass it into the yajl_alloc() routine to change the behavior of the parser.

This design sucked for several reasons:

  1. Addition of new options couldn’t be done in a binary compatible way
  2. Client code always had to think about options, even if it just wanted default behaviors
  3. it wasn’t clear from reading client code what was going on

Brian Lopez suggested an alternative API that’s implemented in YAJL. Now parser setup is, in my opinion, a lot clearer:

hand = yajl_alloc(&callbacks, NULL, (void *) g);
yajl_config(hand, yajl_allow_comments, 1);
yajl_config(hand, yajl_dont_validate_strings, 1);

The generator works similarly and now has a yajl_gen_config() function.

Little API Changes

There were also several smaller API breaking changes, which will be especially interesting as you port:

  1. YAJL no longer will build with a compiler that doesn’t support C99.
  2. size_t is now used instead of unsigned int wherever buffer lengths or offsets are represented.
  3. integers are now always represented with long long, which is at least 64 bits on all modern compilers.
  4. yajl_parse_complete() is now yajl_complete_parse(), see the commit to understand why.

Big API Changes

One of the longest standing issues in YAJL was its default tendency to consume as little of input buffers as required to complete a parse. That is, if you did this:

const char * buf = "2009-10-20@20:38:21.539575";
yajl_status s = yajl_parse(h, (const unsigned char *) buf, strlen(buf));

YAJL would parse out that 2009 as an integer, and consider the parse complete! He got his value, the rest of your buffer is your business. Given that YAJL is a stream parser, the design goal was specifically that yajl would process as few bytes as possible. If you wanted to ensure that an entire buffer consisted of a single valid JSON value, you’d need to use yajl_get_bytes_consumed() after the completion of the parse to ensure that your entire input buffer was consumed.

This turned out to be a bad decision. Greg Olszewski fixed this with a patch that provided additional configuration flags to help people change the behavior of yajl into what they expected.

Beyond this patch, I’ve folded the configuration into the new API mentioned above, and changed the default configuration to be what I expect you expect: specifically, an entire buffer will be consumed (so trailing junk will be considered a parse error), and if you call yajl_complete_parse() and a complete top level value hasn’t been parsed, it’s a hard error.
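The two behaviors are easy to see side by side outside of YAJL. As an analogy only (this is Python's standard json module, not YAJL), raw_decode() behaves like the old "stop at the first value" default, while json.loads() behaves like the new "whole buffer or error" default:

```python
import json

buf = "2009-10-20@20:38:21.539575"

# Lenient, "stop at the first value" behaviour: raw_decode returns the value
# plus the offset where parsing stopped, and the rest is the caller's problem.
value, consumed = json.JSONDecoder().raw_decode(buf)
print(value, consumed)  # 2009 4

# Strict, "whole buffer" behaviour: trailing junk is a parse error.
try:
    json.loads(buf)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False
print(strict_ok)  # False
```

The consumed offset plays the same role here that yajl_get_bytes_consumed() played for 1.x users who wanted to detect trailing junk themselves.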

For more information on the changed semantics, refer to the yajl_option documentation and take a look at the updated JSON reformatter example which demonstrates how error handling should fit into your parsing flow.

A Built-in Benchmark

Lots of people pick YAJL because it’s fast. But one thing that we’ve lacked in the past is a stable way to assess just how fast, and more specifically to gauge the performance impact of code changes.

To solve these problems I wrote a small in-tree performance test that spends some number of seconds parsing through three sample JSON documents (from popular APIs around the net), and represents how fast it can parse as a throughput.

If you want to give the tool a whirl, it’s simple:

$ make
...
$ build/perf/perftest
-- speed tests determine parsing throughput given 3 different sample documents --
With UTF8 validation:
Parsing speed: 267.096 MB/s
Without UTF8 validation:
Parsing speed: 267.443 MB/s

Now it’s not a particularly sophisticated benchmark, but is a good starting point for giving a quick high level view of the performance implications of changes.
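The measurement idea itself is tiny: parse the same documents in a loop for a fixed amount of time, count the bytes, and divide. Here is a sketch of that idea in Python using the standard json module. It is not the YAJL perftest source, and the function name and sample documents are placeholders.

```python
import json
import time


def parse_throughput(documents, seconds=0.2):
    """Parse `documents` (JSON strings) for roughly `seconds`; return MB/s."""
    sizes = [len(text.encode("utf-8")) for text in documents]
    total_bytes = 0
    start = time.perf_counter()
    deadline = start + seconds
    while time.perf_counter() < deadline:
        for text, size in zip(documents, sizes):
            json.loads(text)
            total_bytes += size
    elapsed = time.perf_counter() - start
    return total_bytes / elapsed / (1024 * 1024)


samples = ['{"id": 1, "tags": ["a", "b"]}', '[1, 2, 3, {"nested": true}]']
print("Parsing speed: %.3f MB/s" % parse_throughput(samples))
```

Running the same loop before and after a code change gives a quick, if crude, read on whether the change helped or hurt.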

More Details

A couple less interesting changes are in there too, so feel free to peruse the ChangeLog and commit history, if you need more!

Happy YAJLing, lloyd

A Chromeless Snapshot

There has been a ton of development in the Mozilla Labs Chromeless project since the 0.1 release, and I wanted to take a moment to give a snapshot of our progress.

Application Generation

Chromeless lets you build desktop apps with web technologies, but how do you package those up for distribution? Back in January we implemented the ability for chromeless to embed your application code into a standalone application. On OSX the output is an application bundle, that is, a special type of folder with a .app extension. On Linux and Windows the output is a directory with a single binary that is named the same as your application.

Usage is simple, just pass ‘appify’ to chromeless on the command line:

[lth@clover chromeless]$ ./chromeless appify examples/thumbnails
Generating a standalone, redistributable, application
Using Browser HTML at '/home/lth/dev/chromeless/examples/thumbnails/index.html'
Building application in >/home/lth/dev/chromeless/build/My Chromeless App< ...
  ... copying in xulrunner binaries
  ... placing xulrunner binary
Building xulrunner app in >/home/lth/dev/chromeless/build/My Chromeless App< ...
  ... copying application template
  ... creating application.ini
  ... copying in CommonJS packages
  ... copying in application code (/home/lth/dev/chromeless/examples/thumbnails)
  ... writing harness options
xul app generated in build/My Chromeless App

Once this is complete you have a standalone app that you can combine with your favorite installer technology to build a distributable application.

Native Menus, and Keybindings!!

Mike De Boer has contributed modules which expose access to native menus and allow you to control the display and behavior of your application menus. Full documentation isn’t yet available, but there’s some example code available that demonstrates how you can build up menus with icon and keybinding support:

Pretty Menus!

In addition to native menus, Mike has also put together a library to allow one to programmatically configure shortcut key combinations, or hotkeys.

A Documentation System

As of 0.1, chromeless had no documentation. We’ve taken the system from jetpack and reworked it a bit to fit Chromeless. To generate docs from Chromeless, pass docs on the command line:

[lth@clover chromeless]$ python2 ./chromeless docs
Generating documentation
Created docs in /home/lth/dev/chromeless/build/docs.

While the content isn’t very complete at this point, we now have a usable system that we can update as we go. You can check out a snapshot of the chromeless docs on our github pages.

Startup Parameters

It’s important to be able to control basic application parameters in a simple way: to specify things like the initial height and width of your application window, whether it’s resizable, the name of the application, and the icon that should be used to represent it. To address this problem we now support a configuration file that may be placed in the same directory as your application code: appinfo.json. This configuration file looks something like:

{
  "name": "My First Browser",
  "version": "0.1",
  "icon": "appicon.png",
  "resizable": true,
  "initial_dimensions": {
    "width": 500,
    "height": 300
  }
}

Now while the only keys which are actually implemented right now are name and version, this is a place where any configuration which changes the initial behavior of your application will reside.
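For a sense of how a launcher might consume such a file, here is a small Python sketch that reads the JSON and fills in anything missing. This is not how Chromeless does it. The default values and the function name load_appinfo are assumptions made up for the example.

```python
import json

# Hypothetical defaults; the post does not say what Chromeless falls back to.
DEFAULTS = {
    "name": "My Chromeless App",
    "version": "0.1",
    "resizable": True,
    "initial_dimensions": {"width": 800, "height": 600},
}


def load_appinfo(text):
    """Parse appinfo.json content and fill in missing keys from DEFAULTS."""
    info = json.loads(text)
    merged = dict(DEFAULTS, **info)
    # Merge the nested object separately so a partial override keeps the rest.
    merged["initial_dimensions"] = dict(
        DEFAULTS["initial_dimensions"], **info.get("initial_dimensions", {})
    )
    return merged


config = load_appinfo(
    '{"name": "My First Browser", "initial_dimensions": {"width": 500}}'
)
print(config["name"])                # My First Browser
print(config["initial_dimensions"])  # {'width': 500, 'height': 600}
```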

File System Interaction

Contributions from Mike De Boer and Alexandre Poirot have brought good api breadth around filesystem interactions. The basic design of the file system APIs includes three different modules:

  • file – Reading and writing of individual files
  • path – Provides abstractions for the manipulation of file paths, but will never touch the file system.
  • fs – Includes functions which can query and manipulate the file system, and don’t fit into the two categories above (for example, directory manipulation and file copy live here).

In addition to these low level utilities, the app-paths module offers an abstract way of getting at various standard filesystem paths.

OS Drag and Drop

Marcio Galli has put together example code for how one might go about supporting drag-in and drag-out in their chromeless apps. It mostly leverages HTML5 support with a teeny tiny library to support native feeling drag-out.

Favicons and Mime-Type guessing

The mozilla platform (obviously) includes great tools for doing webby things! One example is displaying favicons, and that’s now wrapped up in a simple to use library. In addition there’s a new mime guessing library that lets you get a probable mime type given a file path.

Improved Embedding of Web Content

One of the primary goals of chromeless is to make it possible to build web browsers, and this means safely embedding web content is an important feature. The approach we take in chromeless is to transparently upgrade iframes that are direct children of your application code, to give you a deepened view as to what’s going on inside of them.

This upgrading concretely means two things:

  • iframes in chromeless emit several custom events
  • application code has increased privileges to manipulate and inspect web content inside iframes, via the ‘iframe-controls’ library.

Better Console Output

Mike De Boer has contributed tons of polish to the good ol' console.log() function, including pretty object introspection:

Introspection!

Simplified Web Requests

The addon-sdk community has written a nice little request library which simplifies issuing HTTP requests, and we’ve uplifted that into chromeless. Additionally, an API compatible version of XMLHttpRequest without the cross domain restriction is also available.

BIAB, and TTYL

Hopefully this whirlwind tour of new features in trunk gives you a good grasp of the project’s current state. The speed at which things are shaping up I feel is a testament to the quality of the platforms upon which chromeless is built, and the sheer awesome of the contributors.

Now, back to merging…

Open Web Apps: Refining The Manifest

In the month since we announced “Open Web Apps”, there’s been a lot of discussion around the particulars of the Mozilla proposal.

I specifically wanted to take a minute to jot down some of the proposed changes to the application manifest format from our initial design. The changes detailed here range from the drastic to the mundane, and have been contributed by my co-workers at mozilla and several community members.

NOTE: ALL of these changes are currently just ideas in want of review, and my listing them here in no way implies that they’ll be adopted.

#1 defaultLocales to default_locales

Underbars are used more predominantly in the manifest to separate multiple word property names, so we should make that convention consistent.

#2 Removal of app_urls

The motivation of “application urls” was to define the scope of an application for two different purposes:

  1. navigation support: When clicking around on links inside an application, it is useful if the user agent can know whether a url is within an application. This allows the UA to implement some potentially useful behaviors that wouldn’t otherwise be possible (like keeping installed apps inside firefox ‘app tabs’ or google chrome’s ‘pinned tabs’).
  2. capability scoping: For applications which request different capabilities (or “permissions”), the app_urls array defined the scope of those capabilities. That is, the act of installing an application bestows those permissions on pages residing at URLs which match members of app_urls.

While the inclusion of app_urls may be useful for making navigation support possible for existing web applications, the risks it involves are too great. Specifically, if an application is able to request that capabilities are bestowed on arbitrary domains, several new types of attacks are possible.

One simple and concrete example of an attack could involve a hypothetical ‘navigation_consolidation’ capability. Imagine that an application existed and was designed to be a “singleton”: That application could request that if the user navigates to urls inside the app, rather than opening a new instance of the application, the existing instance of the app should be notified of the user’s navigation and the tab or window containing it focused.

This navigation_consolidation feature when combined with a broad specification of app_urls could offer a malicious application the ability to intercept a user’s navigation to sites with whom they share personal or financial information (opening the door to a new and dastardly type of phishing attack).

This is one specific example of a class of problems that arises when an application author is able to change the way the browser behaves when rendering content from a site that that author does not control.

Part of a solution to this class of attacks is to simply remove the app_urls property from a manifest, in effect making it impossible for an application to span multiple domains. Once app_urls is removed, however, how is application scope determined? The next proposal attempts to answer that question.

#3 base_url defines “application scope”

Having removed app_urls in #2 it becomes unclear how, given a url, one determines whether that url belongs to an application. A simple solution is to say that any resource which has the base_url parameter of an application as a prefix is considered within the scope of that application.

A problem of overlapping application scope still exists: specifically if one app contains a base_url which actually contains the base_url of other applications, there is scope ambiguity. This problem is mitigated by the removal of app_urls (which could make scope ambiguity more prevalent and harder to reasonably address), and a reasonable solution may be for the UA to simply refuse to install applications which would introduce ambiguity. This solution means that it’s possible for a rogue application to block the installation of others, but because the conflicting applications must always be from the same principal, it should be simple to resolve any such issues.
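Both the prefix rule and the ambiguity check fit in a few lines. Here is a rough Python sketch of the idea as proposed here, not of any shipping implementation. The function names are invented, and it assumes every base_url ends with a slash so that a prefix match cannot stop in the middle of a path segment.

```python
def in_scope(url, base_url):
    """A URL belongs to an app when the app's base_url is a prefix of it."""
    return url.startswith(base_url)


def scope_conflicts(new_base_url, installed_base_urls):
    """Return the installed base_urls whose scope overlaps the new app's."""
    return [
        existing for existing in installed_base_urls
        if existing.startswith(new_base_url)
        or new_base_url.startswith(existing)
    ]


installed = ["http://example.com/mail/", "http://other.example/"]

print(in_scope("http://example.com/mail/inbox", "http://example.com/mail/"))
# True
print(scope_conflicts("http://example.com/", installed))
# ['http://example.com/mail/']
```

A UA following the proposal would refuse the install whenever scope_conflicts() returns a non-empty list.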

#4 Canonical Manifest URLs

Aaron Boodman of google suggested in an email that manifests be “served from a live URL, rather than embedded in … page(s)”. There are many reasons why this suggestion is interesting:

  1. (via Boodman) Every application has a “clear and unique ID”
  2. (via Boodman) Convenience: a developer may simply post their manifest URL to stores.
  3. Ownership: When combined with the requirement that urls inside a manifest must match the url from which the manifest is served, it builds a simple mechanism of proof of ownership into the system.

#5 Addition of a manifest_name property

Given the presence of base_url if a manifest_name were added to the manifest, it would mean that each manifest suddenly has a canonical url, which is manifest.base_url + manifest.manifest_name. This canonical url can be used to verify an application (see #6) and can be used as a unique identifier of the application.

manifest_name might be optional, with manifest.json being its default value.

One security feature of manifest_name could be that the name must be a top level entity within the path described by base_url. This feature could address a class of attacks which arise when sites host user generated content. While it may be possible to publish a manifest on an unaware vulnerable site, it would not be possible to specify a base url within that injected manifest that included more than the directory from which the UGC was served (yeah, this only mitigates this type of attack, and we still may at some point consider stronger protections, such as new HTTP headers).
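Put together, the canonical URL rule and the "top level entity" restriction might look like the following Python sketch. It is only an illustration of the proposal. The function name is made up, and treating "top level" as "contains no slash" is my assumption.

```python
DEFAULT_MANIFEST_NAME = "manifest.json"


def canonical_manifest_url(manifest):
    """Return base_url + manifest_name, insisting the name is top level."""
    name = manifest.get("manifest_name", DEFAULT_MANIFEST_NAME)
    # Assumption: "top level entity" means a plain file name with no path.
    if "/" in name or name in ("", ".", ".."):
        raise ValueError("manifest_name must be a top level entity under base_url")
    return manifest["base_url"] + name


print(canonical_manifest_url({"base_url": "http://example.com/app/"}))
# http://example.com/app/manifest.json
```

A manifest served from a deeper user content directory cannot claim the whole site, because its name would need a slash to reach back up to base_url.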

#6 Application verification at installation time

Combining #2 and #3 above, it becomes possible to define a series of steps that the application repository can perform to “verify an application”. If an application is verified, that means that the manifest resides on the site that hosts the application. This property is a speed-bump to many attacks, in that it’s no longer possible to casually grant permissions to a site you do not control.

The process of verification is simple:

  1. if installation was triggered with a manifest, extract the full canonical path to the manifest: base_url + manifest_name
  2. using the url passed to install() or the result of step #1 fetch the contents at that location
  3. if what’s fetched is not a valid manifest then the application cannot be verified
  4. extract base_url and manifest_name from the manifest fetched, if that matches the url from which the manifest was fetched, then the app is considered valid
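The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a real repository implementation. verify_app and canonical_url are invented names, and fetch is a caller-supplied function (URL in, response text out) standing in for real HTTP code.

```python
import json


def canonical_url(manifest):
    """base_url + manifest_name, with manifest.json as the default name."""
    return manifest["base_url"] + manifest.get("manifest_name", "manifest.json")


def verify_app(fetch, manifest=None, url=None):
    """Return True when the manifest really lives at its own canonical URL."""
    try:
        if manifest is not None:
            url = canonical_url(manifest)      # step 1
        fetched = json.loads(fetch(url))       # step 2
        return canonical_url(fetched) == url   # step 4
    except (KeyError, TypeError, ValueError, OSError):
        return False                           # step 3: nothing usable there


# A fake "web" for demonstration: URL -> response body.
pages = {
    "http://example.com/app/manifest.json":
        '{"base_url": "http://example.com/app/"}',
    "http://evil.example/m.json":
        '{"base_url": "http://bank.example/"}',
}

print(verify_app(pages.__getitem__, url="http://example.com/app/manifest.json"))
# True
print(verify_app(pages.__getitem__, url="http://evil.example/m.json"))
# False
```

The second manifest fails because it claims a base_url on a site other than the one it was served from.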

Given this process to check that a manifest author has control over the domain in which her web application resides, we can choose to either prevent the installation of unverified applications, or at the very least prevent them from requesting any capabilities outside of what’s available to a standard web application.

#7 installed_by

In the current proposal there is no way for a host of an application to explicitly delegate authority to a store to provision that application. In other words, if I wrote “Whap The Monkey” and want to allow “http://redonkulousgames.com” to legitimately sell my game, there’s no way that I can inform your UA that you are allowed to install WTM from my site and the redonkulousgames.com site, but from no one else.

Given the suggestions in this post, it could now be fairly trivial to specify in the manifest who may distribute an application. installed_by could simply be an array of url fragments, with the default being “disallow syndication”:

"installed_by": [ ]

The developer may explicitly allow promiscuous syndication of their app:

"installed_by": [ "*" ]

And the Whap The Monkey case above could look something like:

"installed_by": [ 
  "http://whapthatmonkey.com",
  "https://redonkulousgames.com"
]

This simple addition further empowers application developers (or app publishing platforms) with more control to protect apps against casual (even accidental) plagiarism, and unwanted syndication. Further, the UA can enforce this mechanism to eliminate several new flavors of phishing.
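A UA-side check for installed_by could be as small as the Python sketch below. The exact matching rule for “url fragments” is left open in this proposal, so the sketch assumes an entry is either "*" or an origin that must match exactly. Whether the app’s own site is implicitly allowed is not modelled, and may_install is a made-up name.

```python
def may_install(installed_by, installer_origin):
    """Decide whether `installer_origin` may install the app.

    Assumes each entry is "*" or an origin compared by exact match.
    """
    if "*" in installed_by:
        return True
    return installer_origin in installed_by


wtm = ["http://whapthatmonkey.com", "https://redonkulousgames.com"]

print(may_install(wtm, "https://redonkulousgames.com"))  # True
print(may_install(wtm, "https://shadyappstore.example"))  # False
print(may_install([], "https://redonkulousgames.com"))    # False
print(may_install(["*"], "https://anyone.example"))       # True
```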

#8 anonymous apps

While the above proposals tighten up the definition of an application and attempt to eliminate several areas of ambiguity, they do so at the sacrifice of an interesting feature: How might I publish a simple manifest that describes an existing website, so that I may launch that site from within my dashboard?

One solution is to create explicit support for “anonymous apps” (which are simply applications that the UA cannot verify with the steps in #6), these are applications which may not request “capabilities”, and in turn need not be hosted anywhere (all icons can be embedded in the manifest itself in the form of data urls, and hence the base_url property becomes unnecessary).

This approach aims to allow for the distributed development of “bookmark applications” in a safe way (allowing end users to have a convenient way to launch the applications that are important to them even if the site authors haven’t yet built in any explicit support for open web apps).

feedback

The focus of most of these features is to tighten up web apps and to reduce ambiguity and attack surface in preparation for more interesting and powerful capabilities. So while it may seem that some of these restrictions limit the power of web applications, this author believes that a reduced application scope will make it easier and safer once we start adding application capabilities.

What do you think?

Challenges in Repurposing the Addon SDK (aka, jetpack)

Lately I’ve been collaborating with Marcio Galli on the chromeless project in Mozilla Labs, and one thing I like about the approach is that it leverages huge swaths of the jetpack platform.

What’s Jetpack Got To Do With It?

At first glance you may wonder what jetpack (making it possible to write extensions in web technologies) could have to do with the very different problem of making it possible to create a browser in web technologies. The answer is that the folks working on jetpack had to solve a couple of interesting problems that are very relevant:

A Module System For JavaScript

Because JetPack represents a shift to writing potentially large and complex extensions in javascript, the first thing that comes up is the question of code organization. How do we break the implementation into several different files to support our abstractions? How do we share libraries between developers?

The solution in JetPack is something that looks like the CommonJS module specification, and one of the things that jetpack provides is a mechanism for building and consuming libraries of code in javascript.

Module Documentation

If you’re familiar at all with jetpack, you might have seen their built-in documentation mechanism. The way it works is that the python scripts that compose the interface to the platform spawn a local web server which can display a combination of dynamically generated and static documentation. Along the way, JetPack has made some technology decisions and introduced some conventions around documenting code that are not specific to generating extensions.

Mozilla Platform Abstraction

In addition to the concrete features mentioned above, the jetpack platform provides the ability to generate idiomatic JavaScript APIs which can then be implemented leveraging existing platform features, implemented themselves in XPCOM, JavaScript, or even ctypes.

How Do We Share the Jetpack Core?

While the above was enough reason for Marcio and me to agree that leveraging work in jetpack for chromeless was a Good Thing, this left the question of how: how can we have two distinct projects that efficiently share large swaths of code without adding undue complications or limiting either’s ability to move rapidly? There seem to be at least four arrangements worth mentioning:

“One is the loneliest number” — jetpack-sdk could be the canonical source for building browsers, building addons, and whatever else we dream up in the future.

“The messy breakup” — fork, diverge, and cry.

“The great cuddlefish escape” — slice the common bit out of the jetpack-sdk and figure out how other repositories can consume that thing conveniently and then layer their own modules, documentation, and additional cfx features on top of it (e.g. addons want to export to .xpi, skinned browsers want to export to .dmg/.exe).

“Port Style” — the new guy, “chromeless-skinner-thingy” can consume jetpack-sdk from github, and layer a bit of love on top of it. jetpack-sdk code wouldn’t be duplicated, but rather be pulled as a pre-build step.

Looking at these options, my thinking goes like this: One is the loneliest number puts too much in a single repository and creates the potential for far too much friction between the two different projects. The messy breakup (where we are today) is unacceptable because we have to find and fix bugs twice, and there’s no good way to share improvements and new features added to common code.

This leaves two interesting solutions, Port Style and The Great Cuddlefish Escape: Both would address some of the key issues, but the great escape would arguably add risk prematurely to a project with great momentum, jetpack.

For this reason the approach that I intend to pursue is Port Style in the short term, and to work with the jetpack team to consider a Great Escape in the future.

The Plumbing

Having chosen an initial approach, the final question is how do we actually pull and patch jetpack sources? I’d offer the following proposed requirements:

pre-build step – required dependencies are fetched as a (discoverable) pre-build step.

solves xulrunner acquisition – the largest source of issues thus far has been people trying to get the correct version of xulrunner on their platform so they can try chromeless. There’s an issue open to address this, and because acquiring xulrunner is a similar problem to acquiring the jetpack SDK, we should at least consider a combined solution.

explicit version tracking – the version of jetpack-sdk being tracked should be captured in the chromeless repository. That is, it should be required that a developer confirm chromeless functions against a newer version before the version we’re targeting is updated.

automatic update – If Bob has checked out chromeless, and Jane updates the version of the jetpack SDK that should be tracked (via sha1?) and then Bob pulls the latest changes into his checked out chromeless, then his jetpack-sdk should be automatically updated.
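The last two requirements boil down to comparing what the repository says should be tracked against what is actually on disk. A minimal sketch of that comparison follows; the function name, the shape of the pin data and the revision labels are all placeholders invented for illustration.

```javascript
// Hypothetical: `pinned` is the version data committed to the chromeless
// repository, `installed` is what a developer currently has fetched.
// Returns the names of dependencies that need to be (re)fetched.
function staleDependencies(pinned, installed) {
  return Object.keys(pinned).filter(function (name) {
    return installed[name] !== pinned[name];
  });
}

// Placeholder revision labels, not real versions.
const pinned = { "jetpack-sdk": "rev-b", "xulrunner": "rev-x" };
const installed = { "jetpack-sdk": "rev-a", "xulrunner": "rev-x" };

console.log(staleDependencies(pinned, installed).join(", ")); // jetpack-sdk
```

Run as part of the pre-build step, a check like this would give Bob the automatic update as soon as he pulls Jane’s change to the pinned revision.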

In considering these requirements, what seems to fit best is the application of a little tool called the bakery, which was designed to solve these very problems. The benefit of using the bakery to fetch both xulrunner and the jetpack SDK is that all of the above requirements would be satisfied, and it would be a very quick integration. The downside of using the bakery is that it’s written in ruby, so people wanting to work on chromeless would have to have ruby installed, which is another build dependency and adds an unfortunate barrier to entry…

Despite this, it seems like using the bakery on an experimental branch may be a good place to start. Once SDKs are generated, the ruby requirement would not apply to folks wanting to use chromeless, only those developing on it.

JSChannel: Taming postMessage()

This post presents JSChannel, a little open source JavaScript library that sits atop HTML5’s cross-document messaging and provides rich messaging semantics and an ergonomic API.

Usage Overview

Let’s start with a quick demonstration of usage. Here’s a sample containing HTML page:

<html>
<head><script src="jschannel.js"></script></head>
<body>
<iframe id="childId" src="child.html"></iframe>
</body>
<script>

var chan = Channel.build(document.getElementById("childId").contentWindow, "*", "testScope");
chan.query({
    method: "reverse",
    params: "hello world!",
    success: function(v) {
        console.log(v);
    }
});

</script>
</html>

And here’s the child page that’s referenced by the former:

<html><head>
<script src="jschannel.js"></script>
<script>

var chan = Channel.build(window.parent, "*", "testScope");
chan.bind("reverse", function(trans, s) {
    return s.split("").reverse().join("");
});

</script>
</head>
</html>

The parent page builds a Channel abstraction around the embedded iframe’s content window. This abstraction manages several setup requirements such as adding a message handler for the message event to the window object, etc. Having built a channel instance, the parent page sends a query, invoking the reverse method, sending in a single string argument hello world!, and specifies a function to be invoked upon success (which occurs when a response is received).

Next, in the child we see a similar bit of code to create a channel. Subsequently the child calls the bind method of the channel object to associate a function with the reverse method.

If you were to run this code, you would see !dlrow olleh output in your console log. There are several features baked into JSChannel which keep this simple function invocation simple. We’ll explore those features in the following section.

postMessage, the Missing Parts

The little method behind HTML5’s cross document messaging, postMessage, is quite spartan. It provides a way to efficiently move a string between frames, even when those frames are not from the same origin (scheme + host + port). It also gives the recipient of a message a reliable way to know its real origin. And that’s all HTML5 gives you! Let’s briefly run through all the things that one would probably want to add if they wished to build something non-trivial on top of postMessage…

Origin Checking

The authors of the spec were careful to call out in bright red letters the importance of verifying the origin of messages:

Use of this API requires extra care to protect users from hostile entities abusing a site for their own purposes.

Authors should check the origin attribute to ensure that messages are only accepted from domains that they expect to receive messages from. Otherwise, bugs in the author’s message handling code could be exploited by hostile sites.

While code to implement the origin check certainly isn’t difficult, the cost of messing up that check is quite high as compared to the ease with which it could be left off:

window.addEventListener('message', receiver, false);
function receiver(e) {
  // behind the origin property of the event lies the true sender
  // of this message
  if (e.origin == 'http://example.com') {
    // here lies the guts of message handling, where you may possibly
    // share sensitive information with 'example.com'
  }
}

JSChannel makes origin checking a bit more prominent: the second argument to Channel.build() is where you can specify the expected origin of the remote side. The wild-card '*' may be specified, but the above warnings should be heeded.

(note: the origin argument should probably become more fancy and support some sort of safe globbing as well as arrays of host globs).
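To make the idea of a “fancier” origin argument concrete, here is a small sketch that accepts the wild-card, a single origin, or an array of origins. The helpers `originAllowed` and `makeReceiver` are hypothetical names for this post and are not part of JSChannel’s API; only exact matches are supported, with no globbing.

```javascript
// Illustrative helper, not part of JSChannel: decide whether the origin of an
// incoming message is acceptable. `expected` may be "*", a single origin
// string, or an array of origin strings.
function originAllowed(origin, expected) {
  if (expected === "*") return true; // wild-card: heed the warnings above
  const list = Array.isArray(expected) ? expected : [expected];
  return list.indexOf(origin) !== -1; // exact match only, no globbing
}

// Wrap a handler so that it only ever sees data from acceptable origins.
// Returns true when the message was handled, false when it was dropped.
function makeReceiver(expected, handler) {
  return function (e) {
    if (!originAllowed(e.origin, expected)) return false;
    handler(e.data);
    return true;
  };
}
```

In a page this would be wired up with something like `window.addEventListener('message', makeReceiver(['http://example.com'], handle), false)`, where `handle` is your own function. The benefit is that the check lives in one place and cannot be forgotten in an individual handler.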

Message Structure

The HTML5 specification and early implementations leave the message content (event.data) as a string. Later implementations (chrome 6 at least), allow the payload to be a JavaScript object (without functions or complex objects). In either case, there’s no message structure suggested by or built into the specification. What this means is that there isn’t a standard way to indicate in a message what ‘function’ is to be invoked on the remote document, nor is it possible to specify in a response message which query it’s a response to.

JSChannel uses the JSON-RPC specification with several modifications to support some of its more sophisticated API semantics.

Here’s a flavor of messages ‘on the wire’ during the execution of the example at the start of this blog post:

[gHvq1-R] post   message: {"id":81351,"method":"testScope::reverse","params":"hello world!"}
[SeYpI-L] got    message: {"id":81351,"method":"testScope::reverse","params":"hello world!"}
[SeYpI-L] post   message: {"id":81351,"result":"!dlrow olleh"}
[gHvq1-R] got    message: {"id":81351,"result":"!dlrow olleh"}

Query/Response Semantics

An integral requirement in the definition of message structure was the ability to implement query/response semantics atop it. JSChannel includes a unique identifier in the message structure, and includes a mechanism for determining the starting id. These two features make query/response semantics possible while also making a message dump more scrutable.
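The bookkeeping behind this is small enough to sketch. The following is an illustration of the idea rather than JSChannel’s implementation: `makeQueryTracker` is a made-up name, and `send` stands in for whatever actually posts the message.

```javascript
// Sketch of query/response bookkeeping: every outgoing query gets an id, and
// the id on a response selects the callback to run.
function makeQueryTracker(send, startId) {
  let nextId = startId;
  const pending = {};

  return {
    query: function (method, params, success) {
      const id = nextId++;
      pending[id] = success;
      send({ id: id, method: method, params: params });
      return id;
    },
    // Feed every incoming response here; returns false for ids that are
    // unknown or have already been answered.
    onMessage: function (msg) {
      const cb = pending[msg.id];
      if (!cb) return false;
      delete pending[msg.id];
      cb(msg.result);
      return true;
    }
  };
}
```

Because each callback is removed once its response arrives, a duplicated or stray response is simply ignored instead of firing a handler twice.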

Dispatch

In the HTML5 specification, all posted messages arrive in a document as message events that your listener will receive. While it’s fairly simple to write code that dispatches messages to handler routines based on the origin and ‘method’ of incoming messages, this code is boilerplate and it gets less fun to write each time you have to debug it.

Dispatch is built into JSChannel, and multiple channels may happily co-exist simultaneously in the same page. It’s expected that allowing channels to be created next to the code to which they’re relevant will be useful in larger projects. Instantiating multiple channel objects looks as you might expect, and can occur anywhere in your code:

var chan1 = Channel.build(document.getElementById("firstChildId").contentWindow,
                          "http://somesite.com", "scope1");

var chan2 = Channel.build(document.getElementById("secondChildId").contentWindow,
                          "http://someothersite.com", "scope2");

Scoping

As sites get larger and more complex, it becomes possible that you’ll have method name collisions, which can mess up message routing. It’s common practice to prepend a ‘scope’ to method names to help mitigate this.

The third argument to Channel.build() is a scope which will be prepended to message methods. This would allow multiple channels to be instantiated with the same frame for different purposes. Method names can remain terse and natural and different scopes will prevent collisions.
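The wire dump earlier shows the scope joined to the method name with a double colon (testScope::reverse). A sketch of that prefixing, and of filtering out methods that belong to another scope, might look like the following. The function names are invented for illustration and need not match JSChannel’s internals.

```javascript
// Illustrative only: join a scope and a method name the way the wire dump
// shows ("testScope::reverse"), and split them apart again.
function scopeMethod(scope, method) {
  return scope ? scope + "::" + method : method;
}

// Returns the bare method name, or null if the message is for another scope.
function unscopeMethod(scope, wireMethod) {
  const prefix = scope ? scope + "::" : "";
  if (wireMethod.indexOf(prefix) !== 0) return null;
  return wireMethod.slice(prefix.length);
}

console.log(scopeMethod("testScope", "reverse"));               // testScope::reverse
console.log(unscopeMethod("testScope", "testScope::reverse"));  // reverse
console.log(unscopeMethod("otherScope", "testScope::reverse")); // null
```

With this in place two channels on the same frame can both bind a method called reverse without ever seeing each other’s messages.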

Error Handling

Given that postMessage leaves the questions of query/response semantics and message structure up to higher level code, there is really no place to hang error handling.

JSChannel introduces a protocol message to allow errors to be returned from invocations:

[0p1uq-R] got    message: {"id":483615,"error":"invalid_arguments","message":"argument should be a string"}

At the API level, there are some niceties, like automatic conversion of exceptions into error messages. For example, the following sample code would generate the error message above (when invoked without a string argument):

var chan = Channel.build(window.parent, "*", "testScope");
chan.bind("reverse", function(trans, s) {
    if (typeof s !== 'string') {
        throw { error: "invalid_arguments", message: "argument should be a string" };
    }
    return s.split("").reverse().join("");
});

Finally, the code that handles exceptions will automatically convert exceptions raised due to accidental programmatic errors into protocol messages with error type ‘runtime_error’. This hopefully will cause bugs to manifest earlier and more clearly, rather than to cause mysterious hangs.
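Here’s a sketch of what that exception-to-message conversion could look like. It is a simplified illustration and not JSChannel’s code: `invoke` is a hypothetical name, and the handler takes just the parameter rather than the (trans, s) pair used with bind.

```javascript
// Sketch of turning handler exceptions into protocol error messages. A thrown
// object carrying a string `error` is passed through; anything else is
// reported as "runtime_error".
function invoke(handler, id, params) {
  try {
    return { id: id, result: handler(params) };
  } catch (e) {
    if (e && typeof e.error === "string") {
      return { id: id, error: e.error, message: String(e.message || "") };
    }
    const text = e && e.message ? e.message : e;
    return { id: id, error: "runtime_error", message: String(text) };
  }
}

function reverse(s) {
  if (typeof s !== "string") {
    throw { error: "invalid_arguments", message: "argument should be a string" };
  }
  return s.split("").reverse().join("");
}

console.log(JSON.stringify(invoke(reverse, 483615, 42)));
// {"id":483615,"error":"invalid_arguments","message":"argument should be a string"}
```

Either way the caller always gets a message back, which is what turns a mysterious hang into a visible error.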

Setup Race Conditions

Finally, one of the first things that this developer hit when employing postMessage was the race condition that arises when sending messages into iframes. This issue occurs on both sides of the channel: If the parent does not wait for the child to load, any initial queries it sends may be lost without any indication. If a child tries to send messages before a parent has instantiated its channel, the same is true.

JSChannel addresses this problem with a simple application level handshake which causes the queuing of messages until the remote end is ready. This feature allows the developer to safely instantiate and immediately send messages over a channel without worrying about instantiation timing.
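The queuing half of that handshake is easy to picture. A minimal sketch, assuming some handshake message tells us when the remote end is up: `makeQueuedSender` and `post` are illustrative names, and the real handshake details in JSChannel may differ.

```javascript
// Sketch of queue-until-ready: messages sent before the remote end has
// announced itself are held, then flushed in order.
function makeQueuedSender(post) {
  let ready = false;
  const queue = [];

  return {
    send: function (msg) {
      if (ready) post(msg);
      else queue.push(msg);
    },
    // Call when the handshake message from the remote end arrives.
    onReady: function () {
      ready = true;
      while (queue.length > 0) post(queue.shift());
    }
  };
}
```

Callers just call send whenever they like; whether the message goes out immediately or after the handshake is invisible to them.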

Digging Deeper

Given that JSChannel is pretty new, this post is probably the best overview documentation available. You can kick the tires, file bugs, and read the implementation over on github.

Further Reading and Influences

JSChannel is largely built on the work of others. Here’s a laundry list of things that have influenced me, in no particular order:

  • pmrpc, an abstraction very much like JSChannel, with a slightly different API and slightly different goals.

  • XAuth, an open platform for extending authenticated user services across the web that uses postMessage for RPC.

  • Learning from XAuth: Cross-domain localStorage – An article by Nicholas Zakas which covers some of the techniques that JSChannel encapsulates and does a great job of highlighting some of the more exciting parts of how you can apply postMessage.

  • JSON-RPC, a simple and beautiful message format for implementing RPC semantics using JSON.

Cheating on Subversion Without Getting Caught

This post lightly explores the problem of “automatically” backing up a git repository to subversion. Why would anyone want to do this? Well, if your organization has a policy that all code must make it into subversion, but your team is interested in leveraging git in a deeper way than just by using git-svn as a sexy subversion client, then you’ll find yourself pondering the question of repository synchronization.

For our purposes, we’re guided by the following requirements:

  1. An at least periodic (< 24 hour) backup of code must exist in subversion — While it’s not strictly necessary to have a perfect commit history stored in subversion, all code should be present and buildable with minimal fuss.
  2. Code must exist in git with full history.
  3. Branching & tagging must remain usable and should be properly translated between svn and git.
  4. To the extent possible, non-linear git histories should be handled.
  5. It is not required that all features of git are usable (e.g., sub-modules), but there should be viable workarounds for each feature we sacrifice.
  6. It should be possible to have an external git/public mirror of source code.
  7. There should be a way to take changes committed to forks of the public mirror to be merged back into the system.

Given the requirements, there are at least two flows that make sense:

  • Subversion Primary is an arrangement where all commits are made into subversion (using the client of your choice, possibly git-svn) and “read only” git mirrors of the subversion repository support the other requirements.
  • The Dumb Subversion Backup flips the order around and says that there is a single git repo that is the source of truth, which is automatically backed up to subversion (perhaps in a lossy fashion).

The remainder of this document will discuss these two approaches.

The Subversion Primary

[Diagram: the Subversion Primary arrangement]

In this scenario, all commits funnel first through the subversion repository. There is an automated process that polls or pushes changes from subversion into an internal git staging repository. That staging repository uses git-svn to map the complete subversion commit history into git, and performs some processing to deal with mapping branches and tags among other things.

The internal staging repository can then mirror to an external repository which is visible to all for code review and contributions. External contributors can fork, change, and send pull requests in any form they like. Internal developers can then manage pulling commits into the main svn repository.

Key Features

The Subversion Primary is a straightforward solution that has the following characteristics:

  • Internal devs interact only with subversion and can use the client tool of their choice
  • Downtime that affects the internal git repo or the public git repo does not affect internal developers
  • All git repositories are ephemeral, and their loss or corruption is Not A Disaster.
  • External developers are insulated from the fact that SVN is in use, except for the git-svn artifacts attached to commit messages

Challenges

There are a couple obstacles to surmount in this arrangement, and we’ll briefly explore them here.

Username Mapping

One problem that exists with git-svn is that when commits are pulled from subversion the ‘author’ field is meaningless. Here’s an example:

commit f6c77de3df60340b01abb219cbe7215a93dcdc9c
Author: lloydh <lloydh@3fcf768e-b076-0410-b9b8-e8c34e2d470b>
Date: Thu Jul 29 19:36:48 2010 +0000

update changelog with the skinny on 2.9.7

git-svn-id: svn+ssh://svncorp/yahoo/BrowserPlus/public_platform/trunk@793 3fcf768e-b076-0410-b9b8-e8c34e2d470b

In order to map internal svn usernames (e.g. lloydh) to meaningful author entries (e.g. “Lloyd Hilaiel <lloyd@somewhe.re>”) we can use mechanisms built into git-svn. Simply create an authors.txt file that looks something like:

lloydh = Lloyd Hilaiel <lloyd@somewhe.re>
dgrigsby = David Grigsby <dg@wordmaven.org>
... etc

Having created this file, use the -A argument on all git svn operations:

-A<filename>, --authors-file=<filename>

Syntax is compatible with the file used by git cvsimport:

loginname = Joe User <user@example.com>

If this option is specified and git svn encounters an SVN
committer name that does not exist in the authors-file, git svn
will abort operation. The user will then have to add the
appropriate entry. Re-running the previous git svn command after
the authors-file is modified should continue operation.
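Since git svn aborts on the first unknown committer, it can save a few round trips to check the authors file up front. Here’s a small sketch that parses the format quoted above and lists the committers with no entry. The function names and the committer “newdev” are made up, and how you gather the list of committers is left to you.

```javascript
// Parse the "loginname = Full Name <email>" format into a lookup table.
// Lines that do not match the format (blank lines, notes) are skipped.
function parseAuthors(text) {
  const map = {};
  text.split("\n").forEach(function (line) {
    const m = /^\s*([^=\s]+)\s*=\s*(.+?)\s*$/.exec(line);
    if (m) map[m[1]] = m[2];
  });
  return map;
}

// Report which SVN committers have no entry yet.
function missingAuthors(authorsMap, committers) {
  return committers.filter(function (name) {
    return !Object.prototype.hasOwnProperty.call(authorsMap, name);
  });
}

const authors = parseAuthors(
  "lloydh = Lloyd Hilaiel <lloyd@somewhe.re>\n" +
  "dgrigsby = David Grigsby <dg@wordmaven.org>\n"
);

// "newdev" is a hypothetical committer with no entry in the file.
console.log(missingAuthors(authors, ["lloydh", "dgrigsby", "newdev"]).join(", ")); // newdev
```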

Mapping SVN branches and tags to git

Another problem that exists is how SVN branches and tags appear in git. Subsequent to a git svn clone you can view the remote branches with git branch -r. You’ll see something like this:

...
sdk_and_installer
service_api_v5
tags/2.10.0
tags/2.10.1
tags/2.10.2
tags/2.5.0
...

Because subversion has no proper notion of tags, you’ll notice that the tags set in subversion are branches in git. If you want your published git repository to look reasonable to an average git user, it’d be nice to turn these into proper git tags (and to give the remaining remote branches local counterparts). To solve this problem I wrote a small shell script designed to be run after fetching changes from SVN.

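#!/usr/bin/env bash
# Run after fetching from SVN: turn git-svn's remote "tags/*" branches into
# real git tags, and give every other remote branch (except trunk) a local
# tracking branch.

for r in $(git branch -r | grep -v trunk); do
  if [[ "$r" == tags/* ]]; then
    # strip the leading "tags/" to get the tag name
    tn="${r#tags/}"
    git tag -f "$tn" "refs/remotes/tags/$tn"
  else
    git branch --track -f "$r" "refs/remotes/$r"
  fi
done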

Accepting contributions

Given the flexibility of git, there are many tactics for merging changes forked off of the public mirror back into subversion’s linear view of the world: the least sophisticated approach would be to manually do the merge using diff and her friend patch and perhaps add a little --fuzz. A much nicer way would be to cherry-pick or rebase changes directly into a git repository that is svn cloned from the original, and then to dcommit back.

This latter approach seems like it would be much faster and less error prone, however there’s at least one gotcha: in order to simplify the merging of changes, one should ensure that commits are identical between the public repository and the version that’s cloned from SVN (the author field and the git-svn-id line embedded in each commit message both feed into the commit’s hash). This means when an internal developer clones a git repository from SVN, they should use the same authors.txt file as is used by the internal git repo (which pulls from subversion and publishes to external). Additionally, the internal developer’s clone should refer to the subversion repository in the same way as the internal staging git repo does; specifically, the hostname must be the same in both cases (cases where they might diverge include using ssh tunnels and aliases, or using DNS shorthands or IP addresses).

Once commits are bitwise identical in the developer’s svn clone and the public mirror, it’s possible to use the advanced merging and rebasing features of git in a straightforward way to absorb external changes and then dcommit them to SVN. Even if some circumstances arise and it’s not possible to get bitwise identical commits, it’s still possible to fetch an entire remote branch and cherry-pick individual commits, an approach which is perhaps trivially better than manual diffing and patching.

NOTE: In order to artificially join histories without bitwise identical commits (messages or authors), one might try the “ours” merge strategy. I’ve no personal experience with this approach.

I think that whatever strategy you end up employing, the key is that the person or people on the team that are most in love with git should be the ones integrating external contributions.

The Dumb Subversion Backup

[Diagram: the Dumb Subversion Backup arrangement]

In this arrangement, rather than all commits flowing through subversion, it’s simply a mechanism for the backup of the git repository. The key idea here is that there is a single git repository that is backed up to subversion. While other requirements then require the ability to publish the repo externally and pull contributions internally, these tasks can be accomplished in a thousand different ways leveraging the built-in features of git. Restated, the only element of this arrangement worth discussing is the challenge of actually synchronizing changes from git to subversion.

Key Features

  • All developers interact with git, while subversion is a hidden implementation detail
  • The internal team is on the hook to set up and maintain a more complex system than with the Subversion Primary arrangement
  • Inevitably, there will be information stored in the many git clones that cannot be represented in subversion, so restoration from subversion backup would imply loss of some information.
  • External developers see no artifacts which suggest that subversion is used as a backing store

Challenges

While this arrangement undeniably requires more systems administration, most of it is straightforward. The setup of the main internal git repo will probably require the installation of some scripts to allow multiple users to push changes over ssh to the same repository, like gitosis or gitolite.

The real challenge in this setup, however, is given a git repository, how do you back it up into subversion? The range of solutions includes:

  1. periodically take snapshots of the main branch of development (think, literally, a git archive), and overlay them on the current subversion view of the world. commit. prosper.
  2. the same as #1, but do so for all branches present in the git repository (automatically bring branches into existence in SVN as they’re created in git)
  3. #1 or #2 but on a per commit basis, with log message (and author?) preservation
  4. invent a system that will actually use git svn dcommit to perform the commits, squashing non-linear histories into a linear history as it goes.

So, I’ve seen #1 in place on a successful, vibrant, large scale project. Depending on the intent of the policy that requires a backup to SVN (and how much you really care), this may or may not fit the bill.

Options 3 and 4 are what are really interesting. These options would seem to be possible with a little bit of well-thought-out scripting on top of the existing facilities of git-svn, and hold the potential to capture a significant subset of the information that’s represented in the git original — preserving commit messages, authorship, and performing admirably when smashing non-linear histories into something subversion can understand.

The path to implement option 3 would be similar to fairly well documented tactics for moving a single git branch with a (mostly) linear history into subversion, but would have to take into account multiple branches. Specifically, some areas of difficulty would seem to be:

  1. keeping cruft out of the public git repository – when dcommitting using git-svn, the branch that you dcommit from is changed as meta-data is embedded into the commits. In order to use git-svn you would need to design a mechanism that could merge changes from the branch where a commit lands, into a branch which has been associated with a subversion branch, and dcommit from there.
  2. incremental commits — An extension of #1: as commits are made into the internal git repository, you would then need to move these commits into the branch from which you can dcommit. Doing so will require that you establish a merge base (common ancestor) between the pristine actual branch in git, and the git branch that’s associated with subversion.
  3. non-linearity and git merges — Whenever you have a merge commit in git (a commit with more than one parent), you create an instance of non-linearity where git-svn will preserve changes but lose commit history. There may be cases where git-svn will simply break (!?)
  4. sub-modules — git sub-modules (which point to remote git repositories) cannot be mapped onto svn:externals (which point to remote subversion repositories), so there would need to be a mechanism that could either fetch and commit the remote sub-module, or
  5. changing history — in git it’s very possible to change the history of a remote repository. Subversion would get really pissed and the whole mechanism would probably go to hell.

Hairy, no?

Non-linearity and git-svn

Above it was mentioned that git-svn will attempt to collapse non-linear histories in git into something linear in subversion. A concrete example of this in action follows: This is your commit history in git:

* fef76c5 Merge branch 'newb'
|\
| * d49b6d4 a third change on the newb branch
| * 7a1ecfa a second commit on the newb branch
| * 9197078 commit #1 on the newb branch
* | 9332815 (testing) master branch commit
|/
* d1b9181 Merge branch 'test_branch'
|\
| * 2f74914 a test commit
* | 792cc2c a test commit on the main branch
|/
* a01c1ff fix svn:ignore props

This is your git commit history once dcommitted via git-svn:

* 4090f22 Merge branch 'newb'
* c3a85f5 (testing) master branch commit
* 4ad5b30 Merge branch 'test_branch'
* 792cc2c a test commit on the main branch
* a01c1ff fix svn:ignore props

When looking at the commits, changes 9197078, 7a1ecfa, and d49b6d4 in the former non-linear history are all combined into 4090f22 once they’ve been routed through git-svn.
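One way to picture the result is as a walk along first parents only. The sketch below models the example history as a parent map and follows that walk. The parent order is read off the graph above and is an assumption, and this models the outcome shown here rather than git-svn’s actual algorithm.

```javascript
// The example history as a map from commit to its parents (first parent
// listed first). The history before a01c1ff is not shown, so it is treated
// as the root here.
const parents = {
  "fef76c5": ["9332815", "d49b6d4"],
  "d49b6d4": ["7a1ecfa"],
  "7a1ecfa": ["9197078"],
  "9197078": ["d1b9181"],
  "9332815": ["d1b9181"],
  "d1b9181": ["792cc2c", "2f74914"],
  "2f74914": ["a01c1ff"],
  "792cc2c": ["a01c1ff"],
  "a01c1ff": []
};

// Follow only first parents from `head`. Commits reachable solely through a
// second parent do not appear; their changes are folded into the merge.
function firstParentChain(graph, head) {
  const chain = [];
  let cur = head;
  while (cur !== undefined) {
    chain.push(cur);
    cur = graph[cur][0];
  }
  return chain;
}

console.log(firstParentChain(parents, "fef76c5").join(" -> "));
// fef76c5 -> 9332815 -> d1b9181 -> 792cc2c -> a01c1ff
```

The walk visits five commits, matching the five entries in the post-dcommit log (the rewritten ones carry new hashes), and the commits it skips are exactly the ones that were made on the side branches.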

Now What?

I’ve implemented a Subversion Primary system myself, with great success, and I’ve seen the “periodic snapshot” approach used effectively to implement a Dumb Subversion Backup. While it seems like one could go further to back up more information from git incrementally into subversion, I fear the resultant system might be too brittle and high maintenance to make any sense. What do you think? Have you ever seen an automatic git –> subversion backup that didn’t suck?