Custom JSON formats

Ecmascript JSON shortcomings

The standard JSON format has a number of shortcomings when used with Ecmascript:

These limitations are part of the Ecmascript specification which explicitly prohibits more lenient behavior. Duktape provides two more programmer friendly custom JSON format variants: JSONX and JSONC, described below.

Custom JSONX format

JSONX encodes all values in a very readable manner and parses back almost all values in a faithful manner (function values being the most important exception). Output is pure printable ASCII, codepoints above U+FFFF are encoded with a custom escape format, and quotes around object keys are omitted in most cases. JSONX is not JSON compatible but a very readable format, most suitable for debugging, logging, etc.

JSONX is used as follows:

var obj = { foo: 0/0, bar: [ 1, undefined, 3 ] };
print(Duktape.jxEnc(obj));
// prints out: {foo:NaN,bar:[1,undefined,3]}

var dec = Duktape.jxDec('{ foo: 123, bar: undefined, quux: NaN }');
print(dec.foo, dec.bar, dec.quux);
// prints out: 123 undefined NaN

Custom JSONC format

JSONC encodes all values into standard JSON. Values not supported by standard JSON are encoded as objects with a marker key beginning with an underscore (e.g. {"_ptr":"0xdeadbeef"}). Such values parse back as ordinary objects. However, you can revive them manually more or less reliably. Output is pure printable ASCII; codepoints above U+FFFF are encoded as plain string data with the format "U+nnnnnnnn" (e.g. U+0010fedc).

JSONC is used as follows:

var obj = { foo: 0/0, bar: [ 1, undefined, 3 ] };
print(Duktape.jcEnc(obj));
// prints out: {"foo":{"_nan":true},"bar":[1,{"_undef":true},3]}

var dec = Duktape.jcDec('{ "foo": 123, "bar": {"_undef":true}, "quux": {"_nan":true} }');
print(dec.foo, dec.bar, dec.quux);
// prints out: 123 [object Object] [object Object]

The JSONC decoder is essentially the same as the standard JSON decoder at the moment: all JSONC outputs are valid JSON and no custom syntax is needed. As shown in the example, custom values (like {"_undef":true}) are not revived automatically. They parse back as ordinary objects instead.

Codepoints above U+FFFF and invalid UTF-8 data

All standard Ecmascript strings are valid CESU-8 data internally, so behavior for codepoints above U+FFFF never poses compliance issues. However, Duktape strings may contain extended UTF-8 codepoints and may even contain invalid UTF-8 data.

The Duktape JSON implementation, including the standard Ecmascript JSON API, use replacement characters to deal with invalid UTF-8 data. The resulting string may look a bit odd, but this behavior is preferable to throwing an error.

JSON format examples

The table below summarizes how different values encode in each encoding:

Value Standard JSON JSONX JSONC Notes
undefined n/a undefined {"_undef":true} Standard JSON: encoded as null inside arrays, otherwise omitted
null null null null standard JSON
true true true true standard JSON
false false false false standard JSON
123.4 123.4 123.4 123.4 standard JSON
NaN null NaN {"_nan":true} Standard JSON: always encoded as null
Infinity null Infinity {"_inf":true} Standard JSON: always encoded as null
-Infinity null -Infinity {"_ninf":true} Standard JSON: always encoded as null
köhä "köhä" "k\xf6h\xe4" "k\u00f6h\u00e4"
U+00FC "\u00fc" "\xfc" "\u00fc"
U+ABCD "\uabcd" "\uabcd" "\uabcd"
U+1234ABCD "U+1234abcd" "\U1234abcd" "U+1234abcd" Non-BMP characters are not standard Ecmascript, JSONX format borrowed from Python
object {"my_key":123} {my_key:123} {"my_key":123} ASCII keys matching identifer requirements encoded without quotes in JSONX
array ["foo","bar"] ["foo","bar"] ["foo","bar"]
buffer n/a |deadbeef| {"_buf":"deadbeef"}
pointer n/a (0xdeadbeef)
(DEADBEEF)
{"_ptr":"0xdeadbeef"}
{"_ptr":"DEADBEEF"}
Representation inside parentheses or quotes is platform specific
NULL pointer n/a (null) {"_ptr":"null"}
function n/a {_func:true} {"_func":true} Standard JSON: encoded as null inside arrays, otherwise omitted

Limitations

Some limitations include:

(See internal documentation for more future work issues.)