
Representing raw byte strings in JSON
2009-01-21

I'm not sure if I'm the only one to notice this, but there appears to be no way to represent an arbitrary string of bytes in JSON.

JSON is a serialization format that (for some reason) encodes Unicode strings, numbers, lists and records as a Unicode string. That is fine, as long as the only data types your application deals with are among those above.

What about:

char* (C)
ByteString (Haskell)

Neither of these has a mapping to JSON. Strangely, many of the programs written over the last, oh, 20 years deal in 'lists of bytes'. Consider JPEG image data, or a million other examples.
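To make the gap concrete, here is a minimal sketch using Python's standard json module (my choice of language, not something the format mandates). The helper name try_dump and the sample bytes are made up for illustration; the point is only that a text string goes through while a byte string is refused.

```python
import json

# Hypothetical sample: the first four bytes of a typical JPEG/JFIF file.
jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0])


def try_dump(value):
    """Return the JSON text for value, or the error message if it has no JSON mapping."""
    try:
        return json.dumps(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_dump("text"))       # a Unicode string serializes without complaint
print(try_dump(jpeg_header))  # a byte string is rejected with a TypeError
```

Other libraries may behave differently (some silently pick one of the workarounds), which is rather the problem.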

There are workarounds: you could convert a list of bytes to a list of integers and encode that, or use base64 or use uuencode. You could even pick 256 Unicode code points and map the bytes to those. U+0000 to U+00FF would seem to be one option, equally U+0001 to U+0100.
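Three of those workarounds can be sketched in a few lines of Python, assuming the standard json and base64 modules. All the function names below are invented for this example, and the third one picks the U+0000 to U+00FF mapping (which happens to be what Python calls "latin-1"). Both sides of the wire have to agree on which scheme is in use, because nothing in the JSON itself says so.

```python
import base64
import json


def via_int_list(raw):
    """Encode bytes as a JSON array of integers 0..255."""
    return json.dumps(list(raw))


def from_int_list(text):
    return bytes(json.loads(text))


def via_base64(raw):
    """Encode bytes as a JSON string holding base64 text."""
    return json.dumps(base64.b64encode(raw).decode("ascii"))


def from_base64(text):
    return base64.b64decode(json.loads(text))


def via_code_points(raw):
    """Encode bytes as a JSON string, mapping byte n to code point U+00nn."""
    return json.dumps(raw.decode("latin-1"))


def from_code_points(text):
    return json.loads(text).encode("latin-1")


raw = bytes([0x00, 0xFF, 0x10, 0x80])  # not valid UTF-8, so not a 'string'
for encode, decode in [
    (via_int_list, from_int_list),
    (via_base64, from_base64),
    (via_code_points, from_code_points),
]:
    text = encode(raw)
    assert decode(text) == raw  # every scheme round-trips
    print(encode.__name__, text)
```

All three round-trip, and all three produce JSON that a receiver cannot tell apart from an ordinary list of numbers or an ordinary string.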

I'm not saying that it isn't possible, just that a 'standard' that doesn't define how to encode the most common, universal data type ever is a wee bit broken.

In general, any standard that includes pseudo-formalisms on its homepage is broken.