(or How to do Binary Ajax)
I've been fiddling around trying to download a binary stream using JavaScript's XMLHttpRequest (XHR) object in Mozilla Firefox.
What I wanted was to create a byte array containing the original bytes of the downloaded binary file. This mechanism would be used to implement a ROM loader without modifying the original binary file (e.g. re-encoding it as some form of text on the server), because I wanted to reuse legacy binary files already stored on third-party servers that I couldn't modify.
After looking at developer.mozilla.org, many online docs, IANA's docs, hundreds of technical blogs, and googling around, I wasn't able to find any example (or even an implied acknowledgement) that the XMLHttpRequest object could be used to fetch a binary file into a byte array. Instead, the general consensus was that any attempt at doing so is destined to produce a garbled binary stream, and that the 'problematic' binary download should never, ever be tried with XMLHttpRequest.
I decided to find a way to use this object to download any generic binary file. The problem I faced was that the usual JavaScript code for downloading a text file in Firefox decodes the response with a Unicode-derived encoding by default, which would (1) eat up bytes in the binary stream, either because several bytes were merged into a single Unicode-like character or because bytes undefined in the character encoding were simply ignored; and (2) unpredictably map some bytes to others (the most affected range seems to be 128-160 in the Unicode charsets, but 128-255 in the default charset). Fetching a binary file with the code below produces a mess.
//fetches text/plain file synchronously. Do NOT use for binary files!
load_url = function(url) {
  netscape.security.PrivilegeManager.enablePrivilege("UniversalBrowserRead");
  var req = new XMLHttpRequest();
  req.open('GET', url, false);
  req.send(null);
  if (req.status != 200) return '';
  return req.responseText;
};
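To make the failure mode described above concrete, here is a minimal sketch that decodes a few bytes as UTF-8 text and then tries to read them back. It is only an illustration: `TextDecoder` stands in for whatever decoding the browser applies to `responseText`, the sample bytes are hypothetical, and Firefox's behavior at the time may have differed in detail.

```javascript
// Sketch only: TextDecoder stands in for the browser's text decoding of
// responseText. The sample bytes are hypothetical.
var original = new Uint8Array([0x4d, 0x5a, 0x80, 0xc3, 0xa9, 0xff]);

// Decode the bytes as if they were UTF-8 text.
var text = new TextDecoder('utf-8').decode(original);

// Try to read the bytes back the naive way, one character at a time.
var recovered = [];
for (var i = 0; i < text.length; i++) {
  recovered.push(text.charCodeAt(i) & 0xff);
}

// 6 bytes went in, but only 5 characters came out: 0xc3 0xa9 were merged
// into one character, and 0x80 and 0xff were replaced by U+FFFD.
console.log(original.length, text.length);
console.log(recovered);
```

Both effects are visible here: the string is shorter than the byte stream, and the values read back no longer match the originals.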
After looking through the XMLHttpRequest documentation, I realized I could use overrideMimeType to override the MIME type and charset that the browser assumes for the response. In principle, the only thing needed would be to force JavaScript to interpret the received bytes using a fixed-width, one-byte-per-character encoding such as US-ASCII, as in the code below. UTF-8 or any other variable-length encoding, such as the default one, should be avoided for the reasons above.
//should fetch binary files synchronously. But it DOESN'T work.
load_url = function(url) {
  netscape.security.PrivilegeManager.enablePrivilege("UniversalBrowserRead");
  var req = new XMLHttpRequest();
  req.open('GET', url, false);
  req.overrideMimeType('text/plain; charset=us-ascii');
  req.send(null);
  if (req.status != 200) return '';
  return req.responseText;
};
However, I discovered that even though the resulting string length was now correct, the underlying US-ASCII charset conversion still mapped bytes in the range 128-160 to garbage.
I tried many (I really mean many) other 8-bit charsets without success. It seems that all the available charsets reserve part of their range for undefined and control characters, in a fierce, purposeful attempt to defeat my initial objective.
Then, after grepping around, I found the file firefox/res/charsetalias.properties, which lists all the charsets accepted by Firefox. After trying some extra charsets whose aliases were not available in any other source, I FINALLY found a simple charset that makes JavaScript apply an identity mapping between the received stream bytes and the characters returned in the responseText string. The code below is the correct one for downloading a byte stream in JavaScript:
//fetches BINARY FILES synchronously using XMLHttpRequest
load_url = function(url) {
  netscape.security.PrivilegeManager.enablePrivilege("UniversalBrowserRead");
  var req = new XMLHttpRequest();
  req.open('GET', url, false);
  //XHR binary charset opt by Marcus Granado 2006 [http://mgran.blogspot.com]
  req.overrideMimeType('text/plain; charset=x-user-defined');
  req.send(null);
  if (req.status != 200) return '';
  return req.responseText;
};
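After reviewing some Unicode documents, it seems that the explanation is that the charset x-user-defined leaves bytes 0x00-0x7F as their ASCII values and maps bytes 0x80-0xFF into the Unicode Private Use Area (U+F780-U+F7FF), so the low 8 bits of every character code are always the original byte.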
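The mapping can be checked without a browser. The sketch below simulates it by hand, assuming x-user-defined keeps bytes 0x00-0x7F unchanged and places bytes 0x80-0xFF at U+F780-U+F7FF; `decodeUserDefined` is a hypothetical helper written for this illustration, not a browser API.

```javascript
// Hypothetical helper that simulates x-user-defined decoding. Assumption:
// bytes 0x00-0x7F are kept as-is, bytes 0x80-0xFF map to U+F780-U+F7FF.
function decodeUserDefined(bytes) {
  var out = '';
  for (var i = 0; i < bytes.length; i++) {
    var b = bytes[i];
    out += String.fromCharCode(b < 0x80 ? b : 0xf780 + (b - 0x80));
  }
  return out;
}

// Every possible byte value, 0 through 255.
var allBytes = [];
for (var v = 0; v < 256; v++) {
  allBytes.push(v);
}

var decoded = decodeUserDefined(allBytes);

// Masking each character code with 0xff should undo the mapping.
var mismatches = 0;
for (var k = 0; k < decoded.length; k++) {
  if ((decoded.charCodeAt(k) & 0xff) !== allBytes[k]) {
    mismatches++;
  }
}

console.log(decoded.length, mismatches);
```

Under that assumption, each byte becomes exactly one character and the 0xff mask recovers all 256 values, which is why the string length matches the file length.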
Now you just access the bytes in the received binary stream as you would access characters in the returned string:
var filestream = load_url(url);
var abyte = filestream.charCodeAt(x) & 0xff;
where x is the offset (i.e. position) of the byte in the returned binary file stream. The valid range for x is from 0 up to filestream.length-1.
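Since the original goal was a byte array, the per-character access above can be wrapped in a small loop. This is a minimal sketch: `toByteArray` is a hypothetical helper name, and the sample string is a hand-built stand-in for what load_url would return, so that the snippet runs without a network request.

```javascript
// Hypothetical helper: converts the string returned by load_url into an
// array of byte values (0-255), one entry per character.
function toByteArray(filestream) {
  var bytes = new Array(filestream.length);
  for (var x = 0; x < filestream.length; x++) {
    bytes[x] = filestream.charCodeAt(x) & 0xff;
  }
  return bytes;
}

// Hand-built stand-in for a downloaded stream: three ASCII characters
// followed by two characters from the private range.
var sample = String.fromCharCode(0x4e, 0x45, 0x53, 0xf79a, 0xf7ff);

console.log(toByteArray(sample)); // [ 78, 69, 83, 154, 255 ]
```

In real use the call would look like `toByteArray(load_url(url))`.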
If you use the code above, please leave the comment about the author (me) for recognition of the many hours of labor to find this out. Please bear in mind that even though it worked for me and I'm making this code available in the hope it may help you too, I do not take any responsibility for any problems this code may cause to you or your project!
Labels: ajax, binary, charset, firefox, javascript, mozilla, xhr, xmlhttprequest