<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Walt-O-Matic &#187; XML</title>
	<atom:link href="http://www.wwco.com/~wls/blog/category/programming/xml/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.wwco.com/~wls/blog</link>
	<description>Pure Walt, from Concentrated Thought</description>
	<lastBuildDate>Wed, 01 Sep 2010 01:04:33 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Using &lt;/SCRIPT&gt; In A JavaScript Literal</title>
		<link>http://www.wwco.com/~wls/blog/2007/04/25/using-script-in-a-javascript-literal/</link>
		<comments>http://www.wwco.com/~wls/blog/2007/04/25/using-script-in-a-javascript-literal/#comments</comments>
		<pubDate>Wed, 25 Apr 2007 19:39:55 +0000</pubDate>
		<dc:creator>Walt Stoneburner</dc:creator>
				<category><![CDATA[Bug Report]]></category>
		<category><![CDATA[Disclosure]]></category>
		<category><![CDATA[How To]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Web/Ajax]]></category>
		<category><![CDATA[Workaround]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://www.wwco.com/~wls/blog/2007/04/25/using-script-in-a-javascript-literal/</guid>
		<description><![CDATA[Today I got bit by a very interesting bug involving the </SCRIPT> tag.  If you're writing code that generates code, you want to know about this.]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m currently working on an application that takes content from various web resources, munges the content, stores it in a database, and on demand generates interactive web pages, which includes the ability to annotate content in a web editor.  Things were humming along great for weeks until we got a stream of data which made the browser burp with a JavaScript syntax error.</p>
<p>Problem was, when I examined the automatically generated JavaScript, it looked perfectly good to my eyes.</p>
<p>So, I reduced the problem down to a very trivial case.</p>
<p>What would you suppose the following code block does in a browser? </p>
<p><DIV STYLE="border: thin solid black; margin-left: 1em; padding: 0.5em; background-color: Aquamarine; font-family: Verdana, Courier; font-weight: bold;">&lt;HTML&gt;<br />
&lt;BODY&gt;<br />
&nbsp;&nbsp;start<br />
&nbsp;&nbsp;&lt;SCRIPT&gt;<br />
&nbsp;&nbsp;&nbsp;&nbsp;<SPAN STYLE="color:red;">alert( &#x22;&lt;/SCRIPT&gt;&#x22; );</SPAN><br />
&nbsp;&nbsp;&lt;/SCRIPT&gt;<br />
&nbsp;&nbsp;finish<br />
&lt;/BODY&gt;<br />
&lt;/HTML&gt;</DIV><br />
<CENTER><A HREF="http://www.wwco.com/~wls/MOZILLABUG/closescript.html">Try it and see.</A></CENTER></p>
<p>To my eyes, this should produce an alert box with the simple text <B>&lt;/SCRIPT&gt;</B> inside it.  Nothing special.</p>
<p>However, in every browser I tried (IE 7, Firefox, Opera, and Safari) on every platform (XP/Vista/OS X) it didn&#8217;t.  The close tag <EM>inside</EM> the quoted literal terminated the scripting block, and the leftover closing punctuation got printed on the page.</p>
<p>Change &lt;/SCRIPT&gt; to just &lt;SCRIPT&gt;, and you get the alert box as expected.</p>
<p>So, I did more reading and more testing.  I looked at the hex dump of the file to see if perhaps there was something strange going on.  Nope, plain ASCII.</p>
<p>I looked at the JavaScript documentation online, and the only things they suggest escaping are the single and double quotes, as well as the backslash, which does the escaping.  (Note we&#8217;re using forward slashes, which require no escapes in a JavaScript string.)</p>
<p>I even got the 5th Edition of <a href="http://www.amazon.com/gp/redirect.html?ie=UTF8&#038;location=http%3A%2F%2Fwww.amazon.com%2FJavaScript-Definitive-Guide-David-Flanagan%2Fdp%2F0596101996%3Fie%3DUTF8%26s%3Dbooks%26qid%3D1177525833%26sr%3D8-1&#038;tag=slingcode-20&#038;linkCode=ur2&#038;camp=1789&#038;creative=9325">JavaScript:  The Definitive Guide</a><img src="http://www.assoc-amazon.com/e/ir?t=slingcode-20&amp;l=ur2&amp;o=1" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> from O&#8217;Reilly, and on page 27, which lists the comprehensive escape sequences, there is nothing magical about the forward slash, nor this magic string.</p>
<p>In fact, if you start playing with other strings, you get these results:<br />
&nbsp;&nbsp;<B>&lt;SCRIPT&gt;</B> &#8230;works<br />
&nbsp;&nbsp;<B>&lt;A/B&gt;</B> &#8230;works<br />
&nbsp;&nbsp;<B>&lt;/STRONG&gt;</B> &#8230;works<br />
&nbsp;&nbsp;<B>&lt;\/SCRIPT&gt;</B> &#8230;displays &lt;/SCRIPT&gt;, and while I suppose you can escape a forward slash, there should be no need to. Ever.  See prior example.<br />
&nbsp;&nbsp;<B>&lt;/SCRIPT&gt;</B> &#8230;breaks<br />
&nbsp;&nbsp;<B>&lt;/SCRIPTX&gt;</B> &#8230;works (note the extra character, an X)</p>
<p>With JavaScript, what&#8217;s in quotes is supposed to be flat, literal, uninterpreted, meaningless text.</p>
<p>It was after this I turned to ask for help from several security and web experts.</p>
<p><H2>Security Concerns</H2><br />
Why security experts?</p>
<p>The primary concern is obviously cross site scripting.  We&#8217;re taking untrusted sites and displaying portions of the data stream.  Should an attacker be able to insert &lt;/SCRIPT&gt; into the stream, a few comment characters, and shortly reopen a new &lt;SCRIPT&gt; block, he&#8217;d be able to mess with cookies, twiddle the DOM, dink with AJAX, and do things that compromise the trust of the server.</p>
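<p>To make that concrete, here is a small Java sketch of the kind of naive generator that gets you into this trouble.  The class name, the variable name <B>note</B>, and the payload are all made up for illustration; this is not our application&#8217;s code.</p>

```java
final class NaiveScriptEmitter {
    private NaiveScriptEmitter() {}

    /**
     * Hypothetical and deliberately unsafe: pastes the untrusted text
     * straight into a JavaScript string literal with no escaping at all.
     */
    static String emit(String untrusted) {
        return "<SCRIPT>var note = \"" + untrusted + "\";</SCRIPT>";
    }
}
```

<p>Calling <B>NaiveScriptEmitter.emit(&#x22;&lt;/SCRIPT&gt;&lt;SCRIPT&gt;alert(document.cookie);//&#x22;)</B> yields markup the browser reads as two script blocks.  The first is just a broken fragment that dies with a syntax error; the second is the attacker&#8217;s code, with the trailing comment characters swallowing the leftover quote and semicolon.</p>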
<p><H2>The Explanation</H2><br />
The explanation came from <A HREF="http://www.wherry.com/">Phil Wherry</A>.</p>
<p>As he puts it, the &lt;SCRIPT&gt; tag is content-agnostic.  Which means the <EM>HTML Parser</EM> doesn&#8217;t know we&#8217;re in the middle of a JavaScript string.</p>
<p>What the HTML parser saw was this:<br />
<DIV STYLE="border: thin solid black; margin-left: 1em; padding: 0.5em; background-color: Gold; font-family: Verdana, Courier; font-weight: bold; color: darkgrey">&lt;HTML&gt;<br />
&lt;BODY&gt;<br />
&nbsp;&nbsp;start<br />
&nbsp;&nbsp;<SPAN STYLE="color:black;"><STRONG>&lt;SCRIPT&gt;<SPAN STYLE="background-color:yellow;">alert( &#x22;</SPAN>&lt;/SCRIPT&gt;</STRONG></SPAN><br />
&nbsp;&nbsp;&#x22; );<br />
&nbsp;&nbsp;<SPAN STYLE="color:IndianRed;"><EM>&lt;/SCRIPT&gt;</EM></SPAN><br />
&nbsp;&nbsp;finish<br />
&lt;/BODY&gt;<br />
&lt;/HTML&gt;</DIV></p>
<p>And there you have it: not only is the syntax error obvious now, but the HTML is malformed.</p>
<p>The processing of JavaScript doesn&#8217;t happen until <EM>after</EM> the browser has understood which parts are JavaScript.  Until it sees that close &lt;/SCRIPT&gt; tag, it doesn&#8217;t care what&#8217;s inside &#8211; quoted or not.</p>
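<p>Here is a minimal Java sketch of how a parser might hunt for the end of a script block.  The class and method names are made up for illustration, and it simplifies what real browsers do (for one, it ignores the special handling of &lt;!-- inside scripts).  It looks, ignoring case, for &lt;/script followed by whitespace, a slash, or a closing angle bracket, and pays no attention whatsoever to JavaScript quoting.</p>

```java
final class ScriptEndFinder {
    private ScriptEndFinder() {}

    /**
     * Returns the index at which a simplified HTML parser would end the
     * script block, or -1 if nothing in the body looks like a close tag.
     */
    static int findScriptEnd(String body) {
        final String endTag = "</script";
        // Stop early enough that there is always one character after the tag name.
        for (int i = 0; i + endTag.length() < body.length(); i++) {
            // Case-insensitive match of the tag name at position i.
            if (body.regionMatches(true, i, endTag, 0, endTag.length())) {
                char next = body.charAt(i + endTag.length());
                // The tag name must be followed by '>', '/', or whitespace.
                if (next == '>' || next == '/' || next == ' '
                        || next == '\t' || next == '\n' || next == '\f' || next == '\r') {
                    return i;
                }
            }
        }
        return -1;
    }
}
```

<p>Fed the body of the trivial case, <B>alert( &#x22;&lt;/SCRIPT&gt;&#x22; );</B>, it reports a match at the close tag sitting inside the quotes.  Fed the &lt;/SCRIPTX&gt; or &lt;\/SCRIPT&gt; variants it reports none, which lines up with the results listed earlier.</p>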
<p>Turns out, we all have seen this problem in traditional programming languages before.  Ever run across hard-to-read code where the <em>indentation</em> conveys a block that doesn&#8217;t logically exist?  Same thing.  In this case instead of curly braces or begin/end pairs, it was the start and end tags of the JavaScript.</p>
<p><H2>Upstream Processing</H2><br />
Remember, this wasn&#8217;t hand-rolled JavaScript.  It was produced by an upstream piece of code that generated the actual JavaScript block, which is much more complex than the example shown.</p>
<p>It is getting an untrusted string, which, to be shoved inside a JavaScript string, not only has to be sanitized but also escaped in such a way that the HTML parser cannot accidentally treat the string&#8217;s contents as a legal (or illegal!) tag.</p>
<p>To do this we need to build a helper function to scrub data that will directly be emitted as a raw JavaScript string.<OL><br />
<LI>Escape all backslashes, replacing \ with \\, since backslash is the JavaScript escape character.  This has to be done first so as not to escape the other escapes we&#8217;re about to add.<br />
<LI>Escape all quotes, replacing &#39; with \&#39;, and &#x22; with \&#x22;, which stops the string from getting terminated.<br />
<LI>Escape all angle brackets, replacing &lt; with \&lt;, and &gt; with \&gt;, which stops the tags from getting recognized.  JavaScript reads \&lt; as a plain &lt;, so the value of the string doesn&#8217;t change.</OL></p>
<p><DIV STYLE="border: thin solid black; margin-left: 1em; padding: 0.5em; background-color: lightgrey; font-family: Verdana, Courier; font-size: x-small;">private String safeJavaScriptStringLiteral(String str) {<br />
&nbsp;&nbsp;str = str.replace(&#x22;\\&#x22;,&#x22;\\\\&#x22;); // escape single backslashes<br />
&nbsp;&nbsp;str = str.replace(&#x22;&#39;&#x22;,&#x22;\\&#39;&#x22;); // escape single quotes<br />
&nbsp;&nbsp;str = str.replace(&#x22;\&#x22;&#x22;,&#x22;\\\&#x22;&#x22;); // escape double quotes<br />
&nbsp;&nbsp;str = str.replace(&#x22;&lt;&#x22;,&#x22;\\&lt;&#x22;); // escape open angle bracket<br />
&nbsp;&nbsp;str = str.replace(&#x22;&gt;&#x22;,&#x22;\\&gt;&#x22;); // escape close angle bracket<br />
&nbsp;&nbsp;return str;<br />
}</DIV></p>
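<p>A variation on the helper above, offered as a sketch and not as code from our application: instead of putting a backslash in front of each angle bracket, emit the JavaScript escapes \u003C and \u003E, so that no angle bracket appears in the generated source at all.  The class name is made up, and like the original it makes no attempt to handle newlines or other control characters.</p>

```java
final class JsLiteralEscaper {
    private JsLiteralEscaper() {}

    /** Escapes text for use inside a quoted JavaScript string literal, in one pass. */
    static String escape(String str) {
        StringBuilder out = new StringBuilder(str.length() + 16);
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\");   break; // backslash becomes \\
                case '\'': out.append("\\'");    break; // single quote becomes \'
                case '"':  out.append("\\\"");   break; // double quote becomes \"
                case '<':  out.append("\\u003C"); break; // no literal < is emitted
                case '>':  out.append("\\u003E"); break; // no literal > is emitted
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

<p>Because each character is visited exactly once, the ordering worry from step one goes away: a backslash added by a later rule can never be escaped a second time.  JavaScript turns \u003C back into &lt; when it evaluates the literal, so the string&#8217;s value is the same as before.</p>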
<p>At this point we should have generated a JavaScript string which never has anything that looks like a tag in it, but is perfectly safe to an XML parser.  All that&#8217;s needed next is to emit the JavaScript surrounded by a <STRONG>&lt;![CDATA[</STRONG> ... <STRONG>]]&gt;</STRONG> block, so the HTML parser doesn&#8217;t get confused over embedded angle brackets.</p>
<p>From a security perspective, I think this also goes to show that lone JavaScript fragment validation isn&#8217;t enough; one has to take it in the full context of the containing HTML parser.  Pragmatically speaking, the JavaScript alone was valid, but once inside HTML, became problematic.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.wwco.com/~wls/blog/2007/04/25/using-script-in-a-javascript-literal/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Great XSLT Tool for OS X</title>
		<link>http://www.wwco.com/~wls/blog/2007/03/27/great-xslt-tool-for-os-x/</link>
		<comments>http://www.wwco.com/~wls/blog/2007/03/27/great-xslt-tool-for-os-x/#comments</comments>
		<pubDate>Tue, 27 Mar 2007 22:38:37 +0000</pubDate>
		<dc:creator>Walt Stoneburner</dc:creator>
				<category><![CDATA[OS X]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Review]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Walt's Desktop]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://www.wwco.com/~wls/blog/2007/03/27/great-xslt-tool-for-os-x/</guid>
		<description><![CDATA[Found an awesome tool for performing XSLT transformations on Mac OS X.  It's called XSLPalette, and it worked flawlessly where web browsers fell down hard.]]></description>
			<content:encoded><![CDATA[<p>While working on some XML and XSLT stuff, I ran into some strange problems where transformed XML <a href="http://dev.rubyonrails.org/ticket/7919">content was making Firefox spin its wheels forever</a> and Safari was having problems rendering XSL variables.</p>
<p>I wasn&#8217;t engaged in a browser war shoot-out; I just wanted to know that the XSLT was correctly transforming the XML into the desired output.  As various tools were slowly slipping from my fingertips, I figured I might just have to go back to the command line.</p>
<p><A HREF="http://www.ditchnet.org/xslpalette/"><IMG SRC="http://www.wwco.com/~wls/livejournal/XSLPalette.png" BORDER="0" ALT="XSLPalette" WIDTH="128" HEIGHT="128" ALIGN="RIGHT" STYLE="padding:0.5em"></A>But then I discovered <a href="http://www.ditchnet.org/xslpalette/">XSLPalette</a>.  It&#8217;s a &#8220;free, native, XSLT 2.0, XPath 2.0, and XQuery 1.0 debugging palette&#8221; for OS X (and it&#8217;s a Universal Binary).</p>
<p>All I have to say is that, as a developer, I&#8217;m impressed with the ease this tool provides for trying different XSLT engines.  It does basically one thing, and that one thing very, very well.  I like that in developer tools.</p>
<p>You give the palette an XML file and an XSLT file, select the engine, and it does the transformation, showing you messages along the way, in addition to the transformed output, a collapsible view, and a browser-like rendered view.</p>
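<p>For comparison, this is roughly what doing the same job in code looks like: a small Java sketch that applies a stylesheet through the standard javax.xml.transform API.  The class name XsltRunner is made up, and this says nothing about how XSLPalette works internally.  Note too that the JDK&#8217;s built-in processor only speaks XSLT 1.0, not 2.0.</p>

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltRunner {
    private XsltRunner() {}

    /** Applies the given stylesheet text to the given XML text and returns the output. */
    static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Compile the stylesheet, then run the XML document through it.
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

<p>What you don&#8217;t get this way is everything that makes the palette pleasant: switching engines with a click, the message log, and the rendered view.</p>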
<p><STRONG>Walt gives XSLPalette a thumbs up!</STRONG></p>
]]></content:encoded>
			<wfw:commentRss>http://www.wwco.com/~wls/blog/2007/03/27/great-xslt-tool-for-os-x/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
