Dan Newcome on technology

I'm bringing cyber back

Archive for August 2010

Making rich text possible on mobile devices

This post is an overview of the techniques used in a rich text editor for mobile devices that I’m working on called Richie. I’m focusing on the iPhone first, followed by the iPad and other devices.

As mobile devices evolve and increase in popularity I think we’ll be doing more of the things that we do on our laptops on mobile devices like tablets and mobile phones. Content generation activities like blogging and using rich internet applications will only become more common, and many of these things rely on browser-based rich text editors, which generally don’t work on mobile devices.

I’m not sure why there haven’t been attempts at porting current editors such as TinyMCE over to work with mobile platforms. The authors of these editors seem to be obstinately waiting for Apple and other manufacturers to support designMode so that their editors will work as currently implemented. It is possible that I didn’t search long enough, or perhaps the demand is lower than I think. Also, for all I know right now, the whole problem may be harder than it looks from here (I haven’t thought about things like copy/paste yet).

However, many of the basic DOM techniques that these editors use for manipulating rich text elements (e.g. inserting lists, bold tags) will still work without designMode. There are two significant challenges that must be solved first: manipulating the insertion point and capturing user input. Browsers that support designMode or contentEditable solve these issues for us, but lacking them, we can still make things work.

Tracking the insertion point

There are two basic ways that I have experimented with for managing the insertion point. The first uses an HTML DOM range. The second, which the current implementation of Richie uses, is a simple <span> element acting as the cursor. The main advantage of a range is that it doesn’t interfere with the editor text. The disadvantage is that, unlike a DOM element, we can’t easily get its coordinates in order to display a visible cursor, and it is more difficult to keep positioned in the right place. A DOM element has the additional advantage of being visible if we want to use it as the cursor in the UI.

The HTML for the cursor would look like this for most mobile devices, or wherever we don’t need to capture keystrokes using a floating text box (as we will see later):

<span id="cursor">|</span>

For the iPhone, I used an empty span and used the native text input’s cursor as the cursor UI:

<span id="cursor"></span>

Managing user input

Traditional HTML rich text editors such as TinyMCE use key handlers for things like shortcut keys, but we take things a step further and use key handlers for every user input. Printable text characters are inserted into the DOM by the handler as text nodes using HTML DOM methods:

function handleKey( evt ) {
    // other key handlers

    // if no other handlers apply, insert character at insertion point
    var text = String.fromCharCode( evt.charCode );
    var textNode = document.createTextNode( text );
    // cursor is the <span id="cursor"> element shown above
    cursor.parentNode.insertBefore( textNode, cursor );
}


On a desktop browser, the key handlers would work at this point and our job would be complete. On a mobile device like the iPhone, the keyboard would still be hidden and we wouldn’t even be able to type. All we really need to do is add a text input control and focus it.

<input id="keyboardinput" type="text"/>

This will work, but in addition to confusing the user by having two places where text is being entered, the focus of the screen will center around the text input, which may not coincide with where the rich text editor is inserting text. To solve this I absolutely positioned the text box directly over the insertion point — in fact, from a UI perspective it is the insertion point.

#keyboardinput {
    position: absolute;
    background: none;
    border: none;
    width: 1px;
}

Every time a character is inserted or we otherwise alter the position of the insertion point, we need to reposition the input box over the cursor position with a little fudge factor to line it up perfectly:

function repositionInputBox() {
	keyboardinput.style.top = cursor.offsetTop - 2 + "px";
	keyboardinput.style.left = cursor.offsetLeft - 6 + "px";
}
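The two steps above (insert the typed character before the cursor span, then move the input box over it) can be combined into one handler. The following is a minimal sketch, not Richie's actual code: insertCharacter and attachEditor are hypothetical names, and the document, cursor and input box are passed in as arguments rather than read from globals so that the logic can be tried outside a browser.

```javascript
// Hypothetical helper: insert one character before the cursor element and
// move the input box over the cursor's new position.
function insertCharacter(doc, cursor, input, ch) {
    var textNode = doc.createTextNode(ch);
    cursor.parentNode.insertBefore(textNode, cursor);
    // Same fudge factors as in repositionInputBox.
    input.style.top = (cursor.offsetTop - 2) + "px";
    input.style.left = (cursor.offsetLeft - 6) + "px";
    return textNode;
}

// Hypothetical browser wiring. Assumes the page contains the
// <span id="cursor"> and <input id="keyboardinput"> elements from this post.
function attachEditor(doc) {
    var cursor = doc.getElementById("cursor");
    var input = doc.getElementById("keyboardinput");
    input.addEventListener("keypress", function (evt) {
        insertCharacter(doc, cursor, input, String.fromCharCode(evt.charCode));
        // Keep the character out of the text box itself.
        evt.preventDefault();
    });
    // Focusing the input is what brings up the on-screen keyboard.
    input.focus();
}
```

In a page, a single call to attachEditor(document) after the elements exist would be enough to start capturing keystrokes.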

I’m not an expert on Mobile Safari, so there are several bugs in this initial implementation of Richie related mostly to repositioning the cursor when deleting characters. Also, the approach I’m taking here may break down as I try to tackle things like copy/paste — I’m not sure how that is going to work yet. Please do drop me a comment if you have a chance to look at the code.

Written by newcome

August 31, 2010 at 7:40 pm

Posted in Uncategorized

Updated: Jath Javascript XML processor

I just pushed some updates to the Jath template-based XML processor. The new version adds support for including literal values in the templates and most importantly (for me, right now) support for processing files under Windows Script Host. I’m going to be using this to churn through tons of exported CRM config data soon, so I wanted to get this update out there.

To run under WSH, nothing different is required: just include jath.js in the .wsf file and proceed as usual. Jath will detect the environment just as it detects Node.js and various browser versions.

A quick synopsis on using literals — a typical template might look like the following:

var template = [ "//status", { id: "@id", message: ":literaldata" } ];

The only difference is that the value “literaldata” is prefixed by a colon, marking it as a literal rather than an XPath expression. The output would look something like the following:

[
	{ id: "1", message: "literaldata" }, 
	{ id: "3", message: "literaldata" } 
]

The literal prefix character can be changed by setting


to another character. Note that the value may be a single character only — no longer strings.
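The distinction between literals and XPath expressions can be pictured with a small sketch. This is not Jath's implementation: classifyTemplateValue is a hypothetical helper, and the prefix character is passed in as a plain argument, so the sketch makes no claim about how Jath stores that setting.

```javascript
// Hypothetical helper, not part of Jath: decide whether a template value is
// a literal (starts with the prefix character) or an XPath expression.
function classifyTemplateValue(value, literalChar) {
    if (value.charAt(0) === literalChar) {
        // Strip the single prefix character and keep the rest verbatim.
        return { kind: "literal", value: value.substring(1) };
    }
    return { kind: "xpath", value: value };
}

var literal = classifyTemplateValue(":literaldata", ":");
// literal is { kind: "literal", value: "literaldata" }
var expression = classifyTemplateValue("@id", ":");
// expression is { kind: "xpath", value: "@id" }
```

Because only the first character is inspected, a single-character prefix is all such a check can support.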

Written by newcome

August 24, 2010 at 6:54 pm

Posted in Uncategorized

Increase project velocity, decrease cost of change

I’ve been thinking a bit lately about project management topics and how they relate to customer expectations. I realize that this is quite a broad topic sentence, and I’m only going to dive into a few very specific things here, but hopefully I can keep expanding these ideas in a series of posts.

We’ve all been on projects where progress looks something like: requirements are hashed out, initial design goes very quickly, morale is high and progress is happening very fast. However, the initial velocity is rarely sustainable. Roadblocks arise, things are rewritten, and complexity goes up. The final days stretch into weeks and morale takes a dive.

I’m going to propose that projects should optimize for a sustainable increase in velocity throughout the life of the project. This means managing the critical path very carefully and being vigilant for hard problems that need to be solved and potential show-stoppers. There is a tendency to ignore hard problems that arise in order to keep the level of optimism up, but I think that there is an advantage to really understanding the problems well as early as possible, even if it means slipping on the schedule early.

I need to dig up a few references that I can’t seem to find right now that support basically two things:
1) New features should get easier to add as the project progresses rather than harder. This indicates that the project is dealing with complexity by building up solutions early on that can be leveraged later.

2) Projects should optimize for higher velocity late in the project rather than early. Avoid going full tilt on a single branch in the beginning, and focus the new-project energy and excitement on solving chunks of the critical path in small spike projects, which form the basis of the first principle.

I apologize that I’m posting these unsupported and jumbled ideas, but I’ve been thinking about these things for a while and I think that starting a series of posts may be the only way I can figure out if they are at all valid.

Written by newcome

August 24, 2010 at 1:17 pm

Posted in Uncategorized

Release small

I just read a post by Tom Preston-Werner entitled ‘Readme-Driven Development’. This reminds me of some thoughts that I wanted to blog about.

I’ve had a few projects that I thought would be pretty simple to do but turned out to be a little more complicated than expected. Rather than hunker down and churn through the whole hairy problem in one swipe I started to compartmentalize some tricky parts. I realize that this is no different than just doing good modular software design, but the difference is that I released the code on its own and documented its API separately from the main project. The module is very small but it represents a major design decision and wrapping it up with its own readme helped to solidify the design of the overall project in my head.

Thinking a bit further, this seems like it would also have advantages in a team environment when proposing alternate solutions or evangelizing a certain technique used within a project.


Written by newcome

August 23, 2010 at 12:11 pm

Posted in Uncategorized

Javascript as a data manipulation language

If you are like me you’ve been working with data in various forms for your whole programming career. Sometimes we have big datasets locked away in relational databases or complex content management systems. Other times we’re dealing with ad-hoc data from spreadsheets or web portals. Ironically it can be the larger data that are easier to work with since the tools for dealing with the data are right under your fingertips.

In the Unix tradition, where nearly everything is treated as line-oriented text files, we have a whole slew of tools like grep, awk and sed to cut through all of that plaintext data. The problem is that the data we want to deal with is often relational or hierarchical, and line-oriented techniques become awkward for it. Languages such as Perl were designed around line-oriented text processing, so why not look to something that is fundamentally designed around a slightly higher-level data representation for these more complex tasks? I’m talking about Javascript.

JSON literals are becoming the de-facto standard for data on the web and elsewhere, and impose a pretty small overhead on top of what would be the raw delimited data that it represents (especially compared to XML). LISPers are probably laughing at this point since they have basically the same advantage with S-expressions, but unfortunately JSON has gotten quite a bit more mindshare now.

Previously, I would have used SQL for most data manipulation tasks, connecting to the spreadsheets or comma-delimited files using ODBC or used DTS to import the data. However, I’ve recently had to deal with a slew of small datasets from different sources, some of which are already JSON formatted. So I wondered how hard could it be to write a few simple data manipulation functions in Javascript?

For starters, we need to get everything together as Javascript objects. For JSON files all we need to do is read the file in and eval() it. I’m using Windows Script Host here, but the same techniques will work if you are using Rhino or Node.js. All you need is filesystem access.

var filename = WScript.Arguments.Item(0);
var fso = new ActiveXObject( "Scripting.FileSystemObject" );
var file = fso.OpenTextFile( filename, 1 );
var data = eval( "(" + file.ReadAll() + ")" );
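For comparison, here is a rough Node.js equivalent. It is a sketch under assumptions: loadJson is a hypothetical helper name, and it swaps eval() for JSON.parse, which only accepts strict JSON (quoted keys, no trailing commas), so files that rely on looser Javascript object-literal syntax would still need the eval() approach.

```javascript
var fs = require("fs");

// Hypothetical helper: read a JSON file from disk and return the parsed data.
function loadJson(filename) {
    var text = fs.readFileSync(filename, "utf8");
    // JSON.parse throws a SyntaxError if the file is not strict JSON.
    return JSON.parse(text);
}

// Example call, mirroring the WSH version (file name taken from the command line):
//   var data = loadJson(process.argv[2]);
```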

The other data that I want to pull in is actually comma delimited. We can write some simple code to go through a file and produce an array of objects containing the rows in just a few lines of code:

function processDelimited( text, fields, delim ) {
	var ret = [];
	var linedelim = "\n";

	var lines = text.split( linedelim );
	for( var i=0; i < lines.length; i++ ) {
		var obj = {};
		var tokens = lines[i].split( delim );
		for( var j=0; j < tokens.length; j++ ) {
			obj[ fields[j] ] = tokens[ j ]; 
		}
		ret.push( obj );
	}
	return ret;
}

Note that this code doesn't allow for escaping delimiter characters or quotation marks around the data elements. I didn't need this functionality so I didn't implement it.
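Files exported on Windows usually end lines with "\r\n" and often have a trailing newline, both of which would leak into the last field or produce an empty row with a plain "\n" split. The sketch below is a hypothetical variant (parseDelimited is a made-up name) that tolerates both. It also loops over the field names rather than the tokens, which is a design choice of this sketch. Like the original, it does not handle quoted or escaped delimiters.

```javascript
// Hypothetical variant of processDelimited: accepts \n or \r\n line endings
// and skips blank lines. Quoted or escaped delimiters are still unsupported.
function parseDelimited(text, fields, delim) {
    var rows = [];
    var lines = text.split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        if (lines[i] === "") {
            continue; // skip blank lines, e.g. the one after a trailing newline
        }
        var tokens = lines[i].split(delim);
        var obj = {};
        // Loop over the field names so every row has the same keys;
        // a missing token leaves that field undefined.
        for (var j = 0; j < fields.length; j++) {
            obj[fields[j]] = tokens[j];
        }
        rows.push(obj);
    }
    return rows;
}

var rows = parseDelimited("1,alice\r\n2,bob\r\n", ["id", "name"], ",");
// rows is [ { id: "1", name: "alice" }, { id: "2", name: "bob" } ]
```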

Once we have our data pulled in as tuples we can write functions to operate on it as needed. I needed to join datasets together on a key like a SQL equi-join so I wrote a bit of code to do it like this:

function innerJoin( obj1, obj2, func ) {
	var ret = [];
	for( var i=0; i < obj1.length; i++ ) {
		for( var j=0; j < obj2.length; j++ ) {
			if( func( obj1[i], obj2[j] ) ) {
				ret.push( merge( obj1[i], obj2[j] ) );	
			}
		}
	}
	return ret;
}

I’ve omitted the code for merge() for the sake of clarity. All it does is copy the fields of one object to the second to create a composite tuple. Of course this code is a slow way to perform a join, but it doesn’t matter for small datasets. At this point we can see why Javascript makes a great language for doing data manipulation. The join condition is actually a function. In SQL we are limited to the operators given to us, but here we can do any kind of logic we want to. Here is an example usage where the join condition is a simple equi-join:

function( obj1, obj2 ) { 
    return obj1[ "field1" ] == obj2[ "field2" ];
}

This is equivalent to

where obj1.field1 = obj2.field2 

in SQL.
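To see the join end to end, here is a self-contained sketch. The merge() shown is only one plausible version of the omitted helper: it builds a new object instead of copying into the second tuple, and on a name clash the second tuple's field wins. The sample data (people, orders) is invented for illustration.

```javascript
// One possible merge(): copy the own fields of both tuples into a new object.
// If both tuples have a field with the same name, the value from b wins.
function merge(a, b) {
    var out = {};
    var key;
    for (key in a) {
        if (Object.prototype.hasOwnProperty.call(a, key)) { out[key] = a[key]; }
    }
    for (key in b) {
        if (Object.prototype.hasOwnProperty.call(b, key)) { out[key] = b[key]; }
    }
    return out;
}

// Nested-loop join, as in the post: func decides whether two tuples match.
function innerJoin(obj1, obj2, func) {
    var ret = [];
    for (var i = 0; i < obj1.length; i++) {
        for (var j = 0; j < obj2.length; j++) {
            if (func(obj1[i], obj2[j])) {
                ret.push(merge(obj1[i], obj2[j]));
            }
        }
    }
    return ret;
}

var people = [ { id: "1", name: "alice" }, { id: "2", name: "bob" } ];
var orders = [
    { personId: "1", item: "book" },
    { personId: "1", item: "pen" },
    { personId: "3", item: "lamp" }
];
var joined = innerJoin(people, orders, function (p, o) {
    return p.id == o.personId;
});
// joined holds two tuples, both for alice: one with item "book", one with item "pen"
```

Since the condition is just a function, swapping the equality test for a range check or a case-insensitive comparison needs no change to innerJoin itself.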

There are some more hidden advantages to having your data represented as Javascript that I’ll cover later on, but I’ll give you a hint that all of our data is actually represented as code, so doing things like generating GUIDs or re-using certain bits of data become trivial.

Written by newcome

August 22, 2010 at 6:24 pm

Posted in Uncategorized

In defense of the Storarray

I just ran across an old DailyWTF article again, courtesy of Hacker News. I’m pretty sure that I read the article when it first ran in 2006 when I was working for a government contractor. Back then I used to devour DailyWTF, Slacker Manager, Lifehacker and a raft of other blogs through Google Reader. It was always good to read about some poor unfortunate sap who didn’t know any better and their “failings” as a programmer to make you feel a little better about yourself. Smug even. It’s good safe fun for the average programmer to pile on with the masses in a “me-too” ritual crucifixion of the unnamed villain, a villain that we all have our own version of: the embodiment of all that is evil or threatening to us in our little programming worlds (or that which we dislike or happen to disagree with).

I’ve been doing a fair bit of self-reflection recently about what helps a programmer move the ball forward in his or her world of programming knowledge. One thing that I’ve been trying out recently is attacking what in programming lore are “hard” problems (problems that best practices dictate that you don’t touch as a mere mortal) by just doing something naive at first and letting myself run headlong into the complications that are sure to abound. In order to allow yourself this indulgence you really have to shrug off all of the “WTFs” that you have read over the course of your career. All of the dos, don’ts, best practices, and all of that stuff just scream “don’t do it” right in your face. However, if we all really heeded the best practices, we wouldn’t have things like the resurgence of schema-free databases and the NoSQL movement, to name just two.

I have a lot of thoughts on this stuff, but unfortunately they are not well organized yet, so hopefully I can post a bit more and try to link together the pieces into something that makes a little sense.

Written by newcome

August 22, 2010 at 2:26 pm

Posted in Uncategorized

Transforming JSON to XML using Mustache

I had the need recently to take a deeply-nested JSON structure and transform it to a relatively flat XML file. Ideally I wanted the equivalent of XPath for JSON, where I could have flattened a nested structure like this:

[
    { name: "form1", fields: [ ... ] }, 
    { name: "form1", fields: [ ... ] }
]

using XPath to select the fields:


I had been using the following custom bit of code to apply a flat template to a Javascript array, which I could have applied to the nodeset from the XPath listed above.

function applyTemplate( data, template ) {
    for( var i=0; i < data.length; i++ ) {
        var output = template;
        var item = data[i];
        for( var field in item ) {
            // replace every ${field} tag with the item's value
            var regex = new RegExp( "\\$\\{" + field + "\\}", "g" );
            output = output.replace( regex, item[field] );
        }
        WScript.Echo( output );
    }
}

This very simply applied the entire template to each item in the array, replacing ${} tags with corresponding fields on the JS objects. I had to do some manual work to cobble together the entire output that I wanted using this quick-and-dirty script. I was trying to avoid using a template language but now that I wanted to do nested repetitions I didn’t want to spend the time to write any more code.
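The same idea can be written so that it runs outside WSH and returns the rendered strings instead of echoing them. This is a sketch, not the script from the post: renderTemplate is a hypothetical name, the field data is invented, and it swaps the per-field RegExp for a single pass with a replacer function, which also avoids surprises when a value contains "$" sequences that String.replace would otherwise interpret.

```javascript
// Hypothetical variant of applyTemplate: returns one rendered string per item.
function renderTemplate(data, template) {
    var results = [];
    for (var i = 0; i < data.length; i++) {
        var item = data[i];
        // One pass over the template; tags with no matching field are left as-is.
        results.push(template.replace(/\$\{(\w+)\}/g, function (tag, field) {
            return Object.prototype.hasOwnProperty.call(item, field)
                ? String(item[field])
                : tag;
        }));
    }
    return results;
}

var fields = [
    { name: "title", type: "text" },
    { name: "count", type: "number" }
];
var xml = renderTemplate(fields, '<field name="${name}" type="${type}"/>');
// xml[0] is '<field name="title" type="text"/>'
// xml[1] is '<field name="count" type="number"/>'
```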

I looked around for something that would let me do something similar to what I did in Jath to go from XML to JSON. I found an interesting hack for making jQuery work against JS objects, but I really wanted something more declarative than functional in this case.

I found JsonPath, which only solved half of my problem — that is, the selection of source data. I would have had to use my hack script to do the rest of the job. The closest thing that I found in Javascript to solve the problem was JsonT. JsonT looks interesting, but I wasn’t too crazy about having to write so many individual rules to do what I wanted. It felt like doing XSLT where you are applying a ton of small rules and it is hard to see at a glance what your output looks like.

I was afraid I’d end up using something like StringTemplate or one of the Ruby template engines that Rails uses. However, I found a Javascript implementation of Mustache called mustache.js.

I’m using JScript under Windows Script Host here, and mustache.js worked just fine in this environment. I was thinking that I’d have to provide my own lambda functions in order to get the data that I needed from the source JS object since it didn’t look like Mustache supported nested objects. For example, you can’t say:


You can, however, reference a function, so my source data would have had to be a wrapper around my actual data. For example:

var data = {
   name: "form1",
   fields: function() { 
       /* iterate through source data */ 
       return fields;
   }
};

Then we could do:


without having to use dot notation. The real reason for this post, however, is to show how nested enumerable sections work. I realized that we could emulate the equivalent of XPath’s ‘/form/fields’ using the following:


My template now looks something like:


Hopefully this helps someone out. There are a lot of JSON to XML and XML to JSON libraries out there, but most of them are pretty specific to one thing. For template-based XML to JSON using XPath queries check out my Jath project. For rule-based JSON to XML serialization JsonT is pretty interesting, and I have a project that is aimed at providing rule-based namespace support for XML serialization called js-xml-serializer (I ran out of fun names).
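To make the nested-section idea concrete without depending on mustache.js, the sketch below uses a tiny stand-in renderer. Everything in it is an assumption for illustration: render and lookup are made-up helpers (not the mustache.js API), the form data and template are invented rather than taken from this post, and unlike Mustache it does no HTML escaping. It only supports {{name}} variables and {{#name}}...{{/name}} sections, resolving names through a stack of contexts the way Mustache sections do.

```javascript
// Hypothetical lookup: search the context stack from innermost to outermost.
function lookup(name, stack) {
    for (var i = 0; i < stack.length; i++) {
        var ctx = stack[i];
        if (ctx !== null && typeof ctx === "object" &&
                Object.prototype.hasOwnProperty.call(ctx, name)) {
            return ctx[name];
        }
    }
    return undefined;
}

// Hypothetical stand-in renderer for {{name}} and {{#name}}...{{/name}}.
function render(template, stack) {
    // Expand sections first; \1 pairs each opening tag with its closing tag.
    var out = template.replace(/\{\{#(\w+)\}\}([\s\S]*?)\{\{\/\1\}\}/g,
        function (match, name, body) {
            var value = lookup(name, stack);
            if (!value) { return ""; }
            var items = Array.isArray(value) ? value : [value];
            return items.map(function (item) {
                // Each item becomes the innermost context for the section body.
                return render(body, [item].concat(stack));
            }).join("");
        });
    // Then fill in plain variables that sit outside any section.
    return out.replace(/\{\{(\w+)\}\}/g, function (match, name) {
        var value = lookup(name, stack);
        return value == null ? "" : String(value);
    });
}

var data = {
    forms: [
        { name: "form1", fields: [ { id: "a" }, { id: "b" } ] },
        { name: "form2", fields: [ { id: "c" } ] }
    ]
};
var template = '{{#forms}}{{#fields}}<field form="{{name}}" id="{{id}}"/>{{/fields}}{{/forms}}';
var flat = render(template, [data]);
// flat is '<field form="form1" id="a"/><field form="form1" id="b"/><field form="form2" id="c"/>'
```

The inner section iterates each form's fields while {{name}} still resolves against the enclosing form, which is what flattens the nested structure into one list of elements.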

Written by newcome

August 18, 2010 at 7:51 pm

Posted in Uncategorized

Release: Javascript XML serialization library

I’m working on a project that will access MS CRM directly via Javascript. Microsoft uses SOAP web services as their API protocol, and arguments have to be in a very particular XML format. I couldn’t find anything off-the-shelf that would allow complex SOAP calls, so I started hacking. Part of the solution turned into a pretty flexible XML serializer for Javascript, so I’m open-sourcing that part of it now.

The source is on GitHub here, and I’ve pasted the readme below.


js-xml-serializer is a Javascript serialization library that allows an arbitrary Javascript object to be serialized as XML according to a set of supplied rules and namespace definitions. The motivation behind this project was the author’s inability to find an existing Javascript serialization method that supported namespaces and allowed enough control to allow sending complex Javascript objects as arguments to a SOAP web service.


In the nominal case, given a Javascript object like the following:

var obj = {
    attr1: "two",
    attr2: new MyClass( "hello" ),
    attr3: [ "three", "four" ]
};

We can produce a simple XML document:


Using the following code:

serialize( obj );

In order to enable more control over serialization, we can create a structure outlining specific serialization rules to be applied to the object:

var rules = {
    Object: {
        __def__: { nodetype: "element", nodename: "obj", namespace: "http://example.org/" },
        attr1: { nodetype: "element", nodename: "first", namespace: "http://example.org/" },
        attr2: { nodetype: "element", nodename: "myc-attr2", namespace: "http://example.org/" }
    },
    Array: {
        __def__: { nodetype: "element", nodename: "arr", namespace: "http://example.org/" }
    },
    String: {
        __def__: { nodetype: "element", nodename: "str", namespace: "http://example.org/" }
    },
    MyClass: {
        __def__: { nodetype: "element", nodename: "myc", namespace: "http://example.org/" }
    }
};

The rule denoted by __def__ is the default for its type: it is applied to any value of that type in the object being serialized, unless a rule for a specific member of the containing type overrides it (for example, the attr2 rule under Object overrides the MyClass default).
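The precedence between member rules and type defaults can be sketched as a small lookup. This is not the library's actual code: findRule is a hypothetical function, and the rule set is trimmed to the nodename field to keep the example short.

```javascript
// Hypothetical lookup: a rule for the named member of the containing type
// wins; otherwise fall back to the __def__ rule of the value's own type.
function findRule(rules, parentType, memberName, valueType) {
    var parentRules = rules[parentType];
    if (parentRules && memberName !== "__def__" &&
            Object.prototype.hasOwnProperty.call(parentRules, memberName)) {
        return parentRules[memberName];
    }
    var typeRules = rules[valueType];
    return typeRules ? typeRules.__def__ : undefined;
}

// Trimmed copy of the rule set (nodename only).
var sampleRules = {
    Object:  { __def__: { nodename: "obj" },
               attr1:   { nodename: "first" },
               attr2:   { nodename: "myc-attr2" } },
    Array:   { __def__: { nodename: "arr" } },
    String:  { __def__: { nodename: "str" } },
    MyClass: { __def__: { nodename: "myc" } }
};

var r1 = findRule(sampleRules, "Object", "attr1", "String");
// r1.nodename is "first" (member rule overrides the String default)
var r2 = findRule(sampleRules, "Object", "attr2", "MyClass");
// r2.nodename is "myc-attr2" (member rule overrides the MyClass default)
var r3 = findRule(sampleRules, "Object", "attr3", "Array");
// r3.nodename is "arr" (no member rule, so the Array default applies)
```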

Namespace prefixes are specified as the following:

var namespaces = {
    "http://example.org/": "ex"
};

The serialized XML output looks like this:

<ex:obj xmlns:ex='http://example.org/'>


js-xml-serializer is being developed as part of a larger project so it is only designed to solve serialization issues as-needed for that particular project. It is provided in the hopes that others may suggest extensions to cover cases that the author has not considered. This software is not being used by the author in production yet, and it is suggested that others test it fully before using it.

Future work

  • Clean up output: Output is not indented and some spurious (semantically insignificant) whitespace is emitted.
  • Provide default rules: Support for supplying a base rule set would make using the library more convenient. Provisions for merging rule sets could also prove useful.
  • Allow default namespace: Output explicitly defines a prefix for all namespaces. While semantically correct, it may produce more readable output to determine one namespace to use for the default.
  • Change def mechanism: the mechanism by which the default serialization rule is specified may not be the best solution. Comments are welcome for addressing this.
  • Don’t pollute namespace: No wrapper object is provided around the basic serialize() function.


js-xml-serializer is copyright 2010 Dan Newcome and is provided under the MIT free software license. See the file LICENSE for the full text.

Written by newcome

August 11, 2010 at 4:34 pm

Posted in Uncategorized