matt ryall’s weblog

Pushing my own barrow since 2002.

Articles

Shortening content to fit with JavaScript

11 October 2009

Very often when working in an application, you deal with data of variable length. File names provided by users, titles that can be arbitrarily long, all these need to be catered for.

Fitting this variable length data into a grid or fixed layout in a web interface can be problematic. The current crop of browsers has only limited support for shortening content so it can fit in a fixed amount of space.

This week I’ve been working on a more general mechanism for shortening text until it fits within a predefined width. Specifically, I needed a solution that worked for breadcrumbs where there was a maximum total width for the breadcrumb, and the left-most breadcrumb items are less important than the right-most. Therefore, the left-most breadcrumbs should be abbreviated first.

Here’s a demonstration of the JavaScript I came up with: abbreviation demo.


Before abbreviation


After abbreviation

The JavaScript to shorten text this way is quite simple. It uses jQuery for simplicity, but would work quite well in core JavaScript as well. Below are the guts of the algorithm.

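function abbreviateUntil(element, condition) {
    var children = $(element).children();
    var current = 0; // index of the left-most child that can still be shortened
    // Shorten children from the left until the condition holds or nothing is left to trim
    while (current < children.length && !condition.call(element)) {
        var $child = $(children[current]);
        var text = $child.text();
        // Only an ellipsis (or nothing) left: move on to the next child
        if (text === "\u2026" || text === "") {
            current++;
            continue;
        }
        // Keep the full text available as a tooltip
        if (!$child.attr("title")) $child.attr("title", text);
        // Drop the last real character and leave a single trailing ellipsis
        $child.text(text.replace(/[^\u2026]\u2026?$/, "\u2026"));
    }
}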

In the real implementation, this function is wrapped in a jQuery plugin that you use like this:

var fixedWidth = $(".fixed").width();
$(".fixed ul").abbreviateUntil(function () {
    return $(this).width() <= fixedWidth;
});
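
The same idea can be sketched without jQuery or the DOM. The version below is an illustration only, not the plugin's code: abbreviateLabels and the fits callback are hypothetical names, it works on a plain array of strings, and the width check is replaced by whatever test the caller supplies (here, a simple character count).

```javascript
// Shorten labels from the left until fits(labels) returns true.
// Returns a new array; the input array is left untouched.
function abbreviateLabels(labels, fits) {
    var result = labels.slice();
    var current = 0;
    while (current < result.length && !fits(result)) {
        var text = result[current];
        // Nothing left to trim on this label: move to the next one
        if (text === "\u2026" || text === "") {
            current++;
            continue;
        }
        // Drop the last real character and leave a single trailing ellipsis
        result[current] = text.replace(/[^\u2026]\u2026?$/, "\u2026");
    }
    return result;
}

// Example: allow at most 16 characters across all breadcrumb labels.
var crumbs = abbreviateLabels(["Dashboard", "Projects", "Report"], function (labels) {
    return labels.join("").length <= 16;
});
// crumbs is ["D\u2026", "Projects", "Report"]
```

Because the left-most label is exhausted before the next one is touched, the right-most items keep their full text for as long as possible.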

More information on how to use the code and how to construct the markup and styles is available on the demo page. Please let me know if you have any ideas on how to make this more generally useful.

 
 

Why software quality is important

12 August 2009

Even though he’s talking just about the development of his iPhone application, Buzz Andersen hits the nail on the head about why quality matters for the long-term success of any software product:

One of the hardest things about shipping Birdfeed was staying committed to slaving away on such minutae while other, often less polished, clients beat me to market.

While such attention to detail may not be appreciated in the specific case, however, I’ve found that in aggregate it leads to an overall impression of quality that attracts the kind of fanatically devoted users who form the backbone of a growing, long term user base.

Attention to detail is the only way to develop high quality software, and high quality software is what leads to users so happy to use your product, it sells itself by word-of-mouth.

 
 

On a slow boat to China

11 May 2009

Early next week, Liz and I are heading off to China for a long holiday. We’re travelling for eight weeks, with only the loosest of itineraries. It’s going to be an exciting adventure into the land widely regarded as having the worst toilets in the world.

Tourist boat, Pudong, Shanghai
Note: Artist’s impression. Actual boat may vary.

Unfortunately, we’re not really going to catch a slow boat there. Rather, we found incredibly cheap return flights from Sydney to Macau with Viva Macau. From our starting point in Macau, we plan to travel across most of the provinces in southern China over the following weeks.

Some might remember that I was learning Mandarin last year. My studies haven’t really progressed very far. If I’m lucky I might manage to be understood while saying ‘hello’ or ‘goodbye’. Anything more complicated, and I’m likely to get chased out of town for referring to somebody’s mother as a horse. (Um, yeah, I guess that’s an in-joke for Mandarin speakers.) Certainly, the language barrier is still going to be one of the biggest challenges in our travels.

I’ll be trying to post pictures from the trip as we go. Watch my Flickr account for updates on that front.

Red Oak beer photograph
Join us for farewell drinks at Red Oak

If you’d like to see us off, Liz and I will be having some farewell drinks this Friday, 15 May at the infamous Red Oak Beer Café. We should be there from around 6pm if you want to catch up before we go.

 
 

Smart Quotes: library for curly quotes in Java

2 May 2009

Recently, I’ve become a fan of using proper punctuation in web pages. This includes simply expanding the range of punctuation I use to include a variety of dashes, fractions and symbols, and also the use of correctly curled quotation marks and apostrophes.

Even for the enthusiastic, entering the proper quotation marks manually is a pain. On Mac OS X, you need to hold down the option key and remember which of the unmarked bracket or brace keys corresponds to your desired quotation mark. On Windows, it’s even more painful to locate the quote with the character map or keyboard shortcut.

To solve this problem, I’ve written a new Java library, Smart Quotes, to automatically correct "straight quotes" to “curly quotes” in Java web applications like Confluence.

Even now, a Google search for “Java smart quotes” or “Java curly quotes” returns lots of results about how to remove the quotes rather than add them! While curly quote removal is a very simple search-and-replace, inserting correct curly quotes is a bit more of a hairy problem. You need to correctly curl the double quotes the right way, and recognise the distinction between apostrophes and single-quoted phrases.

Smart Quotes processes HTML documents. It replaces straight double or single quotes only in the text sections, avoiding the tag attributes and preformatted blocks like <code> and <pre>.

For example, the following markup:

"Occasionally, quotations may contain 'internal <em>quoted passages</em>', which shouldn't confuse one's quote-curling library."

is correctly converted by Smart Quotes into this HTML:

“Occasionally, quotations may contain ‘internal quoted passages’, which shouldn’t confuse one’s quote-curling library.”

This would work the same, even if the content were sprinkled with HTML tags.
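
To give a feel for the core rule, here is a deliberately naive sketch in JavaScript. It is not the Smart Quotes algorithm and not its API: curlQuotes is a hypothetical name, it handles plain text only (no tags, attributes or preformatted blocks), and it simply assumes that a quote at the start of the text or after whitespace or an opening bracket is an opening quote, while every other quote is a closing quote or an apostrophe.

```javascript
// Naive quote curling for plain text (no HTML awareness).
function curlQuotes(text) {
    return text
        // Double quote at the start, or after whitespace or an opening bracket: opening
        .replace(/(^|[\s(\[{])"/g, "$1\u201C")
        // Any remaining double quote: closing
        .replace(/"/g, "\u201D")
        // Single quote in an opening position: opening
        .replace(/(^|[\s(\[{\u201C])'/g, "$1\u2018")
        // Any remaining single quote: closing quote or apostrophe
        .replace(/'/g, "\u2019");
}
```

A rule this simple gets the example sentence right, but it would mis-curl leading apostrophes (as in ’tis), which is one reason a real library needs a more careful algorithm and a large test suite.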

The Smart Quotes source code contains a large number of tests that verify the quoting behaviour, interaction with tags, and so on. If you use the library and find bugs, please consider submitting a patch with a test case.

You may remember that when I started using Markdown on my blog, I also added SmartyPants. SmartyPants is a Perl library which does the same job as my new library, and I’m indebted to it for a lot of assistance with the algorithm and the original inspiration for the project. While SmartyPants already solves this problem very nicely in Perl, it isn’t suitable for integration into a cross-platform Java application.

To integrate Smart Quotes usefully in Confluence, I’ve written a Smart Quotes macro plugin. This plugin takes any other Confluence wiki content as its body, and automatically replaces the quotes throughout. For this reason, you can just wrap the entire contents of the page in a {smart-quotes} macro to get proper curly quotes.

I’m hoping the Smart Quotes library proves useful to the many Java web application developers who would like proper punctuation in their web pages. It’s available under an Apache open source license, so you can reuse and redistribute it under very flexible terms.

 
 

Oracle acquires Sun, Java

21 April 2009

Yesterday, Sun was acquired by Oracle for $9.50 a share. Oracle CEO Larry Ellison is quoted in the New York Times:

He said Java, the language used in most computer science schools and a technology used daily by millions of software developers, was “the single most important software asset we have ever acquired.”

It worries me that Larry considers Java a “software asset”. Assets must have some value to the owner. What is the value of the Java programming language to Oracle? Perhaps the ability to better sell Java consulting. Or maybe, repositioning WebLogic as the standard Java enterprise platform.

Regardless of what the primary commercial benefit is for Oracle, it’s unlikely to have much to do with improvements to the language and the platform which help the average Java developer.

As a developer who works with Java quite a bit, I see the development of the programming language not as an asset but as a liability for a company. It’s a lot of time-consuming work with very few commercial benefits. Assuming the buyout is approved by the regulators and Sun’s shareholders, it will be interesting to see what Oracle does with their so-called asset in the next few years.

 
 

The infamous Turkish locale bug

11 February 2009

I discovered a quirky comment today in Confluence’s Permission.forName(String) method:

// use the english locale to avoid the infamous turkish locale bug
String upperName = permissionName.toUpperCase(Locale.ENGLISH);

Naturally the question popped into my mind: what is the ‘infamous Turkish locale bug’? Looking into the JIRA issues related to the commit (CONF-5931, CONF-7168), I found a link Agnes put to this article about a common Java bug in the Turkish locale: Turkish Java Needs Special Brewing.

In the Turkish alphabet there are two letters for ‘i’, dotless and dotted. The problem is that the dotless ‘i’ in lowercase becomes the dotless in uppercase. At first glance this wouldn’t appear to be a problem; however, the problem lies in what programmers do with upper- and lowercases in their code.

The two lowercase letters are \u0069 ‘i’ and \u0131 ‘ı’ (dotless ‘I’) and are totally unrelated. Their uppercase versions are \u0130 ‘İ’ (capital letter ‘I’ with dot above it) and \u0049 ‘I’. The issue is that this behavior does not occur in English where the single lowercase dotted ‘i’ becomes an uppercase dotless ‘I’.

With the statement String.toUppercase(), most Java programmers try to effectively neutralize case. Consider a HashMap with string keys and you have a key that you want to look up. If you want to ignore case, you’ll probably uppercase everything going into the map, its entries, and the string you’re doing the lookup with. This works fine for English, but not for Turkish, where dotless becomes dotless.

This is a nice example of where you need to be very careful how you handle upper- and lower-casing in your application. Changing the word ‘quit’ to uppercase in the Turkish locale will result in ‘QUİT’, not ‘QUIT’. I’ve heard of other examples where the German ß (sharp ‘s’) doesn’t behave exactly as English speakers would expect either.

There are two ways to properly perform a case-insensitive comparison of Strings in Java in any locale:

  • (preferred) use String.equalsIgnoreCase()
  • use a fixed locale (like Locale.ENGLISH) as an argument to String.toUpperCase(Locale) or String.toLowerCase(Locale).

You can also use Character.toLowerCase() or Character.toUpperCase() to derive a locale-independent case-insensitive String value. This was the solution used in a recent (and still unreleased) fix for the same problem in the Commons Collections CaseInsensitiveMap.
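
The same trap exists outside Java. As a small illustration in JavaScript rather than Java (so none of this is Confluence code; permissions, fixedKey, localeKey and lookup are hypothetical names), the sketch below normalises lookup keys with the locale-independent toUpperCase() and keeps the locale-sensitive toLocaleUpperCase() away from anything used as a key.

```javascript
// Locale-independent: safe for building and looking up keys.
function fixedKey(name) {
    return name.toUpperCase();
}

// Locale-sensitive: fine for text shown to a user, risky for keys.
// With Turkish rules ("tr"), engines that support locale-aware case
// mapping are expected to turn "i" into a dotted capital, not "I".
function localeKey(name, locale) {
    return name.toLocaleUpperCase(locale);
}

var permissions = {};
permissions[fixedKey("edit")] = "may edit pages";

// Case-insensitive lookup that behaves the same for every user locale.
function lookup(name) {
    return permissions[fixedKey(name)];
}
```

Had the keys been stored with fixedKey() but looked up with localeKey(name, "tr"), a name containing ‘i’ could silently miss, which is the HashMap scenario described in the quoted article.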

 
 

Wiki visualisations with JavaScript: Processing.js and Raphaël

3 November 2008

Once every few months, Atlassian holds a “FedEx day”. Developers are free to work on a project of their choosing, with the aim being to deliver something within 24 hours.

For the October FedEx day, I wrote some visualisations for wiki data. My goal was to deliver some amazing visuals which could be applied to any data set. I wanted to explore the potential of two great new graphics libraries for JavaScript: Processing.js and Raphaël.

The specific data I used wasn’t really so important, but in this case I used the last 1000 comments on Atlassian’s Extranet. Surprisingly, one thousand comments on Extranet blogs takes only about 3 weeks to accrue.

For the live demos in this public example, I’ve changed the titles and usernames in the dataset from the original Extranet ones to some automatically generated usernames and titles from a news website.

Screenshots

Here are some screenshots:

Live demos

Try these in Firefox 3 or higher:

  • Contributors — a tree graph visualisation linking commenters and blog post authors.
  • Activity — a rippling visualisation of comment activity on the wiki. Based loosely on the Apple Arabesque screensaver.
  • Comments — a falling bar-graph visualisation of comments by blogpost. Based very much on a Flash visualisation by Digg, but reimplemented in JS.

All three visualisations are done with real data in time-lapse, so you’re seeing real data on the Extranet appear as you watch (albeit with the titles and usernames changed).

The last two only work properly in Firefox 3 or higher (because it has support for text in Canvas), and they run the best in a recent Firefox nightly build. The first one, contributors, is based on Raphaël and should work with any browser. These demos really tax your computer, and will fully peg one CPU of a quad-core Mac Pro. The animation may be very slow on older machines.

What they mean

The Contributors visualisation links together blog post authors with the people who comment on their blog posts. This means that people with a lot of links either comment on many blog posts or receive a lot of comments on their own.

Each icon repels each of the others, but all icons are drawn towards the centre of the image. Links between the icons pull them together, but at close range the repulsion is usually stronger than the attraction from a link.
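
The three forces described above can be sketched as a single layout step. This is an assumption-laden illustration rather than the demo's actual algorithm: layoutStep, the option names and the inverse-square repulsion are all made up for the example.

```javascript
// One step of a simple force-based layout.
// nodes: [{x, y}], links: [[indexA, indexB]], centre: {x, y}
// opts: {repulsion, gravity, spring} (hypothetical tuning constants)
function layoutStep(nodes, links, centre, opts) {
    var forces = nodes.map(function () { return { x: 0, y: 0 }; });

    // Every icon repels every other icon, more strongly at close range.
    for (var i = 0; i < nodes.length; i++) {
        for (var j = i + 1; j < nodes.length; j++) {
            var dx = nodes[i].x - nodes[j].x;
            var dy = nodes[i].y - nodes[j].y;
            var dist = Math.sqrt(dx * dx + dy * dy) || 0.01; // avoid dividing by zero
            var push = opts.repulsion / (dist * dist);
            forces[i].x += push * dx / dist;
            forces[i].y += push * dy / dist;
            forces[j].x -= push * dx / dist;
            forces[j].y -= push * dy / dist;
        }
    }

    // All icons are drawn towards the centre of the image.
    nodes.forEach(function (node, n) {
        forces[n].x += (centre.x - node.x) * opts.gravity;
        forces[n].y += (centre.y - node.y) * opts.gravity;
    });

    // Each link pulls its two icons together, like a spring.
    links.forEach(function (link) {
        var a = nodes[link[0]], b = nodes[link[1]];
        var fx = (b.x - a.x) * opts.spring;
        var fy = (b.y - a.y) * opts.spring;
        forces[link[0]].x += fx;
        forces[link[0]].y += fy;
        forces[link[1]].x -= fx;
        forces[link[1]].y -= fy;
    });

    // Move every icon by the total force acting on it.
    nodes.forEach(function (node, n) {
        node.x += forces[n].x;
        node.y += forces[n].y;
    });
}
```

With these force laws, two linked icons far apart move towards each other, while two linked icons almost on top of each other are pushed apart, because the repulsion grows much faster than the spring force as the distance shrinks.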

Each author’s icon starts to shrink unless they post another comment, in which case it jumps back to full size. Thus you can tell the age of each author’s comment from the size of his or her picture.

The Activity visualisation is a fairly abstract visualisation of commenting on a wiki. It displays a new ripple whenever a new comment appears. The colour and location of the ripple is dependent on the username, based on the username’s position in alphabetical order.

If there are many comments at a particular point in time, the entire graph rises up as the activity increases.

The Comments visualisation shows quite clearly where the comments on the wiki are going. Each falling block is a new comment, and it falls in the column representing the blog it was posted on. Blogs with more comments get brighter colours as time goes on. New blogs appear from the right as they are posted.

The titles at the bottom of the chart show the titles of the last seven blogs which received comments. The colours of the titles match the colours of the columns: brighter green means more comments.

This visualisation was an attempt to recreate the Digg Stack visualisation for our wiki in JavaScript. Unlike Digg, this visualisation has to use time-lapse data though, because comments on our wiki aren’t frequent enough to always be an interesting animation.

Tools and tricks

Probably the single most interesting thing I learnt was how to achieve useful colours in animations. The secret is to use HSB colour, which stands for Hue, Saturation and Brightness. Controlling these three variables separately, rather than quantities of red, green and blue in regular RGB colour, allows much simpler animation of changing colours.
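
To illustrate why HSB makes colour animation simpler, here is a minimal sketch of the standard HSB-to-RGB conversion, followed by a helper that varies only the brightness. Neither function comes from Processing.js or from the demos: hsbToRgb and columnColour are hypothetical names, and the green hue of 120 degrees and the cap of 50 comments are assumptions for the example.

```javascript
// Convert hue (degrees), saturation (0..1) and brightness (0..1)
// to an [r, g, b] array with components in 0..255.
function hsbToRgb(h, s, b) {
    var c = b * s;                                // chroma
    var hp = (((h % 360) + 360) % 360) / 60;      // hue sector, 0..6
    var x = c * (1 - Math.abs((hp % 2) - 1));
    var m = b - c;
    var rgb;
    if (hp < 1) rgb = [c, x, 0];
    else if (hp < 2) rgb = [x, c, 0];
    else if (hp < 3) rgb = [0, c, x];
    else if (hp < 4) rgb = [0, x, c];
    else if (hp < 5) rgb = [x, 0, c];
    else rgb = [c, 0, x];
    return rgb.map(function (v) { return Math.round((v + m) * 255); });
}

// Brighter green for busier columns: only one variable changes.
function columnColour(commentCount) {
    var ratio = Math.min(commentCount, 50) / 50;
    return hsbToRgb(120, 1, 0.25 + 0.75 * ratio);
}
```

Animating a colour then means nudging a single number per frame (brightness here, or hue for a rainbow effect) instead of coordinating three RGB channels.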

The frameworks I used gave a lot of help with getting simple things up and running. The frameworks were:

I found the raster graphics of Processing.js and Canvas particularly powerful for the fading animations used in “Activity”. This amazing effect is produced by simply drawing the black background in a semi-transparent manner with each frame rather than completely opaque. This is also the method used to leave “trails” behind the falling blocks in the “Comments” visualisation.
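
The fading trick can be sketched against the plain Canvas 2D API. This is a minimal illustration, not the demo code: fadeFrame is a hypothetical name and the alpha value is a guess. It takes the 2D context as a parameter, so it makes no assumptions about how the canvas was created.

```javascript
// Paint a translucent black rectangle over the whole canvas.
// Called once per frame before drawing, it dims whatever was drawn
// earlier instead of erasing it, which leaves fading trails.
function fadeFrame(ctx, width, height, alpha) {
    ctx.fillStyle = "rgba(0, 0, 0, " + alpha + ")";
    ctx.fillRect(0, 0, width, height);
}

// In a browser this might be used as:
//   var ctx = document.getElementById("canvas").getContext("2d");
//   fadeFrame(ctx, 640, 480, 0.1);   // then draw this frame's shapes
```

A smaller alpha leaves longer trails; an alpha of 1 is the same as clearing to an opaque black background.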

Where raster graphics falls down is with interactivity. I decided to use Raphaël for the “Contributors” visualisation to support drag-and-drop, and at one point I also had the pictures hyperlinking to the relevant personal spaces on our Extranet.

Processing.js had a few minor bugs that I fixed along the way. The HSB colour method didn’t work properly, and I needed to correct the font metrics functions. Hopefully I’ll be able to submit patches for these back to John soon.

All the animations are run with the standard JavaScript window.setInterval() method, but because JavaScript is single-threaded the framerate may actually vary based on the performance of the browser. You can click on any of the animations to pause them.

The overall structure of all the animations is a pipeline of components to render. The components are modelled as Objects in JavaScript, and I keep track of them in an array while the animation is running. Each object has two functions: update(), which is responsible for updating the internal state of the object to reflect its movement; and draw(), which is responsible for drawing the object onto the screen.

At each call of the timer we’ve set, we loop through all the objects twice: once to update them, and once to display them. The separate stages allow an update() function to remove its own object from the rendering queue. Some of the update methods are quite complex, updating many variables related to the component’s position, colour, velocity and interaction with other components.
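
A minimal sketch of such a pipeline follows. The update() and draw() functions come from the description above; everything else (the Animation and Ripple names, the add, remove, tick and start functions, and the way a ripple expires) is an assumption made for the example.

```javascript
function Animation() {
    this.components = []; // the rendering queue
    this.timer = null;
}
Animation.prototype.add = function (component) {
    this.components.push(component);
};
Animation.prototype.remove = function (component) {
    var index = this.components.indexOf(component);
    if (index !== -1) this.components.splice(index, 1);
};
// One frame: update everything, then draw whatever is still queued.
Animation.prototype.tick = function () {
    var i;
    // Loop over a copy so update() can safely remove its own object
    var snapshot = this.components.slice();
    for (i = 0; i < snapshot.length; i++) snapshot[i].update(this);
    for (i = 0; i < this.components.length; i++) this.components[i].draw();
};
Animation.prototype.start = function (intervalMs) {
    var self = this;
    this.timer = setInterval(function () { self.tick(); }, intervalMs);
};

// Example component: lives for maxAge frames, then removes itself.
function Ripple(maxAge, log) {
    this.age = 0;
    this.maxAge = maxAge;
    this.log = log; // stands in for real drawing in this sketch
}
Ripple.prototype.update = function (animation) {
    this.age++;
    if (this.age >= this.maxAge) animation.remove(this);
};
Ripple.prototype.draw = function () {
    this.log.push(this.age);
};
```

Keeping the update pass separate from the draw pass means a component that expires during update() is never drawn in a half-finished state.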

In the “Comments” visualisation, the most technically complex one, there are about 7 distinct types of components. Many of the components are linked together. The top-level component is a Column, representing one vertical bar in the graph. The Column object is linked to, and manages, any Blocks that are falling down in its column, as well as the Title which appears underneath the graph and is coloured the same as the column itself. The Author component, which appears at a certain height when a block is falling, is not tied to the column and manages its own separate lifecycle.

I came up with some interesting functions to get the right colour and movement effects for some of the components. For example, the radius of each circle in the “Activity” visualisation is a quadratic function of its age in frames. As the age ranges from 0 to 100 frames, the radius accelerates from 30 up to 400 pixels.

A similar quadratic formula was used for fading the author names on the Comments visualisation. In this case, I wanted the author to get slightly brighter initially before fading away. The function I came up with you can see above.
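
The original formulas are not reproduced in the text, so the two functions below are only plausible reconstructions. rippleRadius matches the stated endpoints (30 pixels at age 0, 400 pixels at age 100) with a curve that accelerates; authorBrightness is a guess at a quadratic that rises slightly before fading to zero. Both names and all coefficients other than 30, 400 and 100 are assumptions.

```javascript
// Radius grows quadratically with age: slow at first, then faster.
function rippleRadius(age) {
    var t = age / 100;            // 0 at birth, 1 at 100 frames
    return 30 + 370 * t * t;      // 30px up to 400px
}

// Brightness (0..1) that peaks shortly after birth, then fades out.
function authorBrightness(age, maxAge) {
    var t = age / maxAge;
    var value = 0.8 + 0.4 * t - 1.2 * t * t; // peaks at t = 1/6
    return Math.max(0, Math.min(1, value));
}
```

In both cases the shape comes from the squared term: a positive coefficient gives acceleration, and a negative one eventually overtakes the linear term and pulls the value back down.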

Another simple bit of math that turned out quite nicely was the overall shape of the Activity visualisation. I wanted some way to indicate overall activity on the wiki, and increasing the height of the graph seemed quite natural. What ended up working best was to plot the circles along a sine wave and increase the amplitude of the sine wave based on the activity. You’ll see this in the visualisation where the centre of the graph rises when the activity is highest.
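
As a rough sketch of that shape, the function below places a circle along half a sine wave whose amplitude scales with activity. The name rippleY, the parameters and the 0 to 1 activity scale are assumptions for the example, not the demo's code.

```javascript
// Vertical position for a circle at horizontal position x.
// activity: 0 (quiet) to 1 (busiest); maxRise: peak height in pixels.
// Canvas y grows downwards, so rising means subtracting from baseline.
function rippleY(x, width, baseline, activity, maxRise) {
    var amplitude = Math.min(Math.max(activity, 0), 1) * maxRise;
    return baseline - amplitude * Math.sin(Math.PI * x / width);
}
```

With no activity every circle sits on the baseline; as activity grows, the middle of the graph lifts the most while the two edges stay put.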

The algorithm for the tree in the Contributors visualisation is original, but not really optimal. Perhaps I’ll come back and fix that up at some later point.

Summary

I’m really happy with the way the visualisations turned out. I think there’s some amazing visuals there. I’m keen to get them up on some big screens around Atlassian, visualising our collaboration as it happens.

 
 

HTML 5, headings and sections

7 October 2008

Tonight there was a presentation at the Web Standards Group by Lachlan Hunt about some of the new facilities provided by new and upcoming web standards: HTML 5 and CSS 3. One point that proved interesting was his coverage of the sectioning feature of HTML 5.

Whereas HTML 4 had just six levels of headings for the entire document, the working draft for HTML 5 stipulates that each section has its own heading hierarchy. An h1 element that appears at the top level in a document is considered to “rank higher than” an h1 element found in a section or article within the document.

For example, rather than using <h1>, <h2> and <h3> elements for the headings in the sample shown below, you can use three nested <section> tags, each with its own <h1>.

Diagram showing HTML 5 markup for sections and headings
HTML 5 section and heading example

This might not seem much simpler in this basic example. In fact, to me it seems decidedly less simple. In the case where each section has its own distinct hierarchy of headings, the situation becomes even more confusing. However, I think the change makes a bit more sense if you consider it in light of two things.

First, the spec recommends keeping the heading hierarchy sane by using either <h1> tags throughout the document or keeping the headings in sync with the levels of sectioning. This latter case is similar to how you do it currently, just without the <section> tags.

Sections may contain headers of any rank, but authors are strongly encouraged to either use only h1 elements, or to use elements of the appropriate rank for the section’s nesting level.

Second, one of the main reasons why sections and other elements are allowed to contain their own heading hierarchy is to handle parts of a document included from elsewhere. There are many examples of this on the web: blogs, where a few articles appear each with its own heading structure; news sites, made up of sections each of which comes from its own page with its own headings; search engines, which display excerpts from other sites.

The point of the improvement is so that these sites that include other content don’t have to do any special processing to embed an external article or section with its own heading structure. The levels are automatically adjusted by the browser to account for the fact that these headings are relevant only within one subsection of the page.

So given these considerations, is it still worth the extra complexity of allowing six heading levels in every section within a document? I’m not sure. It does add a lot of complexity. In just a few minutes at the WSG meeting, we came up with a number of significant problems:

  • Search engine optimisation, or SEO, relies on extracting the heading information from the page. Rather than simply matching <h1>...</h1>, search engines now need to follow the fairly complicated process to determine the ranking of headings within the document.

  • Styling headings with CSS, particularly providing default styles, becomes much more verbose. Rather than using h1, h2, h3 { ... }, with HTML 5 you would need to define h1, section h1, section section h1 { ... }. This would probably be in addition to the old rules, if you’re including content with nested headings.

  • Automatically determining a table of contents is a lot more complex. As linked above, you need to follow a fairly tricky algorithm to determine the heading structure of a document.

  • With current DOM APIs, you can easily find all headings at a particular level in the document with document.getElementsByTagName('h2'). You’d need to use a query selector to do this with the new-style use-a-h1-for-everything structure. Without an efficient query selector, this is much trickier. Even with a query selector, it’s going to probably be a fair bit slower, which is a problem if you’re doing it often.
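
To make the extra work concrete, here is a much-simplified sketch of ranking headings by section nesting. It is not the outline algorithm from the HTML 5 draft: it walks a tree of plain objects instead of the DOM, and it approximates a heading's level as its own number plus the depth of enclosing sectioning elements. The collectHeadings name and the {tag, text, children} node shape are assumptions for the example.

```javascript
var SECTIONING = { section: true, article: true, nav: true, aside: true };

// Walk a tree of {tag, text, children} nodes and return every heading
// with a level that accounts for how deeply it is nested in sections.
function collectHeadings(node, depth, out) {
    depth = depth || 0;
    out = out || [];
    var match = /^h([1-6])$/.exec(node.tag);
    if (match) {
        out.push({ text: node.text, level: depth + Number(match[1]) });
    }
    // Children of a sectioning element sit one level deeper
    var childDepth = SECTIONING[node.tag] ? depth + 1 : depth;
    (node.children || []).forEach(function (child) {
        collectHeadings(child, childDepth, out);
    });
    return out;
}

// Three h1 elements at different nesting depths
var page = { tag: "body", children: [
    { tag: "h1", text: "Site" },
    { tag: "section", children: [
        { tag: "h1", text: "Article" },
        { tag: "section", children: [{ tag: "h1", text: "Part" }] }
    ] }
] };
// collectHeadings(page) gives levels 1, 2 and 3 for the three h1 elements
```

Even this toy version needs a full tree walk to answer a question that getElementsByTagName('h2') answers directly when heading numbers are absolute.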

Given these issues, I don’t consider it beneficial to make headings relative to the section that contains them. What authors gain by not having to adapt included content on the server side so it uses appropriate heading levels, they end up potentially losing due to the increased complexity in determining the outline of a document and styling headings consistently in different sections.

Perhaps if there is some benefit other than just for included content as I’ve mentioned above, a compromise solution might be to have better HTML APIs which allow access to sections and headings in a more meaningful way than the existing DOM methods like getElementsByTagName. I could imagine methods like HTMLElement.getSections() and HTMLElement.getHeadings() proving useful in addressing some of the concerns above.

 
 

Webjam 8 roundup

26 September 2008

Webjam 8 was the first Webjam I’ve been to, and I had a great time. Frenetic 3-minute talks from inspiring presenters and chatting to loads of web guys about their work — what could be more fun?

I found all the presentations interesting, but here are a few that still stand out the morning after:

  • Mister Speaker demoed his amazing TurnTubeList, which allows dynamic cross-fading of YouTube videos for party music mixing.
  • Dmitry gave a coding demo of Raphaël, where he threw together a dynamic reflection page in two minutes. TextMate wizardry FTW.
  • Diana gave an awesome talk about her work on the Local Government web network. She was just captivating: interesting ideas, great delivery.
  • A couple of Opera guys gave demos of their latest technology. What blew me away was a demo of their <video> tag implementation, and embedding video content in animated SVG. Totally amazing.

There were many other cool demos, and I had a great time catching up with contacts and friends. Thanks to Lachlan and the Webjam 8 team for putting on a great show.

Webjam photo
Webjam 8 (photo: Halans)

 
 

SpringSource has a new business model

23 September 2008

The Server Side broke the news a few days ago that SpringSource is changing their maintenance policy for the Spring Framework source code. In essence, they’ll no longer make public patch releases of the Spring framework modules more than three months after a major release.

Customers who are using SpringSource Enterprise, available under a subscription, will receive maintenance releases for three years from the general availability of a major new version. These customers receive ongoing, rapid patches as well as regular maintenance releases to address bugs, security vulnerabilities and usability issues, making SpringSource Enterprise the best option for production systems.

After a new major version of Spring is released, community maintenance updates will be issued for three months to address initial stability issues. Subsequent maintenance releases will be available to SpringSource Enterprise customers. Bug fixes will be folded into the open source development trunk and will be made available in the next major community release of the software.

Charles has written a good analysis of this change of attitude:

The language changes, often around the time the outside investors show up. The people who are downloading and using your software are no longer your community, they’re the ones who are taking your code without giving anything back. They’re the free-loaders.

The attitude of commercial software firms is always going to be at odds with what is best for an open source community to flourish around software. A software company tracks every piece of work done on open source software against their bottom line. “Should we implement this bug fix? It depends — how much will people pay for it?” It’s really hard to come up with a business model which also allows people to use the software freely at the same time as turning a profit, so instead the freedom of the software suffers.

So what’s the solution? How can a company sponsor open source without appearing to restrict work on the open source project to promote their commercial offerings?

One of the best ways is to set up an organisation which is separate to the company and responsible for the open source project. This organisation can be sponsored for a fixed amount by the company (or perhaps set up as a foundation), and is managed independently of the company.

In this model, the open source organisation has complete autonomy and owns the copyright for the code. Its goals are to enhance and promote the use of its code throughout the world. Many of the most successful open source organisations are set up as foundations in this way: Apache, Mozilla, Eclipse.

This avoids nasty situations like we have here. SpringSource, the company, could never decide that Spring, the open source project, should not publish patch releases to help the profitability of the company. Rather, SpringSource is just the company you go to for Spring expertise, because they have a great reputation and have most of the Spring committers on staff.

 
 
