A Core Textpattern Technique Addressing Internationalization Interests

Background

I originally wrote this article about 4 months ago, and it’s been sitting neglected ever since then. I have not published it until now because it was my intention to translate it into French and publish both articles as an example of what the write-up demonstrates. As much as I hate to admit it, however, my French writing abilities are still too rudimentary for long, technical articles like this one. Further, some fine gents in the Textpattern community got busy on the forthcoming l10n plugin—called a Multi-Lingual Pack, or MLP (I’ve been playing with a sponsors release and it’s pretty damn cool)—and I started thinking that maybe this article of mine had become obsolete, so I sat on it some more.

Then something interesting occurred to me. If you’ve made the rounds to enough Weblogs, generally those of popular people in the design industry, you’ve likely seen an article here and there that is noted as being available in one or more languages other than that of the author’s own site. In these cases, links are provided that take you to a third party’s site where the original article was translated into another language. Think about this a second. The original author benefits from the recognition of having their work translated, both parties benefit from a link exchange (one that’s actually worth having), and readers benefit because the article now reaches a wider audience. Not a bad trifecta.

What struck me is that this technique, which I was considering not publishing, is perfectly suited for this kind of ‘collaborative, international publishing’ in a way that beats the pants off of how it’s typically done. As such, this technique has regained some relevance and is perhaps worth publishing after all…serving a niche that even the great new MLP is neither designed nor intended to serve. Maybe you’ll agree and give this technique a try.

Even if you don’t use the technique for this collaborative publishing concept, you could still use the technique as I originally intended to present it, to publish articles in your own site in more than one language. Since that was the original intent of this article, I’ll begin by describing the technique in this way, then later interject how I’m actually employing the technique myself (again, because I’m not yet doing my own French writing).

Core Technique for Bilingual Publishing in Textpattern

The ability to publish articles in multiple languages in Textpattern is a common desire among many users, and for good reason. When you publish information in more than one language, you potentially open the doors for increased traffic to your Web site. Simple.

In Textpattern’s history, multilingual publishing has been a difficult thing to do. Various approaches have been taken, often involving some kind of hack to the core code that quickly became obsolete with the next Textpattern release.

There’s also the issue of modifying one’s site architecture to accommodate multilingual publishing, and of course how this will influence site usability. If a new site is being planned and built with multilingual publishing in mind, that’s one thing, but converting an existing site to two or more languages can be a bit more tricky. Hence, no historical approach has really been a viable solution for the masses.

The method I’m going to cover is probably not practical for complete site translations (though as mentioned in the beginning, advanced multilingual publishing is near at hand with the forthcoming MLP plugin); however, it might be just the ticket for anyone wanting to publish just the occasional translated article. Furthermore, we can even manage the process so translated articles are published as mirror images of one another, which in turn provides navigational context between language versions of a given article. This mirroring aspect is a big key here and will help tremendously with keeping coherence in the modified architecture, and thus with overall usability. We can accomplish this by using a combination of Textpattern’s custom fields and clever use of certain conditional Tags.

Note: This technique for mirroring translated articles was first introduced by the very talented Yuriy Linnyk in the Textpattern Support Forum. I don’t think too many folks saw it so I’m taking his buried gem and buffing it up to a delightful sparkle. All brilliance is credited to Yuriy.

Initial Uncertainty

When first thinking about this project, I was concerned with how URLs would factor into everything. The second language I intended to use is French, and like many languages written in the Latin alphabet, French uses a number of accented letters like é, è, ç, ô, and so forth. These accents are not easily handled in a clean manner in the URL if you try to retain them. For example, if I used a word like intérêts (meaning “interests”) in an article’s title then the word appears as follows in the respective URL: ...int%C3%A9r%C3%AAt.... You can see that %C3%A9 and %C3%AA are supposed to be é and ê, respectively.
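To see where those %XX groups come from, here is a minimal Python sketch using the standard library’s urllib.parse. It has nothing to do with Textpattern itself; it only illustrates how an accented word is percent-encoded from its UTF-8 bytes.

```python
from urllib.parse import quote, unquote

word = "intérêts"

# quote() encodes the text as UTF-8, then writes every non-ASCII byte as %XX.
encoded = quote(word)
print(encoded)  # int%C3%A9r%C3%AAts

# Each accented letter occupies two bytes in UTF-8, hence two %XX groups apiece.
print("é".encode("utf-8"))  # b'\xc3\xa9'
print("ê".encode("utf-8"))  # b'\xc3\xaa'

# Decoding restores the original word.
print(unquote(encoded))  # intérêts
```

Nothing is lost in the round trip, so the crufty form is harmless to software; it is only unpleasant for people to read.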

This crufty look is normal and reflects the standard manner in which particular strings of a URL (and other types of URIs) are generated from sequences of octets used in Internet protocols. All that’s way out of scope for this article, but if you are so inclined you can see the RFC 3986 memo on URIs, and one of its earlier and descriptive predecessors, RFC 1738. At any rate, such URLs were not going to be very helpful from a user standpoint, so I was concerned about how to proceed.

The solution, actually, is to not try and retain the letter accents in the URLs at all, but rather dirify them, and many thanks to Sencer Yurdagül, one of Textpattern’s developers (see Team Textpattern), who reminded me that Textpattern dirifies article titles into readable URLs automagically. A key aspect of this process is the internationalization file located in each Textpattern install at ../lib/i18n-ascii.txt, where language character codes exist to essentially help dumb down the complex strings of octets into more user-friendly URLs. Hence, one simply types the article title using letter accents characteristic of the language (as one would be accustomed to doing), and when the article is saved Textpattern strips the accents, leaving a more basic set of characters that the browser doesn’t maul.
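For a feel of what dirifying does, here is a rough Python approximation. To be clear, this is not Textpattern’s code: Textpattern works from its own character mapping file, while this sketch leans on Unicode decomposition instead, and the function name dirify is just a label I’ve borrowed for the illustration.

```python
import re
import unicodedata


def dirify(title: str) -> str:
    """Approximate an accent-free, URL-friendly title (illustrative helper only)."""
    # Decompose accented letters into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", title)
    # Keep only plain ASCII, which discards the combining marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


print(dirify("Mes Intérêts"))  # mes-interets
print(dirify("My Interests"))  # my-interests
```

One limitation of this shortcut: letters that have no decomposition (ø, for example) are simply dropped rather than mapped to a sensible replacement, which is exactly the kind of case a hand-maintained mapping file can handle better.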

I can only assume that other alphabets like Cyrillic, Arabic, and Chinese function similarly, assuming the right language packs, IMEs, and so forth are installed.

Preparing for Multilingual Publishing

I highly recommend you read this Basic Overview for Creating Multilingual Web Documents, which should get you on the right track for both Windows and Mac systems.

You will sensibly want to ensure all necessary components for creating multilingual articles are installed, configured, and ready for use, including:

  1. the language (or language groups) you want to use,
  2. any appropriate fonts/font-sets,
  3. the relevant keyboards/IMEs,
  4. and perhaps a language toolbar on your desktop to easily switch between languages. (Or if you are using Windows, simply press Alt-Shift to move back and forth between IMEs.)

Two other things normally needed for drafting and publishing multilingual Web documents are a Unicode-based editor for drafting Web copy, and template headers having the necessary coding to tell Web browsers the page’s content is properly encoded. We don’t have to worry about these two things here because Textpattern is already Unicode Transformation Format-8 (UTF-8) compliant. Textpattern sets explicit Unicode information in a template header for both the public and admin side of your site. This means that every browser that supports UTF-8 will let you use Textpattern as a fully UTF-8 compliant text editor. As long as you have the right languages and fonts installed, you’re ready to go with Textpattern.

If you are like me, you might sometimes prefer using a text editor to draft your initial copy instead of Textpattern’s Write panel. That’s fine too; just ensure that the editor is configured to encode text in UTF-8 before you start writing. If the editor doesn’t support UTF-8, get a different text editor.
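If you’re ever unsure whether a draft really was saved as UTF-8, a quick check is easy to sketch in Python. The helper name is_valid_utf8 is my own invention for this example; the idea is simply that bytes saved in another encoding (Latin-1 here) will usually fail to decode as UTF-8 once accented letters are involved.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True when the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


title = "Mes Intérêts"

# The same text saved two ways: once as UTF-8, once as Latin-1.
print(is_valid_utf8(title.encode("utf-8")))    # True
print(is_valid_utf8(title.encode("latin-1")))  # False
```

To test an actual draft, you would read the file in binary mode and pass its bytes to the same function.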

Accommodating a Second Language in Site Structure

Before you start publishing articles in multiple languages, you will want to consider how they will be integrated into your existing site. There are many things to take into consideration here, which are better addressed in another article, but they include such things as site architecture, usability, internationalization, search engine referencing, and so forth.

Let me just say this for now: the easiest way to deal with accommodating secondary language publications in a Textpattern site is to add a new Section named after the ISO two-letter code for the language you need (e.g., French would be “fr”). This is a good idea for a number of reasons: it’s short, easy to work with, and universally understood. This will be what shows up in the URL of a given secondary language article (assuming you use Sections in your URLs). From there you can use custom building blocks (Pages, Styles, Categories, Forms, ...), content, and semantics as needed. Again, I won’t describe all this in detail as it veers into topics of general site construction, which is not the focus of this article.

The main advantage of going with a separate Section is it allows you to keep your translated content separate from the publishing channels of your default language. For example, if you are working with English and French, you might not want your French translations appearing back-to-back with your English versions; that is pretty poor organization and bad usability. Hence, you might create a separate publishing channel solely for French publications, where you can then design the entire presentational interface in French too. The two language versions can be mirrored to establish topical and navigational context between them. Nice and efficient. Again, the forthcoming MLP plugin will enhance your abilities in this area tremendously; the technique here is for much smaller translation interests without need of a plugin.

The Process

Or as I like to say, the mechanics of it all.

Establishing the Mirror with Custom Fields

When you have adjusted your site architecture for the new language content and have all the language drafting protocols in place (language packs, fonts, IMEs, etc.), you’ll need to set up the necessary custom fields that you will use to create the contextual links between the default language article and its translated version. This dual-direction linking between language versions of the same article is known as mirroring an article, and it’s the same concept that Wikipedia employs for managing articles in multiple languages. To help establish the mirroring effect, you need to use two custom fields, one for the default language article and one for the article in the alternate language (you would need a third custom field for a third language, and so forth).

Assigning the custom fields is accomplished in the Advanced view of Textpattern’s Preferences panel (admin side). Scroll down the Advanced panel view to the Custom section, where the ten custom fields are situated.

To initially make the custom fields available for use, you give them a name by typing it into the corresponding field. By default, two custom fields are already named (custom1 and custom2), which you can simply rename for your needs. You want to give your custom fields names that logically indicate the language they will represent, and again the ISO two-letter standard is an intuitive and easy method to use; hence, if you are using English and French then you would name the two custom fields as en and fr, respectively (Figure 1).

Figure 1: Assigning custom fields in the Advanced preferences view of the Preferences subtab panel.

Any custom field that is given a name (or is renamed) will automatically appear in the Advanced Options column of the Write subtab panel. The names originally assigned will appear as field labels, while the actual fields will be empty and ready for use (Figure 2).

Figure 2: Custom fields ready for use in the Advanced Options of the Write subtab panel.

Before you can apply the custom fields effectively, you need to modify your article Form.

Modifying the Article Form with a Filtering Conditional

The objective here is to customize the main article Form you use so it can detect whether or not a given article has a translated version, and if so, to show the link for that version in the published article. Let’s say your current main article Form contains something similar to the following:

<txp:permlink><txp:title /></txp:permlink>
<txp:author link="0" />
<txp:posted />
<txp:body />
<txp:comments_invite />

You would likely have more XHTML in there too, but I’ll leave that out for clarity. The block of code is pretty straightforward at this point; it’s nothing but Textpattern Tags that define several things: an article title serving as a link to the article’s permanent location, the article’s author, a publishing date, the body of the article, and an invite message to add comments to the article.

The next step is to modify such a Form to recognize when an article is being published for translation purposes. Since we’re making use of custom fields, we can achieve the conditional process by using Textpattern’s <txp:if_custom_field>...</txp:if_custom_field> Tag. Hence, keeping in mind the names of the custom fields created earlier, we can set up a custom block of code as follows:

<txp:if_custom_field name="fr">
<a href="<txp:custom_field name="fr" />">Lisez en français</a>
</txp:if_custom_field>
<txp:if_custom_field name="en">
<a href="<txp:custom_field name="en" />">Read in English</a>
</txp:if_custom_field>

The above code uses two instances of the <txp:if_custom_field>...</txp:if_custom_field> Tag. Let’s walk through it. The first checks whether the article’s fr custom field has a value, meaning a French version exists to link to (name="fr")...

<txp:if_custom_field name="fr"> ... </txp:if_custom_field>

...and the second checks whether the en custom field has a value, meaning an English version exists to link to…

<txp:if_custom_field name="en"> ... </txp:if_custom_field>

Between the opening and closing of each custom field conditional is a simple XHTML anchor tag…

<a href="<txp:custom_field name="" />"> ... </a>

...but there are a couple of key things to note.

First, the values for the href="" attribute of each anchor tag are defined with another Textpattern Tag, namely <txp:custom_field />

href="<txp:custom_field name="" />"

...which outputs the value stored in a custom field, the field being identified by the Tag’s name="" attribute. In this case, the two custom fields being read are the ones we created earlier, namely, en and fr, and each will hold the path to the mirrored article.

Second, the hyperlink label is globally set here as well, and it’s done in such a way that when a reader is viewing the English article, the link label to the French translation is shown in French, and vice versa.

The entire block of conditional code is added to the existing article Form where you want the link labels to appear. If you’re not reading this from print, scroll up and take a look at the title of this article. I’ve used a form similar to the one I’ve just described to display the link labels at the top of the article (under the title) for maximum usability, so I inserted the code in the same line of text as the <txp:author link="0" /> and <txp:posted /> Tags.

The code for the entire article Form is (again, minus the XHTML here for simplicity):

<txp:permlink><txp:title /></txp:permlink>
<txp:author link="0" />
<txp:posted />
<txp:if_custom_field name="fr">
<a href="<txp:custom_field name="fr" />">Lisez en français</a>
</txp:if_custom_field>
<txp:if_custom_field name="en">
<a href="<txp:custom_field name="en" />">Read in English</a>
</txp:if_custom_field>
<txp:body />
<txp:comments_invite />

Now everything is in place, and a site owner just needs to write an article in two languages and set the custom fields accordingly. Do not be worried if you don’t plan on translating every article you write. The conditional logic above only outputs a second language link if you take the next step of filling in the custom fields.
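If it helps to see that conditional logic outside of Textpattern, here is a small Python simulation of it. This is only a model of the behavior described above, not anything Textpattern runs; the function name language_links and the dictionary standing in for an article’s custom fields are assumptions made for the illustration.

```python
# Link labels, each written in the language of the article it points to.
LABELS = {"fr": "Lisez en français", "en": "Read in English"}


def language_links(custom_fields: dict) -> list:
    """Return one anchor per language custom field that actually has a value."""
    links = []
    for code, label in LABELS.items():
        url = custom_fields.get(code, "")
        if url:  # plays the role of <txp:if_custom_field name="...">
            links.append(f'<a href="{url}">{label}</a>')
    return links


# An English article whose fr field has been filled in shows one link.
print(language_links({"fr": "/fr/mes-interets"}))
# ['<a href="/fr/mes-interets">Lisez en français</a>']

# An article with no translation leaves both fields empty, so nothing is output.
print(language_links({}))  # []
```

The point to take away is the second call: an untranslated article costs you nothing, because an empty field produces no link at all.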

Reminder: Don’t be confused by where my French link goes at the moment. Remember at the beginning I said I was employing this technique for ‘collaborative international publishing’ purposes, not for bilingual publishing within my own site. At this point in the article I’m still describing the process as you would do it for either objective, but I’m describing it in the context of bilingual publishing in your own site. I’ll switch gears in a little bit.

Applying the Custom Fields at Writing Time

The first article a person is likely to draft is the default language version; in other words, their native language.

It is my habit to make sure all my article variables are established correctly before getting too involved in the writing of a new article, whether for translation purposes or otherwise. In other words, I first add a good title and a couple lines of body text, then I immediately select my article status as “Draft”, choose one or two Categories as appropriate, make sure the right Section is selected…and then initially save it to the database. The article itself still needs to be completed, but that’s no big deal, I can come back to it. The important thing is my article is now saved into Textpattern so I can use the permanent location for mirroring purposes.

If you are writing your second language version too, then you would immediately repeat the same process above for the second version of the article. Again, don’t finish the article right away, but rather just get it started so the permanent location (and title) is established.

Now your two language drafts of the article are created (albeit unfinished), and because you created the drafts back-to-back, they are conveniently positioned next to each other in the list of articles in the Articles subtab panel of the administration interface, which is nice for article management reasons.

Finally, you want to make your link associations between the two language drafts, which establishes the mirroring effect. What this entails is adding the relative URL of one language draft to the custom field in the mirrored article that represents the alternate language, and vice versa. Yes, you will need to do some manual copy-and-paste between articles.

Now let’s look at all this in a little more detail.

Making the Mirror Association

For demonstration purposes, let’s say the titles for your two articles are My Interests, and the French equivalent, Mes Intérêts.

First open one of the language drafts (makes no difference which one) in the Write subtab panel and click the Advanced Options link to expand the Advanced Options column.

Tip: With respect to expanding the Advanced Options in the Write admin panel, there are a couple of nice plugins that can keep it expanded by default. The first is Rob Sable’s rss_admin_show_adv_opts, which is a simple little plugin designed specifically for this purpose. The other is Yuriy Linnyk’s ied_hide_in_admin, which enables this along with many other options for hiding features you don’t need users to see. You certainly would not need both plugins in the same site (if you even care about the column being expanded by default to begin with); choose only the one that suits your overall objectives.

At the bottom of the Advanced Options column is the URL-only title field, which is what you’re after at first. When in your English article you simply need to copy the title in the URL-only title field (Figure 3).

Figure 3: The URL-only title field in Textpattern’s Advanced Options showing English title.

Thanks to Textpattern’s ability to dirify article titles, the process is exactly the same for the French article title, because by this point the accented characters should have been replaced by the dirifying process (Figure 4). However, should you happen to see any accented characters, be sure to manually change them or you’ll end up with the crufty URLs I talked about earlier in this article.

Figure 4: The URL-only title field showing dirified French article title.

Advanced Tip: The disassociation between the URL-only title field and the published title of an article (as read in a Web view) is convenient for allowing an author to abbreviate the length of a given URL-only title while keeping the main article title intact. You may find this useful on an article-by-article basis if a given title is overly long.

For example, let’s say you had an article titled Everything You Ever Wanted to Know About Stone-ground Mustard. If you use the popular section/title type URLs, then you could be faced with a URL like http://www.yourdomain.tld/section/everything-you-ever-wanted-to-know-about-stone-ground-mustard. That’s readable, but not exactly convenient. Look how it potentially (or actually) pushes right out of my note box here because of its length. You don’t want to hand that off to a hapless user.

We could modify the URL-only title value to something like all-about-ground-mustard. This makes the URL much more conducive to hyper-linking and/or displaying in text, http://www.yourdomain.tld/section/all-about-ground-mustard, without changing the original title as it appears in the article. The change also retains enough topical meaning to keep the section/title type URL relevant as it was probably desired to begin with.

The copied URL titles now need to be pasted into the corresponding custom fields of the translated articles, and the appropriate Section name needs to be added to finish the relative path. Using our example titles (and assuming the English article lives in a Section named journal), the en custom field of the French article gets /journal/my-interests, while the fr custom field of the English article will have /fr/mes-interets (Figures 5 and 6, respectively).
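The cross-pasting is easy to get backwards, so here is the same step modeled in a few lines of Python. The dictionaries standing in for the two articles and the helper name mirror_path are hypothetical, made up purely for this sketch; in real life you are doing this by hand in the Write panel.

```python
def mirror_path(section: str, url_title: str) -> str:
    """Join a Section name and a URL-only title into a relative path."""
    path = f"/{section.strip('/')}/{url_title.strip('/')}"
    # Accented characters here would bring back the crufty %XX URLs.
    if not path.isascii():
        raise ValueError(f"path still contains non-ASCII characters: {path}")
    return path


# Hypothetical stand-ins for the two drafts saved earlier.
english = {"section": "journal", "url_title": "my-interests", "custom": {}}
french = {"section": "fr", "url_title": "mes-interets", "custom": {}}

# Each article stores the path of the OTHER language version.
english["custom"]["fr"] = mirror_path(french["section"], french["url_title"])
french["custom"]["en"] = mirror_path(english["section"], english["url_title"])

print(english["custom"])  # {'fr': '/fr/mes-interets'}
print(french["custom"])   # {'en': '/journal/my-interests'}
```

Notice the crossover: the fr field lives on the English article and the en field lives on the French one.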

Figure 5: French article’s “en” custom field to which is added the English-version article path.

Figure 6: English article’s “fr” custom field to which is added the French-version article path.

Once this is done, you will see the custom language links appear in the respective location of each article. As mentioned earlier, my links appear right after the publishing date under the article’s title. You are now ready to go back and finish drafting each of the two article versions, likely starting with your comfort language first. When you finally publish both articles, they’ll output to their respective locations in your site and can easily be accessed from one to the other via the mirror links. Pretty sweet.

Note: When you fill in the custom fields, your language links will appear in Web view whether you have finished translating the second article or not, so if your second article is not finished, don’t use the custom fields yet. Why? Because providing links to content that is not even there will turn potential international readers off, not attract them. You don’t want to lose visitor interest from a silly mistake like that.
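As a thought experiment, the rule in that note can be written down as a simple check. The sketch below is Python working on a made-up dictionary of articles keyed by their relative paths; it does not read from a Textpattern database, and the function name premature_links and the data layout are assumptions for illustration only.

```python
def premature_links(articles: dict) -> list:
    """List (source path, language, target path) for links to unpublished targets."""
    problems = []
    for path, article in articles.items():
        for lang, target in article["custom"].items():
            target_article = articles.get(target)
            # A link is premature if its target is missing or not yet live.
            if target_article is None or target_article["status"] != "live":
                problems.append((path, lang, target))
    return problems


# Hypothetical data: the English article is live, the French one is a draft.
articles = {
    "/journal/my-interests": {"status": "live", "custom": {"fr": "/fr/mes-interets"}},
    "/fr/mes-interets": {"status": "draft", "custom": {"en": "/journal/my-interests"}},
}

print(premature_links(articles))
# [('/journal/my-interests', 'fr', '/fr/mes-interets')]
```

Here the English article is flagged because its French link points at a draft. Emptying the fr field until the translation goes live would clear the list.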

Collaborative International Publishing: The Alternative

Now you know the mechanics of how it all works. I can now switch focus from bilingual publishing within a site to collaborative international publishing between separate sites.

Collaboration Between Two Individual Authors

Under normal conditions, one might use this technique to collaborate with another author who wants to translate and host a different language version of your article. There are several positive things to take from this:

  1. You have the satisfaction of being recognized for writing an article deemed worthy of translation. This is generally not your choice; rather, someone else contacts you and says “hey, great article, can I translate that into [insert language here]?” This is a cool quality-control aspect that is inherent in the whole collaboration process. If your article is really good, it gets recognition and potentially translated. That’s a nice feather in the cap that should encourage you to write better.
  2. Both parties (you and the translator) benefit from the link exchange because links help improve referencing in search engines. Additionally, and this is what I really value it for, it’s a quality link exchange; that is to say, it’s a link exchange that not only boosts your rankings, but it provides targeted, usable content at each end. To me, that’s link exchange done right and for admirable reasons.
  3. The international community benefits because now quality information is opening up to a wider audience base. This contributes to the positive feedback loop of bringing more positive attention your way and perhaps even more interest to translate your work in more languages.

Collaboration Between an Individual and a Community

As I alluded to earlier, this is currently the way I’m using this conditional filtering technique. It is often my position to support TextBook as much as possible. Because TextBook is multinational, and in fact runs on the same wiki system as Wikipedia, which uses a mirroring model similar to the one I’m using here, I can use that to everyone’s advantage (mine and the Textpattern community’s).

Here’s how it works. I add a link to my article in the TextBook’s English version of the Article Contributions page. From that page, wiki users can now easily get to my site for the English version of the article. This is effectively the collaborative link back to my site. Thereafter, translators can begin the translation process by first translating the Article Contributions page, and then the articles on that page themselves. From my site, I’m particularly interested in pointing to a French version of the article as the mirror, so my mirror link points to the TextBook page representing the French version of the articles. Other languages can jump into the fray via the wiki too, which takes the overhead of link management off my shoulders here at wion.com (an issue I present in the following Discussion).

And there it is. We’ll give this a run for a while and see how the community responds. If the wiki turns into stale bread, as might be the case, then I’ll work with a single French translator, or I’ll simply do my own translations once my French technical writing is better.

Discussion

The previous sections point out a lot of positive gain from multilingual publishing, and different benefits come into play depending on which way you might utilize the conditional filtering technique described in this article. However, there’s a potential problem arising from all of this: what if you get asked by 20 different authors, representing 20 different languages? Heck, what if it’s even just 5?

Should such a situation present itself, the value would be great, but accommodating it poses some interesting issues to consider. As an example, how would you implement it in your architecture and workflow? The conditional filtering technique described here could be modified in clever ways, but would also require a lot of manual custom field handling, not to mention use up most (if not all) of your custom fields. Maybe this is an interesting focus for a new plugin, or perhaps it’s simply handled by a list of language links. The latter would certainly be a viable solution, but ideally you would still want it to be in context on an article-by-article basis as the conditional logic provides. Interesting things to think about indeed.

  1. Maniqui :: 25 November 06 :: #

    [Editor’s note: comment originally submitted 23/11/2006 at 07:30 GMT+1]

    Another great article, Destry, teaching the power behind TxP.

    This method seems to be great for simple small business websites when the client desires to publish a product description in more than one language.

    Thinking out loud:

    I haven’t tested graeme’s gbp_permanent_links plugin but maybe, it can be used to rewrite the URL of the translated article when it’s published inside the same site.

    In your example, /fr/mes-interets may be rewritten as /journal/fr/mes-interets, adding more “consistency” to the URLs.

  2. Destry :: 25 November 06 :: #

    Thanks, Maniqui. I think this is a good method in general for situations where you just want to publish the occasional article in another language, and not actually duplicate your whole site. In that sense, there are many possible applications.

    I rather like the collaborative (shared link) publishing idea I talked about too because it allows people who are not bilingual to get in on the fun.

    I haven’t tested that plugin either, but it’s an interesting idea. As for the inclusion of the journal/, that’s a really good point and starts to lap back into the discussion about the scope of the site’s translations (i.e., occasional article versus more of the site as a whole). There’s also the issue of having URLs in the language they reflect; in this case “journal” just happens to be the same in both English and French, but usually sections will require different labels respective of the language. More to consider.

    Multilingual publishing on the Web opens many interesting puzzles with respect to standards, architecture, semantics, design…it’s quite interesting and there’s lots to talk about there.

  • A Core Textpattern Technique Addressing Internationalization Interests, 14 November 06

    This article presents a core Txp technique for managing internationalization efforts, and three methods of use are described: 1) multilingual publishing within site, 2) collaborative international publishing between individuals, and 3) an alternate approach to #2 that essentially takes a community slant.
