HTML validation

A client recently had a problem where content that was sent for translation wasn't translated fully. The returned translation was only half done and appeared to be truncated. After a quick investigation I discovered there was an error in the HTML for a node.

Instead of:

<a href='...'>link text</a>

they had:

<ahref='...'>link text</a>

(a space was missing between 'a' and 'href')

Browsers often manage to display pages with broken HTML, so it's difficult to notice there's a problem. In this case the link is missing but the text is all there.

Broken HTML leads to problems

This is what ICanLocalize does when you send content for translation:

  1. The node title and body text are sent to the ICanLocalize server
  2. The ICanLocalize server parses the HTML
  3. The HTML parser extracts the text for translation

The HTML parser extracts the text in such a way that translators only have to edit text and not HTML tags. While translators are editing, a preview panel shows them how the translated document would appear.
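To give a feel for the extraction step, here is a minimal sketch in Python. It is not the ICanLocalize parser; it only assumes the general idea described above, and the names TextExtractor and extract_text are made up for this illustration. It uses the standard library's html.parser to collect the text between tags so that the tags themselves never reach the translator.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text found between tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.segments = []

    def handle_data(self, data):
        # Keep only runs that contain real text, not bare whitespace.
        text = data.strip()
        if text:
            self.segments.append(text)


def extract_text(html):
    """Return the translatable text segments of an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.segments


print(extract_text("<p>Read the <a href='...'>link text</a> first.</p>"))
```

A real system would also have to remember where each segment came from so the translated text can be put back into the original markup, which is where malformed tags start to cause trouble.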

Our parser is fairly robust but obviously in this situation it failed.

Remember that the ICanLocalize server is not the only computer to process your pages. Search engine crawlers (such as Googlebot) read your pages and try to make sense of them. When they encounter broken HTML, they get confused. Parts of a page, or even entire pages, can be lost if search engines cannot process them and cannot follow links.

Make sure your HTML is valid

Before sending content for translation it's always a good idea to make sure the HTML in your nodes is valid. I've done a quick search for Drupal markup validation modules and this one looks useful:

You can use the validator directly:

My personal favorite is the Firefox HTML validator. It's pretty simple. Green means GO, red means no-go and yellow means check. You get instant validation for entire pages as soon as you save the first draft.
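If you want a quick scripted check as well, the sketch below shows the kind of problem a validator catches. It is only a rough tag-balance check written for this post, not a replacement for a real validator, and the names TagBalanceChecker and find_tag_problems are hypothetical. It relies on Python's standard html.parser and reports closing tags without a matching opening tag and tags that are never closed.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records mismatches."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> need no matching close.
        pass

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"closing </{tag}> has no matching opening tag")


def find_tag_problems(html):
    """Return a list of human-readable problems (empty if none found)."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    unclosed = [f"<{tag}> is never closed" for tag in checker.open_tags]
    return checker.problems + unclosed


print(find_tag_problems("<a href='...'>link text</a>"))
print(find_tag_problems("<ahref='...'>link text</a>"))
```

The first call should report nothing, while the second flags the stray closing tag from the missing-space example at the top of this post.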

Using different images in translated content

I've been looking at how we can include different images in translated content. Translating the text in content is fairly straightforward, but keeping track of hundreds of different images in many languages quickly becomes a serious problem.

We have had an inquiry from a client who has an online help system using Drupal. Many of the help pages include screenshots and these screenshots need to be different in each language. We're talking about thousands of images, translated into at least six languages.

Translating Nodewords for multilingual SEO

The nodewords module allows you to add meta tags to your content to improve search engine optimization (SEO). When you translate content into other languages you want to translate the meta data as well, so translations are also optimized for search engines.

ICanLocalize translator now translates nodewords meta tags along with the rest of the node's content. Nodewords fields that need to be translated (abstract, description, keywords) are sent for translation, and other fields that need to be synchronized (like index enable) are copied to the translated nodes.
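The split between translated and synchronized fields can be pictured with a small Python sketch. This is an illustration only, not the module's code: the function name plan_meta_tags and the "robots" key in the usage example are assumptions, while the three translated field names come from the paragraph above.

```python
# Fields whose text goes to the translator (taken from the post).
TRANSLATED_FIELDS = ("abstract", "description", "keywords")


def plan_meta_tags(source_tags):
    """Split a node's meta tags into text to translate and values to copy."""
    to_translate = {name: value for name, value in source_tags.items()
                    if name in TRANSLATED_FIELDS}
    to_copy = {name: value for name, value in source_tags.items()
               if name not in TRANSLATED_FIELDS}
    return to_translate, to_copy


# "robots" is a hypothetical stand-in for a setting like index enable.
to_translate, to_copy = plan_meta_tags({
    "description": "How to validate HTML before translation",
    "keywords": "drupal, translation",
    "robots": "index",
})
print(to_translate)
print(to_copy)
```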

This allows running search-engine optimized multilingual websites.

Managing translated menus - part 2

In the previous post I talked about managing translated menus in Drupal.

The new menu-sync functionality we've added to ICanLocalize translator helps run multilingual menus without having to spend time managing translation.

It detects any changes you make in the menu in the original language and automatically applies them to all translations.

The clip below shows it in action. You can see how I'm editing the English menu and the German menu follows.

In the video I show:

  • The original English and German menu structure
  • Modifying the English menu structure
  • Saving the English menu
  • How the German menu is synchronized with changes to the English menu

Managing translated menus

I've been investigating ways of organising translated menus in Drupal. In the past I used to create one big menu that contains both the original menu items and the translated menu items. This works reasonably well for smaller menu structures, but when the menu size and the number of languages grow it can quickly become unwieldy.

Consider a site that has 100 menu items organized in a menu hierarchy 3 or 4 levels deep. Then add 5 languages and you now have a menu of 600 menu items!

An alternative is to have a menu for each language and use an i18n variable to choose which menu to display according to the language. To do this:

1) Set the i18n_variables in the settings.php file (idea came from Drupal's Multilingual Variables page):

$conf['i18n_variables'] = array(
  // Primary and secondary links (variable names assume Drupal 6)
  'menu_primary_links_source', 'menu_secondary_links_source',
);

2) Create a menu for each language (Administer > Site building > Menus > Add Menu) and give each one a meaningful title and name like "Main Navigation", "Main Navigation (Spanish)", etc. Then, when creating translated content, use the corresponding root menu.

3) Go to the menu settings page (Administer > Site building > Menus > Settings) once for each language and set the source for the primary and secondary menus.
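To show why marking a variable as multilingual lets each language pick its own menu, here is a small Python model of the lookup. It is only a sketch of the concept and not how the i18n module is implemented. The VariableStore class, the two variable names (assumed Drupal 6 names) and the menu names in the usage example are all assumptions made for illustration.

```python
# Variables that may hold a different value per language (assumed names).
I18N_VARIABLES = {"menu_primary_links_source", "menu_secondary_links_source"}


class VariableStore:
    """Toy model of site variables with per-language overrides."""

    def __init__(self):
        self.shared = {}
        self.per_language = {}  # (language, name) -> value

    def set(self, name, value, language):
        # Multilingual variables are stored separately for each language.
        if name in I18N_VARIABLES:
            self.per_language[(language, name)] = value
        else:
            self.shared[name] = value

    def get(self, name, language, default=None):
        # Prefer the value saved for this language, else the shared one.
        if name in I18N_VARIABLES and (language, name) in self.per_language:
            return self.per_language[(language, name)]
        return self.shared.get(name, default)


store = VariableStore()
store.set("menu_primary_links_source", "menu-main-navigation", "en")
store.set("menu_primary_links_source", "menu-main-navigation-es", "es")
print(store.get("menu_primary_links_source", "es"))
```

This is why step 3 has to be repeated once per language: each save only sets the value for the language you are currently viewing.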

Translating a Drupal blog entry

With the ICanLocalize Translator module it's easy to get content translated. The following screencast goes through the steps of using the module to translate a blog post from this website.

The steps are:

  • Select content to translate
  • Select languages to translate to
  • Click "Translate selected documents"

The blog that we are translating is:

The screencast was done on Friday. It's now Monday morning and the translations are already published.

Note: ICanLocalize Translator can handle any Drupal content type, not just posts. This includes pages, books, CCK, taxonomy, menus and strings.

Translating strings with the localization client

Simple on-screen translation.

Another way of translating the user interface is with the localization client (l10n_client). This module adds a simple user interface to all your pages so you can search for untranslated text and enter translations for it.

Below is a screencast showing the localization client in action.

The video covers:

Translating strings in the Drupal user interface

User interface strings

As well as the content on a website, several other things need translating. Strings appear in the user interface from several sources: core code, modules, themes, etc. Many of these translations can be obtained by downloading them from module and theme sites.

Drupal's translate interface

If the translations for strings in the user interface aren't available, how do we translate them?

Avoid huge URL aliases for non-English languages

When the title is translated, the automatic alias module creates an alias from it, which can be a very long text in Asian and Middle Eastern languages (Arabic, Hebrew and Persian).

To overcome this it may be better to maintain the English alias. The following code does just this: it copies the alias of each English node to its translated nodes.

We will add this feature to our ICanLocalize module, so you can choose which languages to copy the aliases to.
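As a rough sketch of the idea in Python rather than Drupal code: each node is modeled as a plain dictionary, the "tnid" key is loosely modeled on Drupal's translation set id, and the function name copy_english_aliases and the sample paths are hypothetical. The function looks up the English alias of each translation set and applies it to the translations in the languages you choose.

```python
def copy_english_aliases(nodes, target_languages):
    """Give translated nodes the alias of their English original.

    nodes: list of dicts with "tnid", "language" and "alias" keys.
    target_languages: languages whose aliases should be overwritten.
    """
    # Map each translation set to the alias of its English node.
    english_alias = {node["tnid"]: node["alias"]
                     for node in nodes if node["language"] == "en"}

    for node in nodes:
        if node["language"] in target_languages and node["tnid"] in english_alias:
            node["alias"] = english_alias[node["tnid"]]
    return nodes


nodes = [
    {"tnid": 1, "language": "en", "alias": "blog/html-validation"},
    {"tnid": 1, "language": "he", "alias": "blog/a-very-long-generated-alias"},
    {"tnid": 1, "language": "de", "alias": "blog/html-validierung"},
]
copy_english_aliases(nodes, {"he"})
print([node["alias"] for node in nodes])
```

In this example only the Hebrew node takes the English alias; the German alias is left alone because German is not in the chosen set.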

ICanLocalize donates Hebrew translation to Open Atrium and Drupal

We're very excited about Open Atrium and its applications for running distributed operations (just like our business). Since I'm a native Hebrew speaker and run a translation business, I thought it might be a good idea to help out with Hebrew translation for Open Atrium and Drupal.

Our Open Atrium affair

The folks at Linnovate, the Israeli Drupal powerhouse, introduced us to Open Atrium and shared some of their plans for it. We immediately caught the virus and saw how it can help in our own projects. Since it's a product for end users and not just a geek-project, we figured it needs to speak the user's language.

As I'm writing these lines, Open Atrium has some legacy Hebrew translation, accumulated over time. 22% of the GUI is translated, so we've got 78% to go. That's over 57,000 words.

Challenges translating Open Atrium

This project is the first one ICanLocalize is doing that involves large-scale right-to-left GUI localization, so before we could start translating a single string, a lot of groundwork had to be done.

For the translation to look good and read naturally to native speakers, it needs to be consistent and fluent. The translator must understand the system and cannot translate out of context.

ICanLocalize's resource translation system vs. Drupal's localization server

When we set out to do this work, we first had to select the translation tool. We considered three alternatives:
