The far side of localisation

I’m sure many of you have to deal with localisation of strings and inputs from users. You probably outsource the actual translation to a translation house and get back a list of strings you can’t really read, but that’s why you hire a company to make your website say “你好!” instead of “Hello!”. Don’t worry, however: I am not going to write a post on the wonders of testing in a language you don’t know, mainly because I’ve never done it. That subject has almost certainly been covered in blog posts already, and it deserves to be.

My experience lies on the far side of that localisation. You might be set up to receive a string in a non-Latin script, but how exactly does the user input that string? Usually the answer is an input method editor (IME).
My experience is mostly with Mandarin Chinese, so my examples will largely be in that language as opposed to Japanese, Korean or other varieties of Chinese. If I forget to mention which IME I am using, you can probably assume it’s Microsoft’s Chinese Simplified PRC.

So first of all, why am I sharing this information? Simply put, I think this is something many people will be unfamiliar with, and understanding it might help you to understand your non-Latin userbase better.

So, what does an IME look like?

[Image: IME-basic]
Something like this. Here I’ve typed in the phrase “hanzi”, meaning Chinese characters in Chinese. As you can see there’s a range of options, because Chinese is a language of homophones differentiated (sometimes) by tone*. For any arrangement of valid syllables, there can be many valid conversions into Chinese characters, so the user is presented with a drop-down menu showing all the choices.
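To make the one-to-many mapping concrete, here is a minimal sketch of a candidate lookup. It is a toy, not how Microsoft's IME works: the table is a tiny hand-picked sample, and the names `CANDIDATES`, `candidates_for` and `commit` are made up for this illustration.

```python
from typing import Dict, List

# Toy candidate table: toneless pinyin -> words that share that spelling.
# Hand-picked sample only; a real IME uses a large dictionary and ranking.
CANDIDATES: Dict[str, List[str]] = {
    "hanzi": ["汉字", "汉子", "汗渍"],  # Chinese characters / man, fellow / sweat stain
    "shi": ["是", "十", "时", "事", "市"],
}


def candidates_for(pinyin: str) -> List[str]:
    """Return the candidate list for a toneless pinyin string (hypothetical helper)."""
    return list(CANDIDATES.get(pinyin.strip().lower(), []))


def commit(pinyin: str, choice: int) -> str:
    """Simulate the user picking the 1-based entry from the candidate menu."""
    options = candidates_for(pinyin)
    if not 1 <= choice <= len(options):
        raise ValueError(f"no candidate {choice} for {pinyin!r}")
    return options[choice - 1]


if __name__ == "__main__":
    for number, word in enumerate(candidates_for("hanzi"), start=1):
        print(number, word)
    print("committed:", commit("hanzi", 1))
```

The point for testing is that the keystrokes and the committed text are different things: until the user picks a candidate, your input field may see nothing, or only an uncommitted preview.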

How does this affect you? Perhaps your input options include hints. How do these hints interact with the IME? Is the layout of your page conducive to this kind of input? If you test a desktop application, does it try to steal input from the OS? A good example of this would be a game with a chat channel: would you steal input while the chat channel isn’t focussed? Do you have a way to ensure that you don’t steal input if someone enables or disables the IME inside the game?
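One way to turn those questions into test cases is to write down the routing policy you expect and enumerate the states. The sketch below is a hypothetical model, not any real game's input code: `InputState`, `route_key` and `state_matrix` are invented names, and the policy (never take keys away from an active composition) is an assumption you would replace with your own product's rules.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class InputState:
    chat_focused: bool   # is the chat box the active input target?
    ime_enabled: bool    # has the user switched the IME on?
    ime_composing: bool  # is there an uncommitted composition in progress?


def route_key(state: InputState) -> str:
    """Decide who should receive the next keystroke: 'ime', 'chat' or 'game'."""
    if state.ime_composing:
        return "ime"   # assumed policy: never steal keys mid-composition
    if state.chat_focused:
        return "chat"
    return "game"


def state_matrix() -> List[Tuple[InputState, str]]:
    """Enumerate every reachable state with its expected key destination."""
    rows = []
    for chat, enabled, composing in product([False, True], repeat=3):
        if composing and not enabled:
            continue  # composing with the IME switched off is not reachable
        state = InputState(chat, enabled, composing)
        rows.append((state, route_key(state)))
    return rows


if __name__ == "__main__":
    for state, target in state_matrix():
        print(state, "->", target)
```

Each row of the matrix is a manual test: put the application in that state, press a key, and check that it lands where the policy says it should.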

You can try this out yourself by installing the Chinese keyboard input pack on Windows or SCIM on Linux. I’m sure there’s an equivalent for the Mac, but I’m not familiar with the platform. If you decide to really see what the issues are around IMEs, you’ll probably want to try out a few different languages, starting with the so-called CJK languages (Chinese, Japanese and Korean).

However, for me these languages present a slightly different challenge. Due to the nature of the software I test, we can’t delegate language handling to the OS completely; instead we have to implement some areas of it ourselves. Thus while I can read and write a little Chinese and much less Japanese, I’ve recently had to pick up enough Korean to test the Korean IME. To this end I actually resorted to the Microsoft documentation on the Korean IME*, which showed that while I could type “hangul”, the characters that appeared were not the ones for Hangul.
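If you can’t read the script, you can still check what was produced by looking at code points. The sketch below uses the standard Unicode arithmetic for composing a Hangul syllable block from conjoining jamo, and cross-checks it against Python’s `unicodedata.normalize`. The helper names `compose_syllable` and `describe` are made up here, and the example assumes conjoining jamo (U+1100 block) rather than whatever your IME emits internally.

```python
import unicodedata

# Constants from the Unicode Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_syllable(lead: str, vowel: str, tail: str = "") -> str:
    """Combine a leading consonant, a vowel and an optional final consonant."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(tail) - T_BASE if tail else 0
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT
            and 0 <= t_index < T_COUNT):
        raise ValueError("not a conjoining jamo sequence")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


def describe(text: str) -> list:
    """List each character as 'U+XXXX NAME' so output can be compared by eye."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


if __name__ == "__main__":
    # The two syllables of the word Hangul, built from their jamo.
    han = compose_syllable("\u1112", "\u1161", "\u11AB")
    geul = compose_syllable("\u1100", "\u1173", "\u11AF")
    print(han + geul)
    print(describe(han + geul))
    # NFC normalisation performs the same composition.
    jamo = "\u1112\u1161\u11AB\u1100\u1173\u11AF"
    print(unicodedata.normalize("NFC", jamo) == han + geul)
```

Comparing code points like this lets you confirm that what appeared on screen is the text you expected, even if the glyphs all look alike to you.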

I’m not suggesting that everyone who wants to test using an IME learn the languages of the IME, because that would take far too much time and not everyone enjoys learning languages. However, knowing a few words of each and, importantly, how to input them on your supported platforms could be extremely useful.

* https://support.microsoft.com/en-us/kb/130053
