Convert text label files to UTF-8 encoding

March 28, 2019

In many multi-language enterprise applications, the labels for each supported language are kept in separate text files, and those files end up scattered across the source code.

In my case, all the labels are kept in Java property files with the “.properties” file extension, and each language has its own suffix at the end of the file name. For example, a Romanian property file is named *_ro.properties.

There is a change to be aware of when updating from Java 7 to Java 8 (see link):

Java 8 “fixes” an error in Java 7 and replaces invalid UTF-8 byte sequences with a replacement string, which is in accordance with the UTF-8 specification.

So, for Java 8+ to interpret the translation files correctly, we have to make sure they are encoded in UTF-8, not ISO-8859-1.

On Linux there is a handy tool for this conversion: “iconv”. The following is a bash script that finds all the translation files (*_ro.properties), starting from the current directory, and re-encodes them to UTF-8.

See here the script file.
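
For illustration, here is a minimal sketch of what such a script can look like. It is not necessarily identical to the linked file. It assumes the files are currently ISO-8859-1, as described above, and the function name convert_properties is my own placeholder. As a precaution, it skips any file that already decodes as valid UTF-8, so running it twice should not double-encode anything.

```shell
#!/usr/bin/env bash
set -eu

# convert_properties DIR
# Re-encode every *_ro.properties file under DIR (default: current directory)
# from ISO-8859-1 to UTF-8, in place. The name is a placeholder for this sketch.
convert_properties() {
    find "${1:-.}" -type f -name '*_ro.properties' -exec sh -c '
        for file in "$@"; do
            # Heuristic: a file that already decodes as UTF-8 is left alone.
            if iconv -f UTF-8 -t UTF-8 "$file" >/dev/null 2>&1; then
                echo "skipped (already valid UTF-8): $file"
                continue
            fi
            tmp=$(mktemp)
            # Assumption: anything that is not valid UTF-8 is ISO-8859-1.
            if iconv -f ISO-8859-1 -t UTF-8 "$file" >"$tmp"; then
                cat "$tmp" >"$file"   # overwrite contents, keep file permissions
                echo "converted: $file"
            else
                echo "failed: $file" >&2
            fi
            rm -f "$tmp"
        done
    ' sh {} +
}

convert_properties "${1:-.}"
```

Run it from the root of the source tree, or pass a directory as the first argument. It is a good idea to commit or back up the files first, because the conversion overwrites them. For another language, change the *_ro.properties pattern accordingly.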
