Multilingual folder and file names, UTF-8 and transliteration

Written by ColinM. Posted in Language Related

Multi-lingual folder and file names

It is clearly much better for users if they can see items in their own language.  Throughout Joomla! there are many language translations, and the same applies to jDownloads, where many languages have been added and more continue to be added.  These language files use what is known as utf-8 encoding to represent each character in the chosen language.  If you would like to know more about utf-8 then a good place to start is https://en.wikipedia.org/wiki/UTF-8
 
Since about 2008 utf-8 has been the dominant character encoding on the web.
Joomla! supports utf-8 in content and in the database, and so does jDownloads.  jDownloads goes further and also supports utf-8 in Download names (file names) and in Category names (folders/directories).
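
As a rough illustration of what utf-8 encoding means for a file name, the short Python sketch below counts the characters and the bytes in a name.  The sample name is made up for this example and is not taken from jDownloads.

```python
# Hypothetical file name, written with an escape so the example is unambiguous.
name = "Pr\u00fcfung.pdf"        # "Prüfung.pdf"

# Encode the name as utf-8: plain ASCII letters take one byte each,
# the u with umlaut takes two bytes.
encoded = name.encode("utf-8")

print(len(name))       # number of characters: 11
print(len(encoded))    # number of bytes: 12
print(encoded)         # b'Pr\xc3\xbcfung.pdf'
```

The extra byte is why a tool that does not understand utf-8 shows two odd characters where there should be one accented letter.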

Option Settings


The utf-8 setting is in Options - 'Files and Folders' tab, along with various other file and folder naming options, as shown opposite.

To use utf-8 character encoding for file and folder names just set the 'Use UTF-8' option to Yes as shown.  It is the default setting and suits the large majority of sites.

Summary:  Basically if your site supports UTF-8 then all is well and no further action is required.
[Image: Options - 'Files and Folders' tab with 'Use UTF-8' set to Yes]
However, using utf-8 in file names and folder names may prove a challenge.  This is not because of jDownloads or Joomla! but because your hosting site may not support utf-8 in its control panel.  Most hosts use a Linux based system, and Linux itself allows any character in a folder or file name except for '/' and the null character.  If the control panel does not support utf-8 then the file and folder names it shows are unintelligible.  Even worse, different names may appear visually the same, because the originals could have included a whole variety of so called illegal characters that have all been replaced.  That is, the main problem is often the host server file system.  A very common control panel for servers is cPanel, and this is not always configured for utf-8.  For example see the link below:
https://kb.siteground.com/how_to_change_the_language_for_cpanel/
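
To give an idea of what 'unintelligible' looks like, the Python sketch below encodes a name as utf-8 and then decodes the same bytes as Latin-1, which is one common way a tool that is not utf-8 aware may read them.  The file name and the choice of Latin-1 are assumptions made for this example; a particular control panel may behave differently.

```python
# Hypothetical file name, written with an escape so the example is unambiguous.
original = "Caf\u00e9.zip"            # "Café.zip"

# The bytes as a utf-8 aware file system would store them.
stored = original.encode("utf-8")

# What a tool that assumes Latin-1 would display for the same bytes.
displayed = stored.decode("latin-1")

print(displayed)                      # CafÃ©.zip

# The bytes on disk are untouched, so the real name can still be recovered.
print(displayed.encode("latin-1").decode("utf-8") == original)   # True
```

The file itself is not damaged in this case; only the way its name is displayed is wrong.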

Transliteration

If your host site does not support utf-8 in file and folder names then there is an alternative which is also supported by jDownloads: 'Transliteration'.  It is suitable for those languages that basically use accented characters, such as French, German and so on.  In this case transliteration means that, for example, German characters with an umlaut are converted to the standard ASCII equivalent without the umlaut.  Most European languages and many other languages are fundamentally based on Latin characters with some accent or similar 'extras'.  Cyrillic languages are the same in this regard.  For example, with transliteration the French e acute (é) and e grave (è) both become just the letter e.
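
The general idea of removing accents can be sketched in a few lines of Python using the standard unicodedata module.  This is only an illustration of the principle and is not how Joomla! or jDownloads does it: Joomla! uses its own lookup table, so its result may differ for some characters, and this sketch does nothing for non-Latin scripts such as Cyrillic.  The function name strip_accents is made up for this example.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Return text with combining accent marks removed (illustration only)."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_accents("\u00e9l\u00e8ve"))   # "élève"  -> eleve
print(strip_accents("M\u00fcller"))       # "Müller" -> Muller
```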

How does transliteration work

The table opposite shows part of the Joomla! transliteration table.

All the characters on the right-hand side of the table are changed to the corresponding character on the left-hand side in the website file system, but the original character is retained in the name stored in the database.
[Image: part of the Joomla! transliteration table]

The User will see the 'native' name but underneath jDownloads will work on the Latin character equivalent.  Note that jDownloads uses the Joomla! transliteration table which is essentially limited to accented characters based on the Latin characters.
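
The split between the name the user sees and the name used in the file system can be sketched as follows.  The table here is a tiny made-up excerpt and not the Joomla! transliteration table, and the record layout and the names TABLE and folder_name are hypothetical; they only illustrate the principle.

```python
# Tiny hypothetical excerpt of a transliteration table; not the Joomla! table.
TABLE = {"\u00e9": "e", "\u00e8": "e", "\u00ea": "e", "\u00f4": "o"}

def folder_name(title: str) -> str:
    """Map each character through TABLE, leaving unknown characters unchanged."""
    return "".join(TABLE.get(ch, ch) for ch in title)

# Hypothetical record: 'title' is what the user sees, 'folder' is what is on disk.
title = "For\u00eat"                      # "Forêt"
record = {"title": title, "folder": folder_name(title)}

print(record["folder"])                   # Foret
```

Because the original title is kept alongside the converted name, nothing is lost for the user even though the name on the server uses only plain Latin characters.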

Transliteration examples

The views below illustrate what happens when transliteration is activated.
[Image: German umlaut example]
[Image: French circumflex example]
[Image: Category View in Backend]
[Image: Category View in Frontend]

Activate transliteration

To activate transliteration, set the 'Use UTF-8' option to No.

This reveals two further options as illustrated opposite.  It is recommended both of these are set to Yes.
[Image: the two further options shown when 'Use UTF-8' is set to No]

In summary, the main problem is often that the host server file system does not support utf-8, as noted above.

ColinM November 2019, reviewed July 2020.
