The Persian alphabet is based on the Arabic script, although the two languages are linguistically very different. The alphabet has 33 main letters (including hamza), plus about a dozen diacritics (short vowels) and punctuation marks. To type Persian text on a computer, however, some extra invisible characters are essential. These include the space (of course!) and the Zero-Width Non-Joiner (ZWNJ).
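To make the role of the ZWNJ concrete, here is a minimal Python sketch. The example word (می‌خواهم, "I want") and the variable names are my own choice for illustration. The only facts it relies on are that ZWNJ is the Unicode code point U+200C and that it is a format character with no visible glyph.

```python
import unicodedata

ZWNJ = "\u200c"  # Zero-Width Non-Joiner, an invisible format character

# The prefix "می" should stay visually separate from "خواهم"
# without inserting a real space between them.
prefix = "می"
stem = "خواهم"
with_zwnj = prefix + ZWNJ + stem   # correct spelling: letters do not join across the ZWNJ
without_zwnj = prefix + stem       # letters join, which is a misspelling

print(unicodedata.name(ZWNJ))      # official Unicode name of the character
print(unicodedata.category(ZWNJ))  # "Cf" means a format (invisible) character
print(len(with_zwnj) - len(without_zwnj))  # the ZWNJ counts as one character
```

The two strings look almost the same on screen, but they differ by one code point, which is why a Persian keyboard needs a dedicated key for it.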
Design goals
- The Persian layout should allow the user to insert anything needed for typing Persian text. This includes Persian letters, numerals, punctuation marks, diacritics, and control characters.
- The layout should be easy to use. This means:
  - Frequently used letters should be available to type directly, without using a “shift”, “alt”, or other modifier key.
  - Less frequently used letters should be hidden as extended keys to save screen real estate, so that the rest of the keys are larger and hence easier to hit.
  - As I wrote in my first post in this series, keys in neighboring rows should preferably NOT line up directly above one another, to reduce the chance of mistyping letters. This condition requires neighboring rows in the keyboard to have different numbers of keys.
- The layout should be familiar to users of the standard Persian desktop layout, as defined in the standard ISIRI 9147.
Letter usage frequency in Persian
To decide which characters are used frequently and which are not, I analyzed the text of a free Persian ebook from Project Gutenberg, using a small tool I made with Bash and gnuplot (there is also a similar tool, written as a Perl script, from Jadi). This is the distribution of letter usage in Persian.
This distribution of course differs from text to text. In this ebook, for example, there is an unusually low number of ZWNJs (shown as ⟘ in the plot above) and not a single Persian numeral. I made the same distribution for a collection of my own posts on Google+, and also for articles from the politics page of a mainstream Persian newspaper. In all cases, the following letters were always used the least: ژ, ؤ, ئ, ث, ض, ظ and غ. This is also consistent with what Jadi obtained by analyzing another pile of contemporary Persian text.
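The counting step can be sketched in a few lines. Note that this is not my original tool (which was written with Bash and gnuplot) but a hypothetical Python equivalent. The function name `letter_frequencies`, the choice to count only letters in the Arabic Unicode block plus the ZWNJ, and the short sample sentence are all assumptions made for illustration.

```python
from collections import Counter

ZWNJ = "\u200c"

def letter_frequencies(text):
    """Return (character, relative frequency) pairs, most frequent first."""
    # Keep Arabic-script letters and the ZWNJ; skip spaces, Latin text,
    # digits, diacritics, and punctuation.
    counts = Counter(
        ch for ch in text
        if ch == ZWNJ or ("\u0600" <= ch <= "\u06ff" and ch.isalpha())
    )
    total = sum(counts.values())
    return [(ch, n / total) for ch, n in counts.most_common()]

# A short stand-in sentence; a real analysis would read a whole ebook.
sample = "این یک متن نمونه است"
freqs = letter_frequencies(sample)
for ch, freq in freqs:
    print(f"U+{ord(ch):04X} {ch} {freq:.3f}")
```

Running the same function over several different corpora and comparing the tail of each list is one way to find the letters that are consistently rare.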
Layout design proposals
Based on the above constraints and considerations, I came up with two designs for the Persian layout.
Proposal #1:
First row, 10 keys, extended keys for numerals. Letters غ and ض are NOT hidden, so that there are enough keys for the 10 extended numerals:
ج ح خ ه ع غ ف ق ص ض
۰ ۹ ۸ ۷ ۶ ۵ ۴ ۳ ۲ ۱
Second row, 9 keys:
م ن ت/ث ا ل ب ی س ش
Third row, 9+1 keys (including backspace):
← چ گ ک و پ د/ذ ر ز ط/ظ
Proposal #2:
First row, 9 keys, less frequent letters are hidden:
چ ج ح خ ه ع/غ ف ق ص/ض
Second row, 10 keys, extended keys for numerals, letter ث is not hidden:
م ن ت ث ا ل ب ی س ش
۰ ۹ ۸ ۷ ۶ ۵ ۴ ۳ ۲ ۱
Third row, 8+1 keys (including backspace):
← گ ک و پ د/ذ ر ز ط/ظ
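As a quick sanity check, the two proposals can be written down as data and tested against the design goal that neighboring rows have different numbers of keys. This is only an illustrative sketch: the names `PROPOSAL_1`, `PROPOSAL_2`, `is_staggered`, and `hidden_letters` are made up here, the backspace key is written as the placeholder string "BKSP", and "x/y" is read as "y is an extended key behind x". None of this is how the actual keyboard layout files are written.

```python
# Each row is a list of keys; "x/y" means y is hidden as an extended key of x.
# "BKSP" is a placeholder for the backspace key in the third row.
PROPOSAL_1 = [
    "ج ح خ ه ع غ ف ق ص ض".split(),
    "م ن ت/ث ا ل ب ی س ش".split(),
    "چ گ ک و پ د/ذ ر ز ط/ظ".split() + ["BKSP"],
]
PROPOSAL_2 = [
    "چ ج ح خ ه ع/غ ف ق ص/ض".split(),
    "م ن ت ث ا ل ب ی س ش".split(),
    "گ ک و پ د/ذ ر ز ط/ظ".split() + ["BKSP"],
]

def is_staggered(rows):
    """True if every pair of neighboring rows has a different key count."""
    return all(len(a) != len(b) for a, b in zip(rows, rows[1:]))

def hidden_letters(rows):
    """Letters that are only reachable as extended keys."""
    return [key.split("/")[1] for row in rows for key in row if "/" in key]

for name, rows in [("Proposal #1", PROPOSAL_1), ("Proposal #2", PROPOSAL_2)]:
    sizes = [len(row) for row in rows]
    print(name, sizes, is_staggered(rows), len(hidden_letters(rows)))
```

Both proposals pass the staggering check. Proposal #1 hides three letters and proposal #2 hides four, so the choice between them comes down to where the numerals sit.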
Both proposals look good. In the end I decided to go with proposal #1, since it has the numerals in the top row (as is more common in soft keyboards). In the picture at the top of this post I have sketched the final layout (in the normal and extended modes), including the 4th row.
Preliminary keyboard layout implementation
The following is my current implementation of the Persian layout for Ubuntu Touch. There is a problem with the alignment of the backspace key. Also, the ZWNJ key shows an empty label (since it inserts an invisible character into the text), and the symbol shift key (?123) still shows the English label.
Next, I am going to fix the small problems I mentioned above. I will also start designing the symbols keyboard, i.e., the keyboard that appears when pressing the ?123 (or rather, ۱۲۳؟) key.