Arabic

This page provides information about the characteristics of the script used to write Arabic. It is not intended to be exhaustively scientific – merely to give a basic idea of the essential features of the script.

Sample (Arabic)

عندما يريد العالم أن ‪يتكلّم ‬ ، فهو يتحدّث بلغة يونيكود. تسجّل الآن لحضور المؤتمر الدولي العاشر ليونيكود (Unicode Conference)، الذي سيعقد في 10-12 آذار 1997 بمدينة مَايِنْتْس، ألمانيا. و سيجمع المؤتمر بين خبراء من كافة قطاعات الصناعة على الشبكة العالمية انترنيت ويونيكود، حيث ستتم، على الصعيدين الدولي والمحلي على حد سواء مناقشة سبل استخدام يونكود في النظم القائمة وفيما يخص التطبيقات الحاسوبية، الخطوط، تصميم النصوص والحوسبة متعددة اللغات.

مَمِمّمَّمِّ

Context-based positioning

Diacritics for vowel sounds are typically not used. The shadda (a w-like diacritic that doubles the consonant value) is often omitted too. When these marks are used, however, they need to be placed in different positions, according to the context.

In the highlighted examples you can see the shadda placed at different heights, depending on the height of the base character that it appears above.

The sample text at the end shows how the vowel signs are placed above and below the shadda, rather than above and below the base character, when a shadda is present.

Multiple combining characters

Diacritics for vowel sounds are typically not used. The shadda (a w-like diacritic that doubles the consonant value) is often not used either. If a vowel sign and a shadda are used together, however, they have to be displayed together above the same base consonant.
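As a rough illustration of what "multiple combining characters" means at the character level, the following Python sketch builds one base consonant (meem) followed by a shadda and a fatha, then lists each code point with its canonical combining class. The helper name `describe` is made up for this example. It relies only on the standard `unicodedata` module and says nothing about how a font will actually position the marks.

```python
import unicodedata

MEEM = "\u0645"    # ARABIC LETTER MEEM (the base consonant)
SHADDA = "\u0651"  # ARABIC SHADDA
FATHA = "\u064E"   # ARABIC FATHA (a vowel sign)


def describe(text):
    """Return (code point, name, canonical combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


# One base consonant carrying two combining marks.
cluster = MEEM + SHADDA + FATHA
for row in describe(cluster):
    print(row)

# The two marks have different non-zero combining classes, so either typing
# order is canonically equivalent and normalizes to the same sequence.
swapped = MEEM + FATHA + SHADDA
same = unicodedata.normalize("NFC", cluster) == unicodedata.normalize("NFC", swapped)
print("canonically equivalent:", same)
```

A combining class of 0 marks the base character; the non-zero values mark the diacritics that attach to it. Stacking the vowel sign relative to the shadda is left to the font and rendering engine.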

Contextual shaping

This example is font dependent. In more classical fonts, you will often see the join between certain characters above the baseline, rather than at the baseline. Certain letters in the highlighted sequence may join above the line, like this:

or on the baseline, like this:

Richer fonts will also apply ligatures. Ligatures are also used for justification and styling. The following shows a word using no ligatures, on the left, and ligatures, on the right.

 vs.

Cursive script

Arabic script joins letters together. The highlights show the same letter, ع , with three different shapes, depending on how it joins:
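To make the idea of one letter with several joining shapes more concrete, here is a small Python sketch. It uses the compatibility characters that Unicode encodes for the positional forms of ain (isolated, final, initial, medial) and shows that each folds back to the single letter U+0639 under NFKC normalization. This is only an illustration: ordinary Arabic text stores U+0639 and leaves the choice of shape to the font, so these presentation-form characters are an assumption of the example rather than something the page itself uses.

```python
import unicodedata

AIN = "\u0639"  # ARABIC LETTER AIN, the character used in normal text

# Compatibility characters for the positional shapes of ain.
FORMS = {
    "isolated": "\uFEC9",
    "final": "\uFECA",
    "initial": "\uFECB",
    "medial": "\uFECC",
}

for label, form in FORMS.items():
    folded = unicodedata.normalize("NFKC", form)
    # Each shape is a different glyph but folds to the same underlying letter.
    print(label, form, unicodedata.name(form), folded == AIN)
```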

Text direction

Arabic script is written mainly right-to-left, but, as with all RTL scripts, numbers and embedded LTR script text are written left-to-right (bidirectional text). In the following example, the Arabic words are read RTL, starting with the one on the right, and the numbers are ten and twelve (i.e. each is read left-to-right). The date range is also ordered RTL, i.e. it starts with 10 and ends with 12:
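The mix of directions comes from properties of the characters themselves. As a minimal sketch, the Python below splits a short phrase modelled on the sample text into runs that share the same Unicode bidirectional class. The function name `bidi_runs` is invented for this example. It only reports the classes that the Unicode Bidirectional Algorithm takes as input; it does not compute the final display order.

```python
import unicodedata
from itertools import groupby

# "في 10-12 آذار 1997", stored in logical (typing) order.
text = "\u0641\u064A 10-12 \u0622\u0630\u0627\u0631 1997"


def bidi_runs(s):
    """Group consecutive characters that share a bidirectional class."""
    return [
        (cls, "".join(chars))
        for cls, chars in groupby(s, key=unicodedata.bidirectional)
    ]


for cls, run in bidi_runs(text):
    # AL = Arabic letter, EN = European number, ES = European separator,
    # WS = whitespace.
    print(cls, repr(run))
```

The Arabic words come out as class AL and the digits as class EN, which is why the letters and the digits within each number run in opposite directions when displayed.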

Large glyph sets

Since Arabic is a cursive script, a font contains many more glyphs than there are characters, because it needs glyphs that join in various ways. Not only that, but Arabic also has ligatures, which themselves join to characters or other ligatures alongside them. One ligature (lam-alif) is mandatory. Otherwise, the use of ligatures varies by font. Compare the two screen captures below, which show the same word; ligatures are used in a couple of places in the example on the right:
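The mandatory lam-alif ligature can be glimpsed at the character level too. The sketch below, in Python, takes the compatibility character that Unicode encodes for the isolated lam-alef ligature and decomposes it back into the two letters that text normally stores. It is offered only as an illustration of one glyph standing for two characters; in practice the ligature is formed by the font, not by typing this code point.

```python
import unicodedata

LAM = "\u0644"                # ARABIC LETTER LAM
ALEF = "\u0627"               # ARABIC LETTER ALEF
LAM_ALEF_LIGATURE = "\uFEFB"  # compatibility character for the isolated ligature

print(unicodedata.name(LAM_ALEF_LIGATURE))
# The decomposition mapping lists the tag and the two underlying letters.
print(unicodedata.decomposition(LAM_ALEF_LIGATURE))

# Compatibility decomposition recovers the two-character sequence.
unfolded = unicodedata.normalize("NFKD", LAM_ALEF_LIGATURE)
print(unfolded == LAM + ALEF)
```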

 vs.

Justification

Arabic script justifies text by stretching the baseline and the glyphs of the text, rather than expanding inter-word spaces. Justification can also introduce ligatures in order to squeeze more text onto a line.

Many browsers do not support this kind of justification.
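In plain text, baseline stretching is sometimes approximated with the tatweel character (U+0640, also called kashida). The Python sketch below is a deliberately naive illustration of that idea, not a description of how any browser or layout engine justifies text. The function `stretch`, the helper `is_arabic_letter`, and the hand-listed set `NO_FORWARD_JOIN` are all invented for this example, and real kashida placement follows much more nuanced rules.

```python
import unicodedata

TATWEEL = "\u0640"  # ARABIC TATWEEL (kashida)

# Hand-listed letters in the basic range that do not join to the letter
# after them (an assumption of this sketch, not data from a library).
NO_FORWARD_JOIN = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648"
)


def is_arabic_letter(ch):
    """True for letters in the basic range U+0620..U+064A (tatweel excluded)."""
    return "\u0620" <= ch <= "\u064A" and unicodedata.category(ch) == "Lo"


def stretch(word, count=1):
    """Insert `count` tatweels at the last joined letter pair, if there is one."""
    for i in range(len(word) - 2, -1, -1):
        left, right = word[i], word[i + 1]
        if (
            is_arabic_letter(left)
            and is_arabic_letter(right)
            and left not in NO_FORWARD_JOIN
            and right != "\u0621"  # hamza does not join to either side
        ):
            return word[: i + 1] + TATWEEL * count + word[i + 1 :]
    return word  # no joined pair found, so leave the word unchanged


kitab = "\u0643\u062A\u0627\u0628"  # كتاب
print(stretch(kitab, 2))
```

For the word above, the only usable join from the end is between teh and alef, so the tatweels land there. A word made entirely of letters that do not join forward is returned unchanged.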

Character list

The Arabic script characters in Unicode 7.0 are spread across 3 blocks:

  1. Arabic (255)
  2. Arabic Supplement (48)
  3. Arabic Extended-A (47)
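The counts above can be checked approximately with a short Python sketch. It counts the assigned code points in each of the three block ranges using the standard `unicodedata` module. The helper name `count_assigned` is made up for this example, and the totals reflect whichever Unicode version your Python build ships with, so they may be higher than the Unicode 7.0 figures listed here.

```python
import unicodedata

# Code point ranges of the three blocks named above.
BLOCKS = {
    "Arabic": (0x0600, 0x06FF),
    "Arabic Supplement": (0x0750, 0x077F),
    "Arabic Extended-A": (0x08A0, 0x08FF),
}


def count_assigned(first, last):
    """Count code points in [first, last] that are not unassigned (Cn)."""
    return sum(
        1
        for cp in range(first, last + 1)
        if unicodedata.category(chr(cp)) != "Cn"
    )


print("Unicode data version:", unicodedata.unidata_version)
for name, (first, last) in BLOCKS.items():
    print(name, count_assigned(first, last))
```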

The following is an incomplete list of languages and the number of characters they use, per version 26 of CLDR's lists of characters (exemplarCharacters).

Available at: rishida.net/scripts/samples/arabic

Content first published 2002-12-16. This version 2014-10-21 6:51 GMT