Bengali script notes

I am compiling these notes as I explore the Bengali script as used for the Bangla language. They may be updated from time to time.

This page contains brief notes on general script features. See also the companion document, Bengali Character Notes, which describes the characters used in Bengali script one by one. For more detailed information, especially about the history and phonology of the Bengali script, follow the links in the text and at the bottom of the page.

You can obtain fonts for this page free from the Web. I created this page using Solaiman Lipi.

Brief script introduction

Bengali has its own script, called বাংলা baŋla in the Bangla language.

The script is an abugida. This is characterised by consonant characters that include an inherent vowel sound. The inherent vowel can be overridden using vowel signs appended to the character. There are also independent vowel letters to represent vowels that are not preceded by consonants. The syllable is the unit for various aspects of the behaviour of the script.

The alphabet is split into vowels and consonants. With one exception (ɔ-kar, the inherent vowel, which has no vowel sign), each vowel is represented by both an independent version and a combining vowel sign.

Example of Bangla:

সমস্ত মানুষ স্বাধীনভাবে সমান মর্যাদা এবং অধিকার নিয়ে জন্মগ্রহণ করে। তাঁদের বিবেক এবং বুদ্ধি আছে ; সুতরাং সকলেরই একে অপরের প্রতি ভ্রাতৃত্বসুলভ মনোভাব নিয়ে আচরণ করা উচিত্‍।

Characters used for Bangla language

Vowel harmony

The pronunciation of a vowel can be affected by the vowel in the following syllable. Radice provides the following table, though this is a simplification and there are many exceptions.

Followed by i or u      Followed by ɔ, o, e or a
o → u                   o → ɔ
ɔ → o                   u → o
e → i                   e → æ
æ → e                   i → e

For example, the verb শোনা ʃona to hear with an i ending becomes ʃuni, দেখা dækʰa to see becomes dekʰi, etc. This sometimes accounts for the pronunciation of the inherent vowel, eg. অতিথি ôtithi guest and অনুবাদ ônubad translation start with o rather than ɔ.
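
As a rough illustration only, Radice's simplified table can be written as a pair of lookup tables in Python. The names `BEFORE_I_OR_U`, `BEFORE_OTHER` and `harmonise` are hypothetical, and the sketch deliberately ignores the many exceptions mentioned above.

```python
# Sketch of the simplified vowel harmony table. All names here are
# hypothetical, and real Bangla pronunciation has many exceptions.

# Column 1: the vowel is followed by i or u in the next syllable.
BEFORE_I_OR_U = {"o": "u", "ɔ": "o", "e": "i", "æ": "e"}

# Column 2: the vowel is followed by ɔ, o, e or a in the next syllable.
BEFORE_OTHER = {"o": "ɔ", "u": "o", "e": "æ", "i": "e"}


def harmonise(vowel, following):
    """Return the vowel as shifted by the vowel of the following syllable."""
    if following in ("i", "u"):
        return BEFORE_I_OR_U.get(vowel, vowel)
    if following in ("ɔ", "o", "e", "a"):
        return BEFORE_OTHER.get(vowel, vowel)
    return vowel  # no rule in the table applies


# The two verb examples from the text: adding an i ending shifts the stem vowel.
print(harmonise("o", "i"))  # stem vowel of ʃona before i
print(harmonise("æ", "i"))  # stem vowel of dækʰa before i
```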

Inherent vowel

Unlike in Hindi, the inherent vowel is ɔ or o (and sometimes halfway between the two, when influenced by surrounding sounds). Bengalis are not always aware of these differences, and tend to think of them as one sound.

Note that there is also a vowel sign pronounced o. This can lead to inconsistent spellings, eg. bhalo, good, well, can be spelled either ভালো or ভাল. Verb forms tend to be particularly inconsistent, and the choice is sometimes based on what looks good in a particular context.

The rules for determining the sound of the inherent vowel are not simple. Partly it is a question of vowel harmony. The following two tendencies can help:

The inherent vowel is pronounced at the end of some words and not others, eg. গরম gɔrôm, hot vs. গড়ান gɔɽanô, to roll. There is no reliable way to tell whether it is pronounced in this position, except that it is usually pronounced following a word-final conjunct, eg. যুদ্ধ y̌uddhô war. When pronounced in this position, the sound is usually o.

Refs: Radice 3, 7-8, 21, 148; Daniels 400

Vowel ligatures

Some vowel signs can form ligatures with a preceding base consonant in certain contexts, but do not ligate in others (eg. newspapers and modern typefaces). The two forms are equivalent in every respect except appearance.

The default behaviour of a given font can be modified using the zero-width joiner and zero-width non-joiner characters, eg. গু vs. গ‌ু.
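
A minimal Python sketch of the two sequences, assuming the second example places U+200C ZERO WIDTH NON-JOINER between the consonant and the vowel sign. The helper `describe` is a hypothetical name, and whether a ligature actually appears still depends on the font and the rendering engine.

```python
import unicodedata

GA = "\u0997"        # BENGALI LETTER GA
SIGN_U = "\u09C1"    # BENGALI VOWEL SIGN U
ZWNJ = "\u200C"      # ZERO WIDTH NON-JOINER

# Default sequence: the font decides whether to draw a ligature.
default_form = GA + SIGN_U

# With ZWNJ between base and vowel sign, to request the non-ligated form.
non_ligated = GA + ZWNJ + SIGN_U


def describe(text):
    """List the code points and Unicode names that make up a string."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, text in (("default", default_form), ("non-ligated", non_ligated)):
    print(label, describe(text))
```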

Conjuncts

The absence of vowels between consonants can be represented in the following ways:

Unlike in languages written in the Devanagari script, consonant clusters are often not represented as conjuncts in Bengali. The reader simply has to know that the vowel should not be pronounced, eg. রিকশা rikʃa rickshaw. Grammatical suffixes and endings are typically written without conjuncts, eg. খাননা khanna, which is the present tense form khan plus negative suffix na; করছ kôrchô, which is stem kôr from kôra plus present continuous ending chô.

Conjunct shapes are commonly formed by displaying a slightly smaller version of the non-final consonants in a cluster (equivalent to half-forms in Hindi), eg. see the m in ক্যম্পাস kyæmpas campus, or by combining the consonants into a more complex conjunct shape, eg. khr and ʂʈ in খ্রিষ্টান khriʂʈan Christian.

As in other Indic scripts, U+09B0 BENGALI LETTER RA is displayed in a special way in consonant clusters. A syllable-initial r is displayed as a mark to the top right of the cluster, eg. rt in গর্ত gɔrtô hole, and a trailing r is typically displayed as a wavy line below the other consonants, eg. gr in গ্রাম gram village.

Bengali also has a particular way of representing a cluster-final j semi-vowel. This is typically represented using the full form of the preceding consonant followed by a special form of U+09AF BENGALI LETTER YA, known as y̌ɔ-phɔla, eg. হ্যাঁ hyæ̃ yes.

When the virama is visible, it is usually because the font doesn't have a particular conjunct ligature, but it may also be visible in places where the phonology is unusual, eg. ফ্‌ল্যাট phlæʈ flat; লান্‌চ lanc lunch (though these may also be spelled with conjuncts, eg. ফ্ল্যাট phlæʈ flat). It is also quite common to see উদ্‌যাপন to distinguish it from words like উদ্যান. These words are etymologically related, but distinct phonetically.
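
The difference between a conjunct and a deliberately visible virama can be sketched in Python. This assumes the usual Unicode convention that U+200C ZERO WIDTH NON-JOINER placed after the virama asks the renderer not to form a conjunct; the function names are hypothetical.

```python
VIRAMA = "\u09CD"   # BENGALI SIGN VIRAMA
ZWNJ = "\u200C"     # ZERO WIDTH NON-JOINER

DA = "\u09A6"       # BENGALI LETTER DA
YA = "\u09AF"       # BENGALI LETTER YA


def as_conjunct(first, second):
    """Consonant + virama + consonant: the font may draw a conjunct shape."""
    return first + VIRAMA + second


def with_visible_virama(first, second):
    """Consonant + virama + ZWNJ + consonant: requests a visible virama."""
    return first + VIRAMA + ZWNJ + second


# The d + y cluster, once as a conjunct and once with a visible virama.
for text in (as_conjunct(DA, YA), with_visible_virama(DA, YA)):
    print(text, [f"U+{ord(ch):04X}" for ch in text])
```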

Nasals in conjuncts tend to conform to phonological patterns. Velar consonants (k, kh, g, etc) combine with ŋɔ, palatal consonants (c, ch, etc) with ñɔ, retroflex consonants with ɳɔ, dental consonants with nɔ, and labial consonants with mɔ.

Character choices

The Bengali character block in Unicode provides three precomposed characters that are decomposed in Normalization Form C (NFC). The NFC form, ie. the decomposed sequence, should be used.

Don't use                      Use
ড় 09DC BENGALI LETTER RRA      09A1 BENGALI LETTER DDA + 09BC BENGALI SIGN NUKTA
ঢ় 09DD BENGALI LETTER RHA      09A2 BENGALI LETTER DDHA + 09BC BENGALI SIGN NUKTA
য় 09DF BENGALI LETTER YYA      09AF BENGALI LETTER YA + 09BC BENGALI SIGN NUKTA
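
A short Python sketch, using the standard `unicodedata` module, shows what NFC does to the three precomposed letters. The helper names `to_nfc` and `code_points` are hypothetical.

```python
import unicodedata

# The three precomposed letters listed above.
PRECOMPOSED = ("\u09DC", "\u09DD", "\u09DF")  # RRA, RHA, YYA


def to_nfc(text):
    """Normalize text to Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def code_points(text):
    """Format a string as a space-separated list of code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for letter in PRECOMPOSED:
    # Each letter comes out of NFC as base letter + nukta.
    print(code_points(letter), "->", code_points(to_nfc(letter)))
```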

The Bengali block also has code points that enable you to split vowel signs that surround the base into two parts; the single code point should be used. (If you do use the two code points, both must be input after the base, and in the correct order.)

Don't use                                                  Use
09C7 BENGALI VOWEL SIGN E + 09D7 BENGALI AU LENGTH MARK    09CC BENGALI VOWEL SIGN AU
09C7 BENGALI VOWEL SIGN E + 09BE BENGALI VOWEL SIGN AA     09CB BENGALI VOWEL SIGN O
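
For two-part vowel signs, NFC goes the other way and combines the parts into the single code point. The sketch below is only an illustration, with U+0995 BENGALI LETTER KA chosen arbitrarily as the base and `compose_vowel_signs` as a hypothetical name.

```python
import unicodedata

KA = "\u0995"          # BENGALI LETTER KA, an arbitrary base for the demo
SIGN_E = "\u09C7"      # BENGALI VOWEL SIGN E
SIGN_AA = "\u09BE"     # BENGALI VOWEL SIGN AA
AU_LENGTH = "\u09D7"   # BENGALI AU LENGTH MARK


def compose_vowel_signs(text):
    """Normalize to NFC, which merges split two-part vowel signs."""
    return unicodedata.normalize("NFC", text)


split_o = KA + SIGN_E + SIGN_AA      # two code points for the vowel sign o
split_au = KA + SIGN_E + AU_LENGTH   # two code points for the vowel sign au

for text in (split_o, split_au):
    composed = compose_vowel_signs(text)
    print([f"U+{ord(ch):04X}" for ch in text], "->",
          [f"U+{ord(ch):04X}" for ch in composed])
```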

Text formatting

The following is a brief summary of script features until such time as I have more detailed information.

Text runs left to right, horizontally, and lines typically break at the spaces between words. The script has no upper-/lowercase distinction.

The basic unit for text segmentation is the syllable. Unicode grapheme clusters don't cover consonant clusters, so some additional processing is needed to identify text unit boundaries.
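
The kind of additional processing involved might look like the following Python sketch. It is a rough approximation rather than a full segmentation algorithm: it attaches combining marks to the preceding character and keeps consonants joined by a virama in one unit. The name `orthographic_units` and the simple category test are assumptions made for the illustration.

```python
import unicodedata

VIRAMA = "\u09CD"               # BENGALI SIGN VIRAMA
JOINERS = {"\u200C", "\u200D"}  # ZWNJ and ZWJ


def orthographic_units(text):
    """Split text into approximate orthographic syllables.

    Combining marks and joiners attach to the current unit, and a letter
    directly after a virama continues the unit instead of starting a new one.
    """
    units = []
    for ch in text:
        category = unicodedata.category(ch)
        attaches = category in ("Mn", "Mc") or ch in JOINERS
        continues_cluster = (
            category == "Lo" and bool(units) and units[-1].endswith(VIRAMA)
        )
        if units and (attaches or continues_cluster):
            units[-1] += ch
        else:
            units.append(ch)
    return units


# The word for war, written with escapes: ya, vowel sign u, da, virama, dha.
print(orthographic_units("\u09AF\u09C1\u09A6\u09CD\u09A7"))
```

Because a joiner after the virama ends the unit in this sketch, a cluster written with a visible virama is split at that point, which may or may not be what a given application wants.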

Initial letter styling can be applied to Bengali text.

Punctuation

The danda (U+0964 DEVANAGARI DANDA) is used for sentence final punctuation. I haven't seen much evidence for the use of the double danda (U+0965 DEVANAGARI DOUBLE DANDA).

Western punctuation, such as commas, semicolons, colons, quotation marks and hyphens, is also used quite commonly.

The bisɔrgô U+0983 BENGALI SIGN VISARGA is sometimes used to mark initial abbreviations.

Refs: Bhattacharya

Counters

The Predefined Counter Styles document contains a single counter style for Bengali. It is numeric.

@counter-style bengali {
  system: numeric;
  symbols: '\9E6' '\9E7' '\9E8' '\9E9' '\9EA' '\9EB' '\9EC' '\9ED' '\9EE' '\9EF';
  /* symbols: '০' '১' '২' '৩' '৪' '৫' '৬' '৭' '৮' '৯'; */
}

Examples: 1 ⇨ ১, 2 ⇨ ২, 3 ⇨ ৩, 4 ⇨ ৪, 11 ⇨ ১১, 22 ⇨ ২২, 33 ⇨ ৩৩, 44 ⇨ ৪৪, 111 ⇨ ১১১, 2222 ⇨ ২২২২.
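
Outside CSS, the same numeric mapping can be sketched in a few lines of Python. The function name `to_bengali_numeric` is hypothetical, and the sketch only handles non-negative integers.

```python
BENGALI_DIGIT_ZERO = 0x09E6  # U+09E6 BENGALI DIGIT ZERO; digits 1-9 follow it


def to_bengali_numeric(number):
    """Render a non-negative integer with Bengali digits, most significant first."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return "".join(chr(BENGALI_DIGIT_ZERO + int(digit)) for digit in str(number))


for value in (1, 11, 111, 2222):
    print(value, "->", to_bengali_numeric(value))
```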

List of basic symbols

Bangla

This is a list of the main characters and character combinations needed for Bangla.

Consonants ড় ঢ় য়   ্য
Independent vowels
Vowel signs   া   ি   ী   ু   ূ   ৃ   ে   ৈ   ো   ৌ
Combining marks   ঁ   ং   ঃ   ়   ্
Symbols & punctuation
Numbers
Other symbols in the Bengali block   ৗ   ৄ   ৢ   ৣ

 

To see a list of ligatures and alternative shapes go to the 'shape' view of the Bengali character picker. (Hint: to see the composition of a conjunct, click on it and select 'Codepoints'.)

Further reading

  1. Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, ISBN 0-19-507993-0
  2. Wikipedia, Bengali Script
  3. William Radice, Teach Yourself Bengali, Hodder & Stoughton, ISBN 0-340-86029-4
  4. The Unicode Standard v5.2
  5. Private correspondence with Tanmoy Bhattacharya, July 2004.
  6. Ishida, Predefined Counter Styles

Available at: rishida.net/scripts/bengali/

Content first published 7 April, 2006. This version 2015-01-18 7:26 GMT