Bengali script notes

I am compiling these notes as I explore the Bengali script as used for Bengali. They may be updated from time to time. See also the companion document, Bengali Character Notes, which describes the characters used in Bengali script one by one.

The page contains brief notes on general script features and discussions about which Unicode characters are most appropriate when there is a choice.

For more detailed information, especially about the history and phonology of Bengali, follow the links in the text and at the bottom of the page. You can also click on the symbols in the next section to jump to a description of that character.

List of basic symbols

Consonants:
ক খ গ ঘ ঙ চ ছ জ ঝ ঞ ট ঠ ড ড় ঢ ঢ় ণ ত থ দ ধ ন প ফ ব ভ ম য য় র ল শ ষ স হ ৎ ্য

Independent vowels:
অ আ ই ঈ উ ঊ ঋ এ ঐ ও ঔ

Vowel signs:
া ি ী ু ূ ৃ ে ৈ ো ৌ

Combining marks:
ঁ ং ঃ ় ্

Symbols & punctuation:
৳ । ॥

Numbers:
০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯

Other symbols in the Bengali block:
৲ ৴ ৵ ৶ ৷ ৸ ৹ ৺ ঽ ৱ ৰ ৗ ৠ ঌ ৡ ৄ ৢ ৣ ড় ঢ় য়

To see a list of ligatures and alternative shapes go to the 'shape' view of the Bengali character picker. (Hint: to see the composition of a conjunct, click on it and select 'Codepoints' or 'Analyse'.)

Brief script introduction

Bengali has its own script, called বাংলা baɱla in the Bengali language.

The script is an abugida: consonant characters include an inherent vowel sound, which can be overridden using vowel signs appended to the character. There are also independent vowel letters to represent vowels that are not preceded by consonants. The syllable is the unit for various aspects of the behaviour of the script.

The alphabet is split into vowels and consonants. With one exception (ɔ-kar), each vowel is represented by an independent version and a combining vowel sign, shown one above the other.
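As a rough illustration of the consonant, vowel sign and independent vowel distinction, the following Python sketch lists the code points in a bare consonant, the same consonant with a vowel sign, and an independent vowel. The helper name describe is hypothetical, and the three samples were chosen for this sketch rather than taken from the notes.

```python
import unicodedata

def describe(text):
    """Return 'U+XXXX NAME' for each code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

samples = [
    "\u0995",        # ক: consonant carrying the inherent vowel
    "\u0995\u09BF",  # কি: the same consonant with vowel sign i
    "\u0987",        # ই: independent vowel i
]

for sample in samples:
    print(sample, describe(sample))
```

Note that the vowel sign is stored after the consonant even when, as with i, it is displayed to its left.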

Vowel harmony

The pronunciation of a vowel can be affected by the vowel in the following syllable. Radice provides the following table, though this is a simplification and there are many exceptions.

Followed by i or u    Followed by ɔ, o, e or a
o → u                 o → ɔ
ɔ → o                 u → o
e → i                 e → æ
æ → e                 i → e

For example, ʃona becomes ʃuni, dækha becomes dekhi, etc. This sometimes accounts for the pronunciation of the inherent vowel.
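The table can be read as a pair of lookups, which the following Python sketch encodes literally. The names BEFORE_HIGH, BEFORE_OTHER and harmonize are hypothetical, and since the table is itself a simplification with many exceptions, this is only a way of restating it, not a predictor of actual pronunciation.

```python
# Literal encoding of the simplified table above; real usage has many exceptions.
BEFORE_HIGH = {"o": "u", "ɔ": "o", "e": "i", "æ": "e"}   # next vowel is i or u
BEFORE_OTHER = {"o": "ɔ", "u": "o", "e": "æ", "i": "e"}  # next vowel is ɔ, o, e or a

def harmonize(vowel, following):
    """Return the vowel the table lists for `vowel` before `following`."""
    if following in ("i", "u"):
        return BEFORE_HIGH.get(vowel, vowel)
    if following in ("ɔ", "o", "e", "a"):
        return BEFORE_OTHER.get(vowel, vowel)
    return vowel  # the table does not cover this context

print(harmonize("o", "i"))  # the ʃona / ʃuni alternation
print(harmonize("æ", "i"))  # the dækha / dekhi alternation
```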

Inherent vowel

Unlike in Hindi, the inherent vowel is [ɔ] or [o] (and sometimes halfway between the two, when influenced by surrounding sounds). Bengali speakers are not always aware of these differences and tend to think of them as a single sound.

Note that there is also a separate vowel pronounced [o]. This can lead to inconsistent spellings, eg. bhalo, good, well, can be spelled either ভালো or ভাল. Verb forms tend to be particularly inconsistent, with the choice sometimes based on what looks good in a particular context.

The rules for determining the sound of the inherent vowel are not simple. Partly it is a question of vowel harmony. The following two tendencies can help:

The inherent vowel is pronounced at the end of some words and not others, eg. গরম gɔrôm, hot, vs. গড়ান gɔɽanô, to roll. There is no reliable way to tell when it is pronounced in this position, except that it is usually pronounced following a word-final conjunct, eg. যুদ্ধ y̌uddhô, war. When pronounced in this position, the sound is usually [o].

Refs: Radice 3, 7-8, 21, 148; Daniels 400

Vowel ligatures

Some vowel signs can form ligatures with a preceding base consonant in certain contexts, but do not ligate in others (eg. newspapers and modern typefaces). Both forms are equivalent in every way but visually.

The default behaviour of a given font can be modified using the zero-width joiner and zero-width non-joiner characters, eg. গু vs. গ‌ু.
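A minimal Python sketch of the non-ligated request follows. It assumes the zero-width non-joiner is placed between the consonant and the vowel sign; whether the rendered shape actually changes depends on the font. The helper name code_points is hypothetical.

```python
ZWNJ = "\u200C"          # ZERO WIDTH NON-JOINER
GA = "\u0997"            # BENGALI LETTER GA
VOWEL_SIGN_U = "\u09C1"  # BENGALI VOWEL SIGN U

default_form = GA + VOWEL_SIGN_U         # the font decides whether to ligate
non_ligated = GA + ZWNJ + VOWEL_SIGN_U   # asks for the non-ligated form

def code_points(text):
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

print(code_points(default_form))  # U+0997 U+09C1
print(code_points(non_ligated))   # U+0997 U+200C U+09C1
```

The two strings are different sequences, so plain string comparison or search will not treat them as equal unless the joiner is stripped first.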

Refs: Unicode 313-314

Conjuncts

The absence of vowels between consonants can be represented by

As in several other Indic scripts, র is displayed in a non-standard way in consonant clusters. A syllable-initial র is displayed as a mark to the top right of the cluster, and a trailing র is typically displayed as a wavy line below the other consonants.

Bengali also has a particular way of representing a cluster-final [j] semi-vowel. This is typically represented using the full form of the preceding consonant followed by the y̌ɔ-phɔla sign.

Consonant clusters are not always represented as conjuncts. Grammatical suffixes and endings are written without conjuncts, eg. খাননা khanna, which is the present tense form khan plus the negative suffix na; করছ kôrchô, which is the stem kôr from kôra plus the present continuous ending chô.

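Nasals in conjuncts tend to follow phonological rules: the nasal usually shares the place of articulation of the consonant it joins. Velar consonants (k, kh, g, etc) take ŋɔ (ঙ), palatal consonants (c, ch, etc) take ñɔ (ঞ), retroflex consonants take ɳɔ (ণ), dental consonants take nɔ (ন), and labial consonants take mɔ (ম).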
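The nasal tendency can be sketched in Python as a lookup from consonant to nasal. The grouping below is an assumption based on the standard five places of articulation, including the dental and labial nasals (ন and ম) and the velar nasal (ঙ); the names NASAL_GROUPS and nasal_cluster are hypothetical. The conjunct is encoded as nasal, then virama (U+09CD), then the consonant, and this is a tendency only, not a spelling rule.

```python
VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA, joins consonants into a conjunct

# (nasal, consonants of the same place of articulation); assumed grouping
NASAL_GROUPS = [
    ("ঙ", "কখগঘ"),  # velar
    ("ঞ", "চছজঝ"),  # palatal
    ("ণ", "টঠডঢ"),  # retroflex
    ("ন", "তথদধ"),  # dental
    ("ম", "পফবভ"),  # labial
]

NASAL_FOR = {c: nasal for nasal, group in NASAL_GROUPS for c in group}

def nasal_cluster(consonant):
    """Return nasal + virama + consonant for the listed consonants."""
    return NASAL_FOR[consonant] + VIRAMA + consonant

for consonant in "কচটতপ":
    print(consonant, nasal_cluster(consonant))
```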

Punctuation

The danda (found in the Devanagari block) is used for sentence-final punctuation. I haven't seen much evidence for the use of the double danda.

Western punctuation, such as commas, semicolons, colons, quotation marks and hyphens, is also used quite commonly.

bisɔr͟gô 0983 BENGALI SIGN VISARGA is sometimes used to mark initial abbreviation.

Refs: Bhattacharya

Character choices

The Bengali character block in Unicode provides 3 precomposed characters that are decomposed in Normalization Form C. The NFC form should be used.

Don't use                     Use
ড় 09DC BENGALI LETTER RRA    09A1 BENGALI LETTER DDA + 09BC BENGALI SIGN NUKTA
ঢ় 09DD BENGALI LETTER RHA    09A2 BENGALI LETTER DDHA + 09BC BENGALI SIGN NUKTA
য় 09DF BENGALI LETTER YYA    09AF BENGALI LETTER YA + 09BC BENGALI SIGN NUKTA
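One way to check this behaviour is with Python's standard unicodedata module, as in the sketch below. The helper name to_nfc is hypothetical; the point is only that NFC turns each of the three precomposed letters into the base letter followed by the nukta.

```python
import unicodedata

PRECOMPOSED = ["\u09DC", "\u09DD", "\u09DF"]  # RRA, RHA, YYA

def to_nfc(text):
    """Normalize text to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)

for ch in PRECOMPOSED:
    parts = " + ".join(f"U+{ord(c):04X}" for c in to_nfc(ch))
    print(f"U+{ord(ch):04X} -> {parts}")
```

Running text through to_nfc before storing or comparing it means that both spellings of these letters end up as the same sequence.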

The Bengali block also has code points that enable you to split vowel signs that surround the base into two parts; the single code point should be used. (If you do use the two code points, both must be typed after the base, and in the correct order.)

Don't use                                                  Use
09C7 BENGALI VOWEL SIGN E + 09D7 BENGALI AU LENGTH MARK    09CC BENGALI VOWEL SIGN AU
09C7 BENGALI VOWEL SIGN E + 09BE BENGALI VOWEL SIGN AA     09CB BENGALI VOWEL SIGN O
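A small Python sketch of replacing the two-part sequences with the single code points is shown below. The names SPLIT_VOWELS and merge_split_vowels are hypothetical; NFC normalization composes the same pairs, so in practice normalizing may be all that is needed.

```python
import unicodedata

# Two-part sequence -> single code point
SPLIT_VOWELS = {
    "\u09C7\u09BE": "\u09CB",  # VOWEL SIGN E + VOWEL SIGN AA -> VOWEL SIGN O
    "\u09C7\u09D7": "\u09CC",  # VOWEL SIGN E + AU LENGTH MARK -> VOWEL SIGN AU
}

def merge_split_vowels(text):
    """Replace each two-part vowel sequence with its single code point."""
    for pair, single in SPLIT_VOWELS.items():
        text = text.replace(pair, single)
    return text

split_ko = "\u0995\u09C7\u09BE"  # KA followed by the two-part form of O
merged = merge_split_vowels(split_ko)
print([f"U+{ord(c):04X}" for c in merged])

# NFC gives the same result for these sequences.
print(unicodedata.normalize("NFC", split_ko) == merged)
```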

Further reading

  1. [Daniels] Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, ISBN 0-19-507993-0
  2. [WPScript] Wikipedia, Bengali Script
  3. [Radice] William Radice, Teach Yourself Bengali, Hodder & Stoughton, ISBN 0-340-86029-4
  4. [Unicode5.2] The Unicode Standard v5.2
  5. [Bhattacharya] Private correspondence with Tanmoy Bhattacharya, July 2004.

Author: Richard Ishida.

Content first published 7 April, 2006. This version 2014-02-07 8:25 GMT