Balinese script notes

I am compiling these notes as I explore the Balinese script as used for the Balinese language. The page contains brief notes on general script features and discussions about which Unicode characters are most appropriate when there is a choice. See also the companion document, Balinese Character Notes, which describes the characters used in the Balinese script one by one.

If you click on the Balinese example text, the page will show the constituent characters at the bottom right of the page. (Hint: since the examples are all displayed using graphics (to ensure that you see what is expected), you can copy the characters in the example by clicking on the example and then copying the red text that appears at the bottom right of the page, just above the list of characters.)

For more detailed information, especially about the history and phonology of Balinese, follow the links in the text and at the bottom of the page. You can also click on the symbols in the next section to jump to a description of each character.

Brief script introduction

The script type is abugida. Consonants carry an inherent vowel a, although that is pronounced ə at the end of a word.

Text runs left-to-right, and words are not separated by spaces.

Characters

Consonants

Consonants have an inherent -a vowel sound. Consonants combine with following consonants in the usual Brahmic fashion: the inherent vowel is silenced by U+1B44 BALINESE ADEG ADEG ᭄ (the Balinese equivalent of the Sanskrit virama), and the following consonant is subjoined or postfixed, often with a change in shape.

Only 18 of the consonants are used for pure Balinese language text. The remainder are used for words derived from Sanskrit or Kawi. (It's not clear to me whether the fact that these originally relate to aspirated or retroflex forms affects the pronunciation.) There are also a few characters in the Unicode block that are used for the Sasak language.

A number of the Sanskrit or Kawi consonants are rather poorly attested. The letter ca laca is only found in non-initial position as ◌᭄ᬙ, and most of the retroflex series is often omitted in books about the script. The letter ja jera ᬛ (jha) seems to be known from only one word, ᬦᬶᬃᬛᬭ nirjhara (pond). (It is possible that an original ai may have been lost in Balinese, to be replaced by the glyph for jha.)

The symbols for vocalic r and vocalic l have been reclassified as consonants (see below for details).

Consonant clusters

To represent consonants without intervening vowels, the non-initial consonant is typically drawn below the initial consonant, and with a slightly different shape. There can be up to 3 consonants combined in this way, and the third consonant must be one of ya, ra, la or wa. In some cases the following consonant appears to the right of the initial consonant.

Otherwise, the sign adeg-adeg is used to show that no vowel is present, eg. ᬓᬧᬮ᭄ kapal (ship).

In Unicode, the adeg-adeg character is used between consonants to cause the conjunct combining behaviour.
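As a minimal sketch of what this means for stored text (in Python; the helper names `cluster` and `code_points` are mine, not part of any library), a cluster is simply the consonants in order with adeg-adeg between them. The stacking itself is left to the font and rendering engine.

```python
ADEG_ADEG = "\u1b44"  # BALINESE ADEG ADEG

# Consonant letters used in this example (each carries an inherent a).
KA, PA, LA, RA = "\u1b13", "\u1b27", "\u1b2e", "\u1b2d"


def cluster(*consonants: str) -> str:
    """Return the consonants joined by adeg-adeg, ie. with no vowel between them."""
    return ADEG_ADEG.join(consonants)


def code_points(text: str) -> list[str]:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


kra = cluster(KA, RA)             # <ka, adeg-adeg, ra>: ra takes a conjunct form
kapal = KA + PA + LA + ADEG_ADEG  # final adeg-adeg: the a of la is silenced

print(code_points(kra))    # ['U+1B13', 'U+1B44', 'U+1B2D']
print(code_points(kapal))  # ['U+1B13', 'U+1B27', 'U+1B2E', 'U+1B44']
```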

Because there is no word separator, consonants at the end of one word and beginning of the following word are normally stacked, too. In some cases this leads to ambiguity about whether this is one or two words. If you really want to make clear which is which, you can use an explicit adeg-adeg, eg. ᬧᬓ᭄ᬭᬫᬦ᭄ pakraman (membership) vs. ᬧᬓ᭄‌ᬭᬫᬦ᭄ Pak Raman (Mr. Raman).

You can do this in Unicode by including a zero-width non-joiner after the adeg-adeg.
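The two spellings can be sketched in Python as follows. The function name `join_unstacked` is hypothetical, and the code only builds the character sequences. Whether the adeg-adeg is actually shown depends on the font and renderer.

```python
ADEG_ADEG = "\u1b44"  # BALINESE ADEG ADEG
ZWNJ = "\u200c"       # ZERO WIDTH NON-JOINER

PA, KA, RA, MA, NA = "\u1b27", "\u1b13", "\u1b2d", "\u1b2b", "\u1b26"


def join_unstacked(first: str, second: str) -> str:
    """Concatenate two words, keeping a visible adeg-adeg at the boundary.

    If the first word ends in adeg-adeg, a ZWNJ is inserted after it so that
    the next consonant is not drawn as a conjunct.
    """
    if first.endswith(ADEG_ADEG):
        return first + ZWNJ + second
    return first + second


pak = PA + KA + ADEG_ADEG
raman = RA + MA + NA + ADEG_ADEG

pakraman = pak + raman                  # stacked: pakraman
pak_raman = join_unstacked(pak, raman)  # visible adeg-adeg: Pak Raman
```

The two strings differ by exactly one invisible character, which is why the distinction is easy to lose in editing.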

A somewhat ambiguous situation arises where convention apparently prevents certain combinations from stacking. For example, the name of the village Tamblang should not stack the mbl, but should look like ᬢᬫ᭄ᬩ᭄ᬮᬂ. You would get exactly this result by putting a zero-width non-joiner after the adeg-adeg that follows ma, but it can also be achieved by intelligence in the font, as was actually the case when I generated this example (click on it to see). It's not clear to me which approach is preferred: insert ZWNJ only when the font doesn't do what you want, or always use it. The latter may lead to more consistent content where different fonts are applied to the text (eg. after cut and paste). In theory, this shouldn't affect searching and sorting, although some applications may not ignore the ZWNJ as they should.
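Where an application does not ignore ZWNJ by itself, one possible workaround is to strip it before comparing. The following Python sketch uses hypothetical helper names (`search_key`, `matches`) and is an illustration rather than a recommendation.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def search_key(text: str) -> str:
    """Return the text with every ZWNJ removed, for comparison purposes only."""
    return text.replace(ZWNJ, "")


def matches(a: str, b: str) -> bool:
    """True if two strings are equal once ZWNJ is ignored."""
    return search_key(a) == search_key(b)


# <ta, ma, adeg-adeg, ba, adeg-adeg, la, cecek>, without and with a ZWNJ
plain = "\u1b22\u1b2b\u1b44\u1b29\u1b44\u1b2e\u1b02"
with_zwnj = "\u1b22\u1b2b\u1b44\u200c\u1b29\u1b44\u1b2e\u1b02"

print(plain == with_zwnj)         # False
print(matches(plain, with_zwnj))  # True
```

Note that a key built this way also treats pakraman and Pak Raman as equal, so it suits searching but not display.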

Ra repa

Balinese doesn't use ra + pepet to represent the sound rə. Instead it uses ra repa ᬋ. U+1B0B BALINESE LETTER RA REPA at the beginning of a syllable, such as in ᬓᭂᬋᬂ kěrěng (eat a lot), is treated as a consonant.

Ra repa has a postfixed form and a subjoined form. The postfixed form ◌᭄ᬭᭂ is seen where the consonant form of ra repa follows a word which ends in a consonant, such as ᬧᬓ᭄ᬋᬋᬄ Pak Rěrěh (Mr Rereh). The sequence of characters to be used for this is <consonant, adeg-adeg, ra repa> (ie. not using U+1B3A BALINESE VOWEL SIGN RA REPA).

The subjoined form ◌ᬺ is used to represent the original vocalic r. In such cases, it follows a syllable-initial consonant, as in ᬓᬺᬰ᭄ᬡ Krěsna (Krishna). This is where U+1B3A BALINESE VOWEL SIGN RA REPA is used. The sequence of characters to be used here is <consonant, vowel sign ra repa>.
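The two sequences can be put side by side in a short Python sketch. The function names are hypothetical, and `unicodedata.name` is used only to show which of the two ra repa characters ends up in each sequence.

```python
import unicodedata

ADEG_ADEG = "\u1b44"           # BALINESE ADEG ADEG
LETTER_RA_REPA = "\u1b0b"      # BALINESE LETTER RA REPA
VOWEL_SIGN_RA_REPA = "\u1b3a"  # BALINESE VOWEL SIGN RA REPA


def postfixed_ra_repa(consonant: str) -> str:
    """<consonant, adeg-adeg, ra repa>: consonantal ra repa after a final consonant."""
    return consonant + ADEG_ADEG + LETTER_RA_REPA


def subjoined_ra_repa(consonant: str) -> str:
    """<consonant, vowel sign ra repa>: the original vocalic r."""
    return consonant + VOWEL_SIGN_RA_REPA


KA = "\u1b13"  # BALINESE LETTER KA
for label, seq in [("postfixed", postfixed_ra_repa(KA)),
                   ("subjoined", subjoined_ra_repa(KA))]:
    print(label, [unicodedata.name(ch) for ch in seq])
```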

Repertoire extension

The combining mark rerekan ᬴ is used, as is a similar sign in Javanese, to extend the character repertoire for foreign sounds. Attested in Library of Congress transliterations and in earlier Sasak orthography are: ᬓ᬴ x, ᬕ᬴ ɣ, ᬗ᬴ ʕ, ᬚ᬴ z, ᬧ᬴ f, ᬯ᬴ v, and ᬳ᬴ ħ. ᬤ᬴ could be used for one-to-one transliteration for Javanese .

In rendering, the dots of these letters appear above the top character, which can cause some ambiguity in reading; ᬓ᬴ could be xja <ka, rerekan, adeg-adeg, ja>, or kza <ka, adeg-adeg, ja, rerekan>, or indeed xza <ka, rerekan, adeg-adeg, ja, rerekan>. In practice these combinations are probably rather rare.
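Although the three readings may look alike on the page, they are distinct in the underlying text. The Python sketch below shows this, using a deliberately tiny reading table for ka and ja only (the values k/x and j/z come from the list above). The function name `read_cluster` is hypothetical.

```python
ADEG_ADEG = "\u1b44"  # BALINESE ADEG ADEG
REREKAN = "\u1b34"    # BALINESE SIGN REREKAN
KA, JA = "\u1b13", "\u1b1a"

# (plain reading, reading with rerekan)
READINGS = {KA: ("k", "x"), JA: ("j", "z")}


def read_cluster(text: str) -> str:
    """Transliterate a cluster of ka/ja letters joined by adeg-adeg."""
    out = []
    for part in text.split(ADEG_ADEG):
        letter, marks = part[0], part[1:]
        plain, modified = READINGS[letter]
        out.append(modified if REREKAN in marks else plain)
    return "".join(out) + "a"  # the last consonant keeps its inherent a


xja = KA + REREKAN + ADEG_ADEG + JA
kza = KA + ADEG_ADEG + JA + REREKAN
xza = KA + REREKAN + ADEG_ADEG + JA + REREKAN

for seq in (xja, kza, xza):
    print(read_cluster(seq))  # xja, kza, xza
```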

In recent times, Sasak users abandoned the use of the Javanese-influenced rerekan in favour of a series of modified letters (see above), making use, in addition, of some of the unused Kawi letters for the Arabic sounds. In place of ᬓ᬴ x and ᬕ᬴ ɣ, for instance, the new fusion (of ᬓ and ᬳ) ᭆ khot sasak and the Kawi letter ᬖ ga gora are used.

Vowels

Consonants carry an inherent vowel a, pronounced ə at the end of a word and also in prefixes ma-, pa- and da-. There are vowel signs for all vowel sounds in Balinese except the inherent vowel.

There are also independent vowel forms for most vowels for use at the beginning of a word. In the middle of a word, the vowel sign is used over ha. The vowels pepet ᭂ and pepet tedong ᭃ don't have an independent form, and have to be used over ha at the beginning of a word.

In Sasak, independent vowel akara ᬅ can be treated as a consonant inasmuch as it can be followed by an explicit adeg-adeg ᭄ in word- or syllable-final position, where it indicates the glottal stop, eg. ᬳᬫᬅ᭄ amaq; other consonants can also be subjoined to it.

Shaping and positioning

Many of the subjoined and post-fixed consonant forms have different shapes from the standard glyph for that character, for example na ᬦ becomes ◌᭄ᬦ.

In addition, many conjunct clusters combine characters with special shapes, or subtly change parts of glyphs to join smoothly. Often the changes are significant, especially the medial consonants, ya, ra, wa and la. For example, see the sequence <ba, adeg-adeg, ra, adeg-adeg, ya> in ᬩ᭄ᬭ᭄ᬬᬕ᭄ briag (laughter).

Combining vowel signs can also have different shapes depending on the context. For example, the vowel sign tedong typically ligates with the preceding consonant: ha is ᬳ but <ha, tedong> is ᬳᬵ, and subjoined ya is ◌᭄ᬬ but <consonant, adeg-adeg, ya, tedong> is ◌᭄ᬬᬵ.

When two diacritics appear above a consonant, their shape and position need to be adapted.

You can experiment with other examples using the Balinese picker.

Numbers

There is a set of Balinese digits, and they are used in the same way as Latin digits.

However, many of the digit symbols are indistinguishable from other Balinese letters. Numbers are typically surrounded by carik siki, so that they are easily recognisable, eg. ᬩᬮᬶ᭞᭓᭞ᬚᬸᬮᬶ᭞᭑᭙᭘᭒᭟ (Bali, 3 July 1982).
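Because the Balinese digits are ordinary decimal digits occupying U+1B50 to U+1B59, conversion is a simple offset. The following Python sketch illustrates this; the function names `to_balinese_digits` and `delimited` are hypothetical.

```python
BALINESE_DIGIT_ZERO = 0x1B50  # BALINESE DIGIT ZERO; the digits run to U+1B59
CARIK_SIKI = "\u1b5e"         # BALINESE CARIK SIKI


def to_balinese_digits(number: int) -> str:
    """Convert a non-negative integer to Balinese digits."""
    if number < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    return "".join(chr(BALINESE_DIGIT_ZERO + int(d)) for d in str(number))


def delimited(number: int) -> str:
    """Surround the number with carik siki so it is not read as letters."""
    return CARIK_SIKI + to_balinese_digits(number) + CARIK_SIKI


year = to_balinese_digits(1982)
print(year)          # ᭑᭙᭘᭒
print(int(year))     # 1982 (Python's int() accepts Unicode decimal digits)
print(delimited(3))  # ᭞᭓᭞
```

In the example above, the final carik siki after the year is replaced by carik pareren because the number ends the sentence.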

Punctuation

Both ᭚ panti and ᭛ pamada are used to begin a section in text.

᭝ carik pamungkah is used as a colon, and ᭞ carik siki and ᭟ carik pareren are used as comma and full stop respectively.

At the end of a section, ᭟᭜᭟ pasalinan and ᭛᭜᭛ carik agung may be used (depending on what sign began the section). These are encoded using the punctuation ring ᭜ windu together with ᭟ carik pareren and ᭛ pamada.
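In other words, neither sign has a code point of its own. A minimal Python sketch of the two sequences, with a hypothetical helper name `around_windu`:

```python
WINDU = "\u1b5c"          # BALINESE WINDU (the punctuation ring)
PAMADA = "\u1b5b"         # BALINESE PAMADA
CARIK_PAREREN = "\u1b5f"  # BALINESE CARIK PAREREN


def around_windu(sign: str) -> str:
    """Return the three-character sequence <sign, windu, sign>."""
    return sign + WINDU + sign


pasalinan = around_windu(CARIK_PAREREN)  # <carik pareren, windu, carik pareren>
carik_agung = around_windu(PAMADA)       # <pamada, windu, pamada>
```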

In some texts, "holy letters" or modre symbols are made by using ᬁ ulu candra with these: ᭜ᬁ, ᭟ᬁ, ᭛ᬁ.

Text layout

Line breaking and hyphenation

Common practice is to break the sentence at whatever point it reaches the end of a line, except that no line break is allowed within a syllable, or just before a colon, comma or full stop.

In lontar texts where a word must be broken at the end of a line (always after a full syllable), the sign ᭠ pameneng is inserted. This sign is not used as a word-joining hyphen; it is used only in linebreaking.
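The rules above can be approximated in code. The Python sketch below is a rough illustration under my own assumptions, not a tested line-breaking algorithm. It treats a syllable as a base letter plus its combining marks and any consonants joined by adeg-adeg, it keeps a vowelless consonant with the preceding syllable, and it never splits a run of digits (a rule not stated in these notes). The function names are hypothetical.

```python
import unicodedata

ADEG_ADEG = "\u1b44"  # BALINESE ADEG ADEG
ZWNJ = "\u200c"       # ZERO WIDTH NON-JOINER
PAMENENG = "\u1b60"   # BALINESE PAMENENG
# carik pamungkah (colon), carik siki (comma), carik pareren (full stop)
NO_BREAK_BEFORE = {"\u1b5d", "\u1b5e", "\u1b5f"}


def break_opportunities(text: str) -> list[int]:
    """Return the indexes i at which a line break may be placed before text[i]."""
    allowed = []
    for i in range(1, len(text)):
        ch, prev = text[i], text[i - 1]
        if ch == ZWNJ or unicodedata.category(ch).startswith("M"):
            continue  # combining marks and ZWNJ stay with what precedes them
        if ch in NO_BREAK_BEFORE:
            continue  # no break just before colon, comma or full stop
        if prev == ADEG_ADEG:
            continue  # this letter is joined to the previous consonant
        if i + 1 < len(text) and text[i + 1] == ADEG_ADEG:
            continue  # a vowelless consonant stays with the preceding syllable
        if prev.isdigit() and ch.isdigit():
            continue  # assumption: never split a run of digits
        allowed.append(i)
    return allowed


def break_with_pameneng(text: str, i: int) -> str:
    """Break a word before text[i], marking the break with pameneng."""
    if i not in break_opportunities(text):
        raise ValueError(f"no break allowed before index {i}")
    return text[:i] + PAMENENG + "\n" + text[i:]


pakraman = "\u1b27\u1b13\u1b44\u1b2d\u1b2b\u1b26\u1b44"
pak_raman = "\u1b27\u1b13\u1b44\u200c\u1b2d\u1b2b\u1b26\u1b44"
print(break_opportunities(pakraman))   # [4]
print(break_opportunities(pak_raman))  # [4, 5]
```

The approach is deliberately conservative: it may miss some legitimate break points, but under the stated assumptions it should not split a syllable.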

List of basic symbols

This is a list of main characters or character combinations needed for Balinese. Clicking on these characters will open a page in another window. If the character is underlined, the new page will display additional information about that character.

Native consonants ᬳ ᬦ ᬘ ᬭ ᬓ ᬤ ᬢ ᬲ ᬯ ᬮ ᬫ ᬕ ᬩ ᬗ ᬧ ᬚ ᬬ ᬜ
Sanskrit/Kawi consonants ᬔ ᬖ ᬙ ᬛ ᬝ ᬞ ᬟ ᬠ ᬡ ᬣ ᬥ ᬨ ᬪ ᬰ ᬱ
Sasak consonants ᭅ ᭆ ᭇ ᭈ ᭉ ᭊ ᭋ
Independent vowels ᬅ ᬆ ᬇ ᬈ ᬉ ᬊ ᬋ ᬌ ᬍ ᬎ ᬏ ᬐ ᬑ ᬒ
Vowel signs ᬵ ᬶ ᬷ ᬸ ᬹ ᬺ ᬻ ᬼ ᬽ ᬾ ᬿ ᭀ ᭁ ᭂ ᭃ
Diacritics ᬀ ᬁ ᬂ ᬃ ᬄ ᭄ ᬴
Digits ᭐ ᭑ ᭒ ᭓ ᭔ ᭕ ᭖ ᭗ ᭘ ᭙
Punctuation ᭚ ᭛ ᭜ ᭝ ᭞ ᭟ ᭠

 

To test the various contextual forms of these characters, use the Balinese character picker.

Further reading

  1. [Sudewa] The Balinese Alphabet
  2. [Everson et al.] Proposal for encoding the Balinese script in the UCS
  3. [Wikipedia] Balinese script
  4. [Unicode] The Unicode Standard v6.0
  5. [Omniglot] Balinese
  6. [ScriptSource] Balinese

Available at: rishida.net/scripts/balinese/

Content first published 2012-03-10. This version 2014-10-13 10:22 GMT