OCLC Support

Unsupported non-Latin characters

Discover information about which Unicode characters are currently unsupported in WorldShare Record Manager.

Some Unicode characters are not currently supported in Record Manager and cause a validation error. For information about the Telugu and Sinhala scripts as they relate to the Library of Congress scripts expansion project, see Details from OCLC: Library of Congress scripts.

If you need to enter these characters in a bibliographic or authority record in Record Manager:

  1. Enter the name of the character within square brackets, using the Unicode standard if available (e.g., enter [schwa]).
  2. Alternatively, enter the hex value provided for the character in Connexion.
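
The two options above can be sketched in Python. This is a minimal illustration, not an OCLC tool: the helper names `bracketed_name` and `hex_value` are hypothetical, the bracketed form uses the full Unicode character name (catalogers may prefer a shorter form such as [schwa]), and the hex form simply mirrors the values listed on this page. The exact syntax Connexion accepts should be confirmed against the Connexion documentation.

```python
import unicodedata


def bracketed_name(ch: str) -> str:
    """Return the character's Unicode name in square brackets (option 1).

    Falls back to the code point when the local Unicode database
    does not know the character (common for recently added characters).
    """
    name = unicodedata.name(ch, "")
    if name:
        return f"[{name.lower()}]"
    return f"[U+{ord(ch):04X}]"


def hex_value(ch: str) -> str:
    """Return the hex value in the form shown on this page (option 2)."""
    return f"&#x{ord(ch):04X}"


if __name__ == "__main__":
    print(bracketed_name("\u0259"))  # schwa
    print(hex_value("\u07FD"))       # NKO DANTAYALAN
```

The full-name output is deliberately verbose; shortening it to a form like [schwa] is a cataloging decision that this sketch leaves to the user.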

Currently unsupported characters

| Name | Unicode | Hex value | Character | Title | Details |
|---|---|---|---|---|---|
| BALINESE ADEG ADEG | U+1B4C | &#x1B4C | | | Used in older texts in place of the ja + nya conjunct. Script/Block: Balinese (U+1B00–U+1B7F) |
| BALINESE LETTER NA RAMBAT | U+1B4F | &#x1B4F | | | Block: Balinese (U+1B00–U+1B7F) |
| CANADIAN SYLLABICS NATTILIK HA | U+11AB4 (part of the Supplementary Multilingual Plane) | &#x11AB4 | 𑪴 | | Part of the Nattilik dialect of Inuktitut. |
| LAO LETTER PA | U+0E90 | &#x0E90 | | | Block: Lao (U+0E80–U+0EFF) |
| LAO LETTER SO TAM | U+0E8B | &#x0E8B | | | Block: Lao (U+0E80–U+0EFF) |
| NKO DANTAYALAN | U+07FD | &#x07FD | ߽ | | Used to abbreviate units of measure. Script/Block: NKo (U+07C0–U+07FF) |
| SINHALA SIGN CANDRABINDU | U+0D81 | &#x0D81 | ඁ | සංස්කෘත-සිංහල ශබ්දකෝෂය | Used at the end of කෝෂය (kōṣayaṁ) |
| TELUGU LETTER NAKAARA POLLU | U+0C5D | &#x0C5D | ౝ | శ్రీ కృష్దేవమహారాయల ప్రభుత్వము | Used in కృష్దేవ (archaic spelling of Kṛṣṇadēva) |
| TELUGU SIGN NUKTA | U+0C3C | &#x0C3C | ఼ | తెలుగు-ఉర్దూ ఫ఼ారసీ పదకోశము | Used in ఖ఼ాన్ (author name) and ఫ఼ారసీ (Fārsī in title) |
| TELUGU SIGN SIDDHAM | U+0C77 | &#x0C77 | ౷ | సిద్ధిరస్తు | Used as an invocation at the beginning of inscriptions. |
| TELUGU SIGN VISARGA | U+0C04 | &#x0C04 | | తెలుగు-ఉర్దూ ఫ఼ారసీ పదకోశము | Used in ఖ఼ాన్ (author name) and ఫ఼ారసీ (Fārsī in title) |
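
Before saving a record, it can help to check a field's text for the code points listed above. The following is a minimal sketch under stated assumptions: the set `UNSUPPORTED_CODE_POINTS` is copied by hand from this page and will go stale if OCLC changes the list, and `find_unsupported` is a hypothetical helper, not part of Record Manager or any OCLC API.

```python
# Code points listed on this page as unsupported (copied by hand; assumption:
# the list is current and complete only as of this page's content).
UNSUPPORTED_CODE_POINTS = {
    0x1B4C, 0x1B4F,                   # Balinese
    0x11AB4,                          # Canadian Syllabics (Nattilik)
    0x0E90, 0x0E8B,                   # Lao
    0x07FD,                           # NKo
    0x0D81,                           # Sinhala
    0x0C5D, 0x0C3C, 0x0C77, 0x0C04,   # Telugu
}


def find_unsupported(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for each listed character found in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ord(ch) in UNSUPPORTED_CODE_POINTS
    ]


if __name__ == "__main__":
    # The Telugu nukta (U+0C3C) sits at index 1 of this sample string.
    print(find_unsupported("\u0C16\u0C3C\u0C3E\u0C28\u0C4D"))
```

Each hit reports the position in the string, so the character can then be replaced with a bracketed name as described at the top of this page.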