Search non-Latin records

Discover how to search for non-Latin records in Connexion client.

Enter searches

Search for records containing non-Latin script data using either script search terms or romanized (Latin-script equivalent) search terms.
Both interactive and batch searching support non-Latin script search terms (Cataloging > Search > WorldCat and Batch > Enter Bibliographic Search Keys).
Alternatively, copy and paste non-Latin script data into client searches from sources external to the client.
- Non-Latin script search terms must be based on Unicode.
- If the search term contains Unicode characters that cannot be converted, the search may retrieve no matching records.
About using search indexes for non-Latin script search terms:
- Use the same indexes (labels and punctuation) for non-Latin script searches that you use for Latin script searches. Enter the index labels and punctuation using Latin script.
- Do not use derived searching for non-Latin scripts.
- Add the same qualifiers to both Latin and non-Latin script searches. Enter them using Latin script.
- Browsing to scan indexes for a match is available for any supported script. Browsing scans for the exact data string followed by any other data, providing automatic truncation. Enter only as many characters as are unique enough to retrieve matching record(s).
- For Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Greek, Hebrew, Syriac, and Tamil script searches, use word or phrase search indexes and word or phrase browse indexes.
- For CJK script searches, the system indexes both single characters and immediately adjacent characters in a field. Use the following search strategies:
  - Word search – Enter an index label and a colon (for example, ti:) followed by a character string with no spaces to find a single word, or followed by more than one character string separated by a space to find multiple words, anywhere in an indexed field.
  - Phrase search – Enter an index label and an equal sign (for example, ti=) followed by a character string to find exact occurrences, starting with the first character in an indexed field and including each succeeding character. Truncate the character string to find the string followed by any other data without having to enter the entire data string as it appears in a field or subfield.
    Note:
    - If you enter a search string in quotation marks (e.g., "[search string]"), then the results returned will include both records containing the exact string and records containing each character in the string.
    - Enter a minimum of three CJK characters if you truncate a search.
  - Phrase browse – Enter the Scan command, an index label, and an equal sign (for example, sca ti=) followed by a character string. Phrase browsing scans an index for occurrences of the browse string at the beginning of indexed fields, followed by any other data (automatic truncation).
    Note: Because all MARC-8 CJK characters are indexed singly, browsing for a word would scan for the first character only, and the results would not be meaningful.
- For Thai script searches, the system treats the entire data string you enter as both a word and a phrase, since Thai text has no spaces between words. Search for Thai terms using word or phrase search indexes and word or phrase browse indexes.
- If you want to retrieve all records or see sample records containing a particular script, use the Character Sets Present WorldCat search index (label vp:) with the assigned code for a script. A list of all possible script codes that could be searched is included here.

To enter one of the searches above to retrieve all records that contain a specified script, use the command line in the Search WorldCat window (Cataloging > Search > WorldCat).

Note: If a search for a particular script alone retrieves too many WorldCat records (limit 1,500 records), you must limit the search and try again. See Use WorldCat search results for more about how the client displays WorldCat search results.

Examples

vp:ara/1991-2 (search for Arabic script records limited to those published in 1991 and 1992)
vp:ara and la:per (search for Arabic script records limited to those describing Persian language items)
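
The two example searches follow a simple pattern: the vp: label, a script code, and then either a slash qualifier for years or a second index term joined with "and". A minimal Python sketch of that pattern, assuming hypothetical helper names (script_by_years and script_by_language are illustrative only and are not part of Connexion client):

```python
def script_by_years(script_code, years):
    """Build a vp: search limited by publication years, e.g. vp:ara/1991-2."""
    return f"vp:{script_code}/{years}"


def script_by_language(script_code, language_code):
    """Build a vp: search combined with a language (la:) index term."""
    return f"vp:{script_code} and la:{language_code}"


if __name__ == "__main__":
    # Reproduces the two example searches from this section.
    print(script_by_years("ara", "1991-2"))
    print(script_by_language("ara", "per"))
```

The resulting strings are what you would type on the command line in the Search WorldCat window; the sketch only assembles text and does not contact WorldCat.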

See Search WorldCat for more about word and phrase searching and search methods.
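
As a rough illustration of the CJK word search, phrase search, and phrase browse syntax described earlier, the following Python sketch assembles the three kinds of search strings and checks the three-character minimum for truncated CJK searches. All function names are hypothetical, and the CJK character test is a simplified assumption covering only a few common Unicode ranges; it is not how the client itself identifies CJK data.

```python
# Simplified assumption: a few common CJK ranges only
# (CJK Unified Ideographs, Hiragana/Katakana, Hangul syllables).
CJK_RANGES = [(0x4E00, 0x9FFF), (0x3040, 0x30FF), (0xAC00, 0xD7A3)]


def is_cjk(ch):
    """Return True if the character falls in one of the assumed CJK ranges."""
    return any(low <= ord(ch) <= high for low, high in CJK_RANGES)


def word_search(index_label, *terms):
    """Index label and colon, then one or more strings separated by spaces."""
    return f"{index_label}:" + " ".join(terms)


def phrase_search(index_label, phrase):
    """Index label and equal sign, then the character string."""
    return f"{index_label}={phrase}"


def phrase_browse(index_label, phrase):
    """Scan command, index label, and equal sign, then the character string."""
    return f"sca {index_label}={phrase}"


def long_enough_to_truncate(term):
    """A truncated CJK search needs at least three CJK characters."""
    return sum(1 for ch in term if is_cjk(ch)) >= 3


if __name__ == "__main__":
    print(word_search("ti", "中国历史"))
    print(phrase_search("ti", "中国历史"))
    print(phrase_browse("ti", "中国"))
    print(long_enough_to_truncate("中国"))
```

Note that the index label and punctuation stay in Latin script while the search term itself is in CJK script, as the guidelines above require.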

Sort order of search results

You can select how the results of non-Latin script WorldCat searches are sorted:

- Alphabetically by the Latin script data, or
- In Unicode order by the non-Latin script data
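
To see why the two sort orders can differ, consider this small Python sketch. The sample records and the sort_results function are invented for illustration, and the sketch assumes that "Unicode order" means plain code point order; the client's actual sorting rules are not documented here.

```python
# Hypothetical records pairing romanized data with Cyrillic script data.
records = [
    {"latin": "Tolstoi", "script": "Толстой"},
    {"latin": "Chekhov", "script": "Чехов"},
    {"latin": "Pushkin", "script": "Пушкин"},
]


def sort_results(items, primary_sort_by_latin=True):
    """Sort by Latin script data, or by code point order of the script data."""
    if primary_sort_by_latin:
        return sorted(items, key=lambda r: r["latin"].casefold())
    # Python compares strings by Unicode code point.
    return sorted(items, key=lambda r: r["script"])


if __name__ == "__main__":
    print([r["latin"] for r in sort_results(records, True)])
    print([r["latin"] for r in sort_results(records, False)])
```

With these sample records, the Latin sort puts Chekhov first, while the code point sort puts Pushkin first because П precedes Т and Ч in Unicode.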

To check or change the option for sort order for WorldCat search results:

Navigate to Tools > Options or press <Alt><T><O>.
Click the International tab.
Select or clear the Primary Sort by Latin Script check box. When the check box is selected, search results are sorted in alphabetical order by the Latin script data.
- If you deselect the check box, search results are sorted in Unicode order by the non-Latin script data.
- The sort order selected also determines the sort order of local bibliographic save file and local constant data search results.
- Tamil Unicode 4.0 codes are not in collating order. The default, alphabetical sorting by Latin script, is recommended if romanized (Latin-equivalent) data is included in the record with Tamil script data.
When finished, perform one of the following actions:
- Click Close or press <Enter> to apply the settings and close the Options window.
- Click Apply to apply the settings without closing the window.
