Baraha SDK overview

The functionality of the Baraha software is available as an SDK (Software Development Kit) for building custom Kannada applications with ASP, Visual Basic, and Visual C++. The Baraha SDK is implemented in several DLLs, both as API (Application Programming Interface) functions and as COM (Component Object Model) objects. The DLLs that ship with the Baraha software and with the Baraha SDK are exactly the same; in fact, Baraha itself is built using the Baraha SDK.

Benefits of SDK:

The Baraha SDK can be split into two sets of APIs:

1) Baraha Direct
2) Baraha Script Processing


Baraha Direct API:

Baraha Direct is a utility of the Baraha software that lets users type Indian language text directly into other applications using Baraha's popular transliteration keyboard. It makes using Indian languages as easy as using English in applications such as MS Word, PageMaker, Notepad, or custom VB/VC++ applications.

The Baraha Direct functionality is implemented in a dynamic link library called brh_direct.dll. BarahaDirect.exe is actually a very thin application that calls API functions in brh_direct.dll. When developers use brh_direct.dll in their VB/VC++ applications, Baraha Direct becomes part of their application, so end users do not need the Baraha software installed on their systems to input Indian language text. When shipping their applications, developers only have to distribute brh_direct.dll, the language DLLs (brh_kanapi.dll, brh_devapi.dll, etc.), and the required fonts.

The Visual Basic sample programs "BarahaDirectVB" and "people_db" show the usage of the Baraha Direct API.

Baraha Direct API Reference


Baraha script processing API:

The scripts supported by Baraha are implemented as plugins, one plugin per script. The plugins support conversion between various data formats, and the text of any given language can be stored in any of these formats, depending on need and context. All the plugins contain the same API functions for converting data from one format to another, as well as API functions for comparing strings stored in any of these formats. In addition to the API functions, the plugins contain COM objects, which expose the same functionality through COM interfaces. The COM objects are useful in ASP/VBScript programming for building web-based applications.

Script: Kannada
Languages: Kannada, Konkani, Tulu
Plugin DLL: brh_kanapi.dll
Library: brh_kanapi.lib
Header files: brh_kanapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Kanapi.kanapi50
Data formats: KANTRANS, KAGAPA, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Kannada, BRH Vijay, BRH Kailasam, BRH Bengaluru, BRH Sirigannada, BRH Amerikannada, BRH Kasturi (Karnataka Govt. standard fonts)

Script: Devanagari
Languages: Hindi, Marathi, Sanskrit, Konkani, Kashmiri, Nepali, Sindhi
Plugin DLL: brh_devapi.dll
Library: brh_devapi.lib
Header files: brh_devapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Devapi.devapi50
Data formats: DEVTRANS1, DEVTRANS2, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Devanagari, BRH Kalidasa

Script: Tamil
Languages: Tamil
Plugin DLL: brh_tamapi.dll
Library: brh_tamapi.lib
Header files: brh_tamapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Tamapi.tamapi50
Data formats: TAMTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Tamil (TSCII 1.7 compatible fonts)

Script: Telugu
Languages: Telugu
Plugin DLL: brh_telapi.dll
Library: brh_telapi.lib
Header files: brh_telapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Telapi.telapi50
Data formats: TELTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Telugu

Script: Malayalam
Languages: Malayalam
Plugin DLL: brh_malapi.dll
Library: brh_malapi.lib
Header files: brh_malapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Malapi.malapi50
Data formats: MALTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Malayalam

Script: Gujarati
Languages: Gujarati
Plugin DLL: brh_gujapi.dll
Library: brh_gujapi.lib
Header files: brh_gujapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Gujapi.gujapi50
Data formats: GUJTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Gujarati

Script: Gurumukhi
Languages: Punjabi
Plugin DLL: brh_gurapi.dll
Library: brh_gurapi.lib
Header files: brh_gurapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Gurapi.gurapi50
Data formats: GURTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Gurumukhi

Script: Bengali
Languages: Bengali, Assamese, Manipuri
Plugin DLL: brh_benapi.dll
Library: brh_benapi.lib
Header files: brh_benapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Benapi.benapi50
Data formats: BENTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Bengali

Script: Oriya
Languages: Oriya
Plugin DLL: brh_oriapi.dll
Library: brh_oriapi.lib
Header files: brh_oriapi_api.h, brhcode.h, collcode.h
COM ProgID: Brh_Oriapi.oriapi50
Data formats: ORITRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN
TrueType fonts: BRH Oriya


BRHCODE:
Baraha uses a common 8-bit internal code called BRHCODE for all the languages it supports. The plugins use BRHCODE as the intermediate stage when converting data from one format to another. Since all the plugins support BRHCODE, text can also be converted from one script to another by using BRHCODE as the intermediate stage. BRHCODE is also an efficient way to store Indian language text in databases and files. BRHCODE is declared in the brhcode.h header file.
 
Transliteration Code:
The KANTRANS, KAGAPA, DEVTRANS1, DEVTRANS2, DEVTRANS3, TAMTRANS, TELTRANS, MALTRANS, GUJTRANS, GURTRANS, BENTRANS and ORITRANS formats specify a human-readable encoding that follows the transliteration rules of the corresponding languages. The transliteration rules are almost the same across languages, with only slight differences.

"KANTRANS": Baraha Kannada Transliteration scheme.
"KAGAPA": KGP Keyboard.
"DEVTRANS1": Baraha Devanagari Transliteration scheme for Sanskrit.
"DEVTRANS2": Baraha Devanagari Transliteration scheme for Hindi and Marathi.
"TAMTRANS": Baraha Tamil Transliteration scheme.
"TELTRANS": Baraha Telugu Transliteration scheme.
"MALTRANS": Baraha Malayalam Transliteration scheme.
"GUJTRANS": Baraha Gujarati Transliteration scheme.
"GURTRANS": Baraha Gurumukhi Transliteration scheme.
"BENTRANS": Baraha Bengali Transliteration scheme.
"ORITRANS": Baraha Oriya Transliteration scheme.

ANSI:
ANSI is the 8-bit code generally used by applications for displaying text. The ANSI output of a plugin is specific to the TrueType font encoding supported by that plugin, so the ANSI output from one plugin cannot be displayed using the fonts of another plugin.
 
The BRH Kannada font follows the Karnataka Govt standard. The BRH Tamil font follows the TSCII 1.7 standard. The fonts of Devanagari, Telugu, Malayalam, Gujarati, Gurumukhi, Bengali and Oriya follow Baraha's proprietary standard. For a given language, any other fonts that follow the same font encoding as the above fonts can be used to display the text.
 
UNICODE:
UNICODE is the worldwide standard for storing multilingual data; the plugins handle it as 16-bit codes. Every plugin understands only the codes in its own slot of the UNICODE range. For example, the brh_kanapi.dll plugin understands UNICODE only in the range 0x0C80-0x0CFF; if Tamil UNICODE is passed to this plugin for conversion, it is ignored.
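
The per-script range check can be illustrated with a minimal C++ sketch. It is not part of the Baraha SDK: the names isKannadaCode and keepKannada are hypothetical, and dropping out-of-range codes is only one possible reading of "ignored". How the plugin actually treats such codes is not specified here.

```cpp
#include <cassert>
#include <string>

// Kannada slot of the UNICODE range, as given in the text for brh_kanapi.dll.
constexpr char16_t kKannadaFirst = 0x0C80;
constexpr char16_t kKannadaLast  = 0x0CFF;

// True if the 16-bit code lies inside the Kannada slot.
bool isKannadaCode(char16_t c) {
    return c >= kKannadaFirst && c <= kKannadaLast;
}

// Hypothetical helper (assumption): keep only the codes that a Kannada
// plugin would recognize and drop everything else.
std::u16string keepKannada(const std::u16string& text) {
    std::u16string out;
    for (char16_t c : text) {
        if (isKannadaCode(c)) {
            out.push_back(c);
        }
    }
    assert(out.size() <= text.size());  // filtering never adds codes
    return out;
}
```

For example, the Tamil letter KA (0x0B95) falls outside 0x0C80-0x0CFF, so it would not survive keepKannada, while the Kannada letter KA (0x0C95) would.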
 
UTF-8:
UTF-8 is an encoding that stores UNICODE data as sequences of 8-bit codes.
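
As a rough illustration of how a 16-bit UNICODE value maps to 8-bit codes, the following C++ sketch encodes codes in the range 0x0000-0xFFFF as UTF-8. The function names are hypothetical, and the code follows the standard UTF-8 bit layout rather than the plugins' own implementation, which is not shown in this document. Surrogate pairs are not handled.

```cpp
#include <cassert>
#include <string>

// Append the UTF-8 encoding of one 16-bit code to out.
// Assumption: the code is a single value in 0x0000-0xFFFF, not a surrogate.
void appendUtf8(char16_t c, std::string& out) {
    if (c < 0x80) {
        // 1 byte: 0xxxxxxx
        out.push_back(static_cast<char>(c));
    } else if (c < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xC0 | (c >> 6)));
        out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    } else {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xE0 | (c >> 12)));
        out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    }
}

// Encode a whole string of 16-bit codes as UTF-8.
std::string toUtf8(const std::u16string& text) {
    std::string out;
    for (char16_t c : text) {
        appendUtf8(c, out);
    }
    assert(out.size() >= text.size());  // every code yields at least one byte
    return out;
}
```

All codes in the Indian script ranges (for example 0x0C80-0x0CFF for Kannada) take the three-byte branch, so each such character occupies 3 bytes in UTF-8.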

COLLCODE:
COLLCODE is the 16-bit code used by Baraha for sorting text in Indian language alphabetical order. The BrhCompare and BrhCompareEx API functions first convert the strings into COLLCODE and then compare them in binary. COLLCODE is declared in the collcode.h header file. Every script is given a range of 256 codes in the private use area of UNICODE (0xE000 - 0xF900), which avoids clashes with the real UNICODE codes.
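
The "convert, then binary compare" idea can be sketched in C++ as follows. This is not the implementation of BrhCompare or BrhCompareEx: the conversion step is omitted, the type and function names are hypothetical, and the sketch only shows how strings that are already in COLLCODE form could be compared code by code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A string already converted to 16-bit COLLCODE values (hypothetical type).
using CollString = std::vector<std::uint16_t>;

// Binary comparison: returns a negative value, zero, or a positive value,
// in the style of strcmp. A shorter string sorts before a longer one
// that starts with the same codes.
int compareCollCode(const CollString& a, const CollString& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            return a[i] < b[i] ? -1 : 1;
        }
    }
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}

// Sort a list of COLLCODE strings using the binary comparison above.
void sortByCollCode(std::vector<CollString>& items) {
    auto less = [](const CollString& a, const CollString& b) {
        return compareCollCode(a, b) < 0;
    };
    std::sort(items.begin(), items.end(), less);
    assert(std::is_sorted(items.begin(), items.end(), less));
}
```

Because the alphabetical order is built into the code values themselves, no language-specific logic is needed at comparison time.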

When strings stored in COLLCODE format are sorted by a database in binary mode, they are returned in the correct sort order of the Indian languages. Databases support different types of sorting, such as English case-sensitive, English case-insensitive, binary, etc. Normally a database is set up to use the English case-insensitive sort order, which gives the best support for sorting English text. If the binary sort order is used for the sake of Indian languages, it affects the way English text is sorted.

There is, however, a technique that gives the proper sort order for Indian language text and English text at the same time: store the COLLCODE in the form of hexadecimal strings. Every COLLCODE, which is 2 bytes, is converted to 4 bytes of collation string ("0000"-"FFFF"). Because these strings have a fixed width per code and use only digits and the letters A-F, they sort in the same order as the underlying codes irrespective of the sorting mechanism used by the database, provided the hexadecimal digits are written in a consistent case. Storing text in COLLCODE as hex strings in the database has advantages and disadvantages.
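
A minimal C++ sketch of the hex-string technique is shown below. The function name collCodeToHex is hypothetical and the code values used to exercise it are made up for illustration; the real values come from collcode.h.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Convert 16-bit COLLCODE values to a collation string of 4 uppercase
// hexadecimal characters per code ("0000"-"FFFF").
std::string collCodeToHex(const std::vector<std::uint16_t>& codes) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(codes.size() * 4);
    for (std::uint16_t c : codes) {
        out.push_back(digits[(c >> 12) & 0xF]);  // most significant nibble first
        out.push_back(digits[(c >> 8) & 0xF]);
        out.push_back(digits[(c >> 4) & 0xF]);
        out.push_back(digits[c & 0xF]);
    }
    assert(out.size() == codes.size() * 4);  // fixed width keeps the order intact
    return out;
}
```

Writing the most significant nibble first and always using uppercase digits is what keeps the text order of the strings identical to the numeric order of the codes. A string of six codes produces 24 characters.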

The advantage is that we don't have to set up the binary sort order for the database, which would affect the way English strings are sorted. Instead, we can set up the database to use the English case-sensitive or English case-insensitive sort order. This ensures the proper sort order for both Indian language and English text fields at the same time. The disadvantage is that more space is required for storing the data. For example, the word "baraha" requires just 6 bytes when stored in BRHCODE format, whereas it requires 24 bytes when stored in COLLCODE hex string format. Therefore, this technique should be used only for the fields that are used as sort criteria; it is not necessary for other fields.

LATIN:
LATIN is the 8-bit code which, when formatted with the BRH Latin font, displays latinized Indian language text (ISO 15919).


The Visual Basic sample programs "BarahaConvertVB" and "people_db" and the VC++ sample programs "BarahaConvert" and "BarahaSort" show the usage of the Baraha Script Processing API.

Kannada script processing API reference
Devanagari script processing API reference
Tamil script processing API reference
Telugu script processing API reference
Malayalam script processing API reference
Gujarati script processing API reference
Gurumukhi script processing API reference
Bengali script processing API reference
Oriya script processing API reference


Notes:

The BarahaConvertVB sample uses Forms 2.0 (Fm20.dll) ActiveX controls to demonstrate the UNICODE functionality. These controls must be installed on your system. See MSDN article Q224305 for more information.

The BarahaConvert VC++ sample uses a UNICODE edit window to show the UNICODE conversion functionality. If you run this sample program on Windows 95/98 systems, you may not see the UNICODE window at all. On Windows NT, you will see the UNICODE window, but the Indian language text may not be displayed properly.

See also: Indian language UNICODE support in Windows operating systems.