Baraha SDK overview
The functionality of the Baraha software is available as an SDK (Software Development Kit) for building custom Kannada applications with ASP, Visual Basic and Visual C++. The Baraha SDK is implemented in several DLLs, which expose it both as API (Application Programming Interface) functions and as COM (Component Object Model) objects. The DLLs that ship with the Baraha software and with the Baraha SDK are exactly the same. In fact, Baraha itself is built using the Baraha SDK.
Benefits of SDK:
The Baraha SDK can be split into two sets of APIs:
1) Baraha Direct
2) Baraha Script Processing
Baraha Direct API:
Baraha Direct is a utility of the Baraha software with which one can type Indian language text directly into other applications using Baraha's popular transliteration keyboard. This makes using Indian languages as easy as using English in applications such as MS Word, PageMaker, Notepad, or custom VB/VC++ applications.
The Baraha Direct functionality is implemented in a Dynamic Link Library called brh_direct.dll. BarahaDirect.exe is actually a very thin application that calls API functions in brh_direct.dll. When developers use brh_direct.dll in their VB/VC++ applications, Baraha Direct becomes part of their application. Therefore, end users do not need the Baraha software installed on their system to input Indian language text. When developers ship their applications, they only have to distribute brh_direct.dll, the language DLLs (brh_kanapi.dll, brh_devapi.dll, etc.) and the required fonts.
The Visual Basic sample programs "BarahaDirectVB", and "people_db" show the usage of the Baraha Direct API.
Baraha script processing API:
The various scripts supported by Baraha are implemented using plugins. Each script has its own plugin. The plugins support conversion between various data formats. The text of any given language can be stored in any of the formats based on the need and context. All the plugins contain the same API functions for converting the data from one format to another. The plugins also contain API functions for comparing the strings stored in any of these formats. In addition to API functions, the plugins contain COM objects, which expose the same functionality through COM interfaces. The COM objects are useful in ASP/VB Script programming for building web-based applications.
| Script | Kannada | Devanagari | Tamil | Telugu | Malayalam | Gujarati | Gurumukhi | Bengali | Oriya |
| Languages | Kannada, Konkani, Tulu | Hindi, Marathi, Sanskrit, Konkani, Kashmiri, Nepali, Sindhi | Tamil | Telugu | Malayalam | Gujarati | Punjabi | Bengali, Assamese, Manipuri | Oriya |
| Plugin DLL | brh_kanapi.dll | brh_devapi.dll | brh_tamapi.dll | brh_telapi.dll | brh_malapi.dll | brh_gujapi.dll | brh_gurapi.dll | brh_benapi.dll | brh_oriapi.dll |
| Library | brh_kanapi.lib | brh_devapi.lib | brh_tamapi.lib | brh_telapi.lib | brh_malapi.lib | brh_gujapi.lib | brh_gurapi.lib | brh_benapi.lib | brh_oriapi.lib |
| Header Files | brh_kanapi_api.h, brhcode.h, collcode.h | brh_devapi_api.h, brhcode.h, collcode.h | brh_tamapi_api.h, brhcode.h, collcode.h | brh_telapi_api.h, brhcode.h, collcode.h | brh_malapi_api.h, brhcode.h, collcode.h | brh_gujapi_api.h, brhcode.h, collcode.h | brh_gurapi_api.h, brhcode.h, collcode.h | brh_benapi_api.h, brhcode.h, collcode.h | brh_oriapi_api.h, brhcode.h, collcode.h |
| COM ProgID | Brh_Kanapi.kanapi50 | Brh_Devapi.devapi50 | Brh_Tamapi.tamapi50 | Brh_Telapi.telapi50 | Brh_Malapi.malapi50 | Brh_Gujapi.gujapi50 | Brh_Gurapi.gurapi50 | Brh_Benapi.benapi50 | Brh_Oriapi.oriapi50 |
| Data Formats | KANTRANS, KAGAPA, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN | DEVTRANS1, DEVTRANS2, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN | TAMTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN | TELTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN | MALTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN | GUJTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN | GURTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN | BENTRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN | ORITRANS, BRHCODE, COLLCODE, ANSI, UNICODE, UTF-8, LATIN |
| TrueType Fonts | BRH Kannada, BRH Vijay, BRH Kailasam, BRH Bengaluru, BRH Sirigannada, BRH Amerikannada, BRH Kasturi (Karnataka Govt. standard fonts) | BRH Devanagari, BRH Kalidasa | BRH Tamil (TSCII 1.7 compatible fonts) | BRH Telugu | BRH Malayalam | BRH Gujarati | BRH Gurumukhi | BRH Bengali | BRH Oriya |
BRHCODE:
Baraha uses a common 8-bit internal code called BRHCODE for all the languages it supports.
The plugins use BRHCODE as the intermediate stage when converting data from one format
to another. Since all the plugins support BRHCODE, text can also be converted from one
script to another by using BRHCODE as the intermediate stage. BRHCODE is also an
efficient way to store Indian language text in databases and files. BRHCODE is
declared in the brhcode.h header file.
Transliteration Code:
The KANTRANS, KAGAPA, DEVTRANS1, DEVTRANS2, DEVTRANS3, TAMTRANS, TELTRANS, MALTRANS,
GUJTRANS, GURTRANS, BENTRANS and ORITRANS formats specify a human-readable encoding
that follows the transliteration rules of the corresponding language. The transliteration
rules of the various languages are almost the same, with only slight differences between
languages.
"KANTRANS": Baraha Kannada Transliteration scheme.
"KAGAPA": KGP Keyboard.
"DEVTRANS1": Baraha Devanagari Transliteration scheme for Sanskrit.
"DEVTRANS2": Baraha Devanagari Transliteration scheme for Hindi and Marathi.
"TAMTRANS": Baraha Tamil Transliteration scheme.
"TELTRANS": Baraha Telugu Transliteration scheme.
"MALTRANS": Baraha Malayalam Transliteration scheme.
"GUJTRANS": Baraha Gujarati Transliteration scheme.
"GURTRANS": Baraha Gurumukhi Transliteration scheme.
"BENTRANS": Baraha Bengali Transliteration scheme.
"ORITRANS": Baraha Oriya Transliteration scheme.
ANSI:
ANSI is the 8-bit code that is generally used by applications for displaying text.
The ANSI output of a plugin is specific to the TrueType font encoding supported by that
plugin. The ANSI output of one plugin cannot be displayed using the fonts of another plugin.
The BRH Kannada font follows the Karnataka Govt. standard. The BRH Tamil font follows the
TSCII 1.7 standard. The fonts for Devanagari, Telugu, Malayalam, Gujarati, Gurumukhi,
Bengali and Oriya follow Baraha's proprietary standard. For a given language, any other
font that follows the same font encoding as the fonts above can be used to display the
text.
UNICODE:
UNICODE is the 16-bit code which is accepted worldwide as the standard for storing
multi-lingual data. Every plugin understands only the codes in its own script's slot of
the UNICODE range. For example, the brh_kanapi.dll plugin understands UNICODE in the range
0x0C80-0x0CFF only. If Tamil UNICODE is passed to this plugin for conversion, it will be
ignored.
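The range rule above can be illustrated with a small C++ sketch. It is not taken from the plugin source: the function names isKannadaCode and keepKannadaOnly are hypothetical, and how the real plugin treats spaces, digits or punctuation is not stated here, so this sketch simply drops everything outside the Kannada slot.

```cpp
#include <cassert>
#include <string>

// Kannada slot of the UNICODE range, as quoted in the text above.
constexpr char16_t kKannadaFirst = 0x0C80;
constexpr char16_t kKannadaLast  = 0x0CFF;

// True if the 16-bit code lies inside the Kannada slot.
// (Hypothetical helper, not a Baraha API function.)
constexpr bool isKannadaCode(char16_t code) {
    return code >= kKannadaFirst && code <= kKannadaLast;
}

// Copies only the Kannada codes, imitating a plugin that ignores
// codes belonging to another script. Assumption: everything outside
// the slot is dropped; the real plugin may treat ASCII differently.
std::u16string keepKannadaOnly(const std::u16string& text) {
    std::u16string out;
    for (char16_t code : text) {
        if (isKannadaCode(code)) {
            out.push_back(code);
        }
    }
    assert(out.size() <= text.size());  // filtering never adds codes
    return out;
}
```

For instance, passing a string that mixes Kannada KA (0x0C95) with Tamil KA (0x0B95) returns only the Kannada code.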
UTF-8:
UTF-8 is an encoding that stores UNICODE data as a sequence of 8-bit codes.
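As an illustration of how a 16-bit UNICODE value becomes 8-bit codes, the following C++ sketch applies the standard UTF-8 bit layout for one, two and three byte sequences. The names appendUtf8 and toUtf8 are hypothetical and are not Baraha API functions; the sketch assumes the input contains no surrogate codes, which holds for the Indian script ranges discussed here.

```cpp
#include <cassert>
#include <string>

// Appends the UTF-8 form of one 16-bit code to 'out'.
// Assumption: 'code' is not a surrogate (0xD800-0xDFFF).
void appendUtf8(char16_t code, std::string& out) {
    assert(code < 0xD800 || code > 0xDFFF);
    if (code < 0x80) {
        // 1 byte: 0xxxxxxx
        out.push_back(static_cast<char>(code));
    } else if (code < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xC0 | (code >> 6)));
        out.push_back(static_cast<char>(0x80 | (code & 0x3F)));
    } else {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xE0 | (code >> 12)));
        out.push_back(static_cast<char>(0x80 | ((code >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (code & 0x3F)));
    }
}

// Converts a whole string of 16-bit codes to UTF-8.
std::string toUtf8(const std::u16string& text) {
    std::string out;
    for (char16_t code : text) {
        appendUtf8(code, out);
    }
    return out;
}
```

Every code in the Indian script ranges (for example 0x0C80-0x0CFF for Kannada) takes the three-byte branch, so such text needs 3 bytes per code in UTF-8 compared with 2 bytes per code in UNICODE.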
COLLCODE:
COLLCODE is the 16-bit code used by Baraha for sorting text in Indian
language alphabetical order. In the BrhCompare and BrhCompareEx API functions, the strings
are first converted into COLLCODE and then compared as binary data. COLLCODE is declared in
the collcode.h header file. Every script is given a range of 256 codes in the private use
area of UNICODE (0xE000-0xF8FF). This avoids any clash with the real UNICODE
codes.
When strings stored in COLLCODE format are sorted by a database in binary mode, they come back in the correct sort order for the Indian language. Databases support several sort orders, such as English case-sensitive, English case-insensitive and binary. Normally a database is set up to use the English case-insensitive sort order, which gives the best results for English text. Switching to binary sort order for the sake of Indian languages changes the way English text is sorted.
There is, however, a technique that gives the proper sort order for Indian language and English text at the same time: store the COLLCODE as hexadecimal strings. Every COLLCODE, which is 2 bytes, is converted to 4 bytes (0000-FFFF) of collation string. Because every code then has the same width and the hex digits keep their order under the usual text sort orders (provided the digits A-F are written in one consistent case), the hexadecimal strings sort in COLLCODE order irrespective of the sorting mechanism used by the database. Storing text in COLLCODE as hex strings has advantages and disadvantages.
The advantage is that we don't have to set up binary sort order for the database, which would affect the way English strings are sorted. Instead, we can set up the database to use the English case-sensitive or English case-insensitive sort order. This ensures the proper sort order for both Indian language and English text fields at the same time. The disadvantage is that it requires more space for storing the data. For example, the word "baraha" requires just 6 bytes when stored in BRHCODE format, whereas it requires 24 bytes when stored in COLLCODE hex string format. Therefore, this technique should be used only for the fields that are part of the sort criteria. It is not necessary for fields that are not used for sorting.
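The hex string technique can be sketched in C++ as follows. This is only an illustration of the encoding step: the function name toCollationHex is hypothetical, the choice of uppercase digits is an assumption, and the COLLCODE values used with it would come from the plugin, not from this code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Encodes each 16-bit collation code as four uppercase hex digits,
// so that ordinary text comparison of the results follows the
// numeric order of the codes. (Hypothetical helper, not a Baraha
// API function.)
std::string toCollationHex(const std::vector<std::uint16_t>& codes) {
    static const char kDigits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(codes.size() * 4);
    for (std::uint16_t code : codes) {
        out.push_back(kDigits[(code >> 12) & 0xF]);
        out.push_back(kDigits[(code >> 8) & 0xF]);
        out.push_back(kDigits[(code >> 4) & 0xF]);
        out.push_back(kDigits[code & 0xF]);
    }
    assert(out.size() == codes.size() * 4);  // 4 characters per code
    return out;
}
```

A sequence of six codes produces a 24-character string, which matches the storage figure quoted above. Because every code occupies exactly four characters, comparing two such strings character by character gives the same result as comparing the underlying code sequences numerically.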
LATIN:
LATIN is the 8-bit code which, when formatted with the BRH Latin font,
displays latinized Indian language text (ISO 15919).
The Visual Basic sample programs "BarahaConvertVB" and "people_db" and the VC++ sample programs "BarahaConvert" and "BarahaSort" show the usage of the Baraha Script Processing API.
Kannada script processing API reference
Devanagari script processing API reference
Tamil script processing API reference
Telugu script processing API reference
Malayalam script processing API reference
Gujarati script processing API reference
Gurumukhi script processing API reference
Bengali script processing API reference
Oriya script processing API reference
Notes:
The BarahaConvertVB sample uses Forms 2.0 (Fm20.dll) ActiveX Controls for demonstrating UNICODE functionality. You should have these controls installed on your system. See MSDN article Q224305 for more information.
BarahaConvert VC++ sample uses a UNICODE edit window for showing the UNICODE conversion functionality. If you try to run this sample program in Windows 95/98 systems, you may not see the UNICODE window at all. On Windows NT, you will see the UNICODE window, but you may not see the Indian language text properly.
See also: Indian language UNICODE support in Windows operating systems.