The lemmatization of nouns in African languages with special reference to Sepedi and Cilubà

Original Articles

DOI: 10.1080/02572117.1999.10587405
Author(s): D.J. Prinsloo (Department of African Languages, South Africa), Gilles-Maurice de Schryver (Department of African Languages, South Africa)

Abstract

The aim of this article is to analyze traditional approaches to the lemmatization of nouns on the macrostructural level in African languages against the background of the user perspective, the physical limitations on volume, the consideration of currently available dictionaries and the utilization of a corpus. Five basic lemmatization strategies are given due attention, namely lemmatizing nouns under stems, according to morpho-lexical fields, under both singular and plural forms, solely under singular forms, or finally on first or third letters. It is shown that any modern strategy that aims at avoiding the shortcomings and pitfalls of these five basic types, whilst at the same time exploiting their virtues, will have to give full weight to target users' desires. A means to incorporate those desires is illustrated through a small dictionary project to which the concept of ‘simultaneous feedback’ is applied. In conclusion, four rules of thumb and ten general guidelines are presented.

Published in: South African Journal of African Languages