A Survey of Multilingual Document Clustering

Authors

Mrs. Kavita Moholkar

Abstract

The number of multilingual documents generated on the internet is increasing day by day. Multilingual document clustering (MDC) is a technique for classifying documents written in different languages. Classifying documents in languages that lack a labeled training data set is a major challenge. The two major approaches used to date are machine translation of the documents before classification and the use of bilingual dictionaries to translate trained classification models effectively. This paper surveys various MDC challenges and techniques. The major focus is on the problem of translating documents and classifying them semantically.

Article Details

Published

2017-04-10

Section

Articles

How to Cite

Moholkar, K. (2017). A Survey of Multilingual Document Clustering. International Journal of Engineering and Computer Science, 6(4). https://ijecs.in/index.php/ijecs/article/view/3633