Abstract
This project demonstrates a simple, pragmatic approach to creating comparable corpora using Cross-Lingual Information Retrieval (CLIR). CLIR research is becoming increasingly important for Information Retrieval (IR) on the Web, which is a truly multilingual environment where CLIR is necessary for global information exchange and knowledge sharing. The aim of this project is to identify the same news story written in multiple languages, a problem known as cross-language news story detection. For example, in a multilingual environment such as India, where the same news story is covered in multiple languages, a reader might want to refer to the local-language version of a story. Such multilingual coverage is also a rich source of both parallel and comparable text. In this work, we follow a corpus-based approach to retrieve the most relevant news stories.