Abstract
In the recent past, radical and extremist groups and users have been found to use the web as a tool to carry out various kinds of mischievous acts with concealed agendas and to promote their ideologies in a sophisticated manner. Some web forums are used specifically for open discussion of critical issues influenced by radical thought. We propose an application of collocation theory to identify radically influential users in web forums. The radicalness of a user is captured by a measure based on the degree of match between the user's commented posts and a threat list. Experiments are conducted on a standard data set to find radical and infectious threads, members, postings, ideas, and ideologies. The proposed system ranks users using text-based and image-based similarity measures. Its key contribution is an application that analyzes both text data and image data. Text data passes through preprocessing stages such as stop word removal and suffix removal; a cosine similarity function then measures its similarity to the threat list, and the result is used to decide whether the user is radical. For image data, any text contained in the image is separated from it by an OCR technique and sent to the text analysis stage, while the image itself goes through image preprocessing such as image filtering and EHD to extract aggregate features, and similarity measures are used to compare it with the training data set. Finally, after the radicalness of each user has been measured, the users are ranked by the PageRank algorithm.
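The text branch of the pipeline described above can be illustrated with a minimal Python sketch. It is not the system's implementation: the stop word list, the suffix list, the sample posts, the threat list, and the reply graph are all hypothetical, and the abstract does not say how radicalness scores enter the PageRank step. The sketch assumes, purely for illustration, that they are used as teleport weights over a user reply graph. The image branch (OCR, image filtering, EHD) is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stop word and suffix lists; a real system would use fuller ones.
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in", "we", "will"}
SUFFIXES = ("ing", "ed", "es", "s")

def preprocess(text):
    """Lowercase, tokenize, drop stop words, and strip one simple suffix."""
    stems = []
    for tok in re.findall(r"[a-z]+", text.lower()):
        if tok in STOP_WORDS:
            continue
        for suf in SUFFIXES:
            # Keep at least three characters of the stem.
            if tok.endswith(suf) and len(tok) - len(suf) >= 3:
                tok = tok[: -len(suf)]
                break
        stems.append(tok)
    return stems

def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two term-frequency vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def radicalness(posts, threat_list):
    """Mean similarity of a user's posts to the threat list (an assumed measure)."""
    if not posts:
        return 0.0
    threat_tokens = preprocess(" ".join(threat_list))
    return sum(cosine_similarity(preprocess(p), threat_tokens) for p in posts) / len(posts)

def pagerank(links, teleport, damping=0.85, iterations=50):
    """Power-iteration PageRank; `teleport` weights bias the random jump (assumption)."""
    nodes = list(teleport)
    total = sum(teleport.values())
    tele = {n: (teleport[n] / total if total else 1.0 / len(nodes)) for n in nodes}
    rank = dict(tele)
    for _ in range(iterations):
        new = {n: (1.0 - damping) * tele[n] for n in nodes}
        for n in nodes:
            targets = links.get(n, [])
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling user: redistribute its rank by the teleport weights.
                for m in nodes:
                    new[m] += damping * rank[n] * tele[m]
        rank = new
    return rank

# Hypothetical data: posts per user, a threat list, and who replies to whom.
threat_list = ["attack", "bomb", "target"]
posts = {
    "a": ["We will attack the targets"],
    "b": ["Nice weather today"],
    "c": ["bombing attacks planned"],
}
reply_graph = {"a": ["c"], "b": ["a"], "c": ["a"]}

scores = {user: radicalness(p, threat_list) for user, p in posts.items()}
ranks = pagerank(reply_graph, scores)
ranking = sorted(ranks, key=ranks.get, reverse=True)
print(scores)
print(ranking)
```

Using the radicalness scores as teleport weights is only one way to combine the two steps; scoring users first and running an unweighted PageRank over a graph of radical users would be another reading of the abstract.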