Modification of KNN Algorithm
K-nearest neighbor (KNN) classification is one of the most fundamental and simplest classification methods, and it is among the most frequently used algorithms when there is little or no prior knowledge about the distribution of the data. This paper proposes a modification of KNN whose main idea is to use a set of robust neighbors in the training data; the modified method improves on traditional KNN in both robustness and performance. Inspired by the traditional algorithm, it classifies an input query according to the most frequent label among the labels of its nearest neighbors, with the neighbor closest to the query having the largest say. The proposed method can thus be viewed as a kind of weighted KNN, in which the query label is approximated by weighting the votes of the query's neighbors: for each label, the procedure computes the fraction of neighbors carrying that label, multiplied by a factor inversely proportional to the distance between the query and the neighbor. The proposed method is evaluated on several standard UCI data sets, and the experiments show a significant improvement over the standard KNN method.
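The distance-weighted voting described above can be sketched as follows. This is a minimal illustration of the general idea, not the paper's exact procedure; the function name, the Euclidean metric, and the small `eps` term that guards against division by zero are our own assumptions.

```python
import numpy as np
from collections import defaultdict

def weighted_knn_predict(X_train, y_train, query, k=5, eps=1e-8):
    """Classify `query` by a distance-weighted vote among its k nearest neighbors.

    Each neighbor contributes a vote for its own label, weighted inversely
    proportionally to its distance from the query, so that closer neighbors
    have a larger say. (Illustrative sketch; names are not from the paper.)
    """
    # Euclidean distances from the query to every training point
    dists = np.linalg.norm(X_train - query, axis=1)
    # Indices of the k nearest neighbors
    nearest = np.argsort(dists)[:k]
    # Accumulate inverse-distance-weighted votes per label
    votes = defaultdict(float)
    for i in nearest:
        votes[y_train[i]] += 1.0 / (dists[i] + eps)
    # Return the label with the largest total weight
    return max(votes, key=votes.get)
```

For example, with three training points near the origin labeled `"a"` and three points near (5, 5) labeled `"b"`, a query at (0.5, 0.5) is assigned `"a"` because its closest neighbors dominate the weighted vote.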
Hamid Parvin, Hosein Alizadeh and Behrouz Minaei-Bidgoli, A Modification on K-Nearest Neighbor Classifier, GJCST, Vol. 10, Issue 14 (Ver. 1.0), November 2010.

UCI data sets used:
Default of Credit Card Clients Data Set
http://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients

Soybean (Large) Data Set
http://archive.ics.uci.edu/ml/datasets/Soybean+%28Large%29

Teaching Assistant Evaluation Data Set
http://archive.ics.uci.edu/ml/datasets/Teaching+Assistant+Evaluation

Iris Data Set
http://archive.ics.uci.edu/ml/datasets/Iris