Staff allotment is a crucial task in the industrial sector, and human resources play a leading role in an industry's success. Data mining offers an analytical method for allotting the right job to the right person. Matching work schedules in a large industrial organisation is not easy: large databases must be maintained for each and every department, so a sound knowledge discovery model is needed to retrieve information about the staff, from which management can build an effective schedule. Data mining approaches are well suited to constructing such work schedules. How data mining techniques and the k-means algorithm can support this task effectively is the theme of this article.
Data mining may be regarded as an evolving approach to data analysis in very large databases that could become a useful tool for management professionals. It involves extracting knowledge from patterns in very large databases, yet it goes beyond simply performing data analysis on large data sets. Organizations that employ thousands of people and track a multitude of employment-related information may find valuable patterns in their databases that provide insights into areas such as employee retention and compensation planning. For staff planning and allotment, the k-means clustering algorithm can be used. K-means is a popular method for cluster analysis in data mining: it aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as the prototype of the cluster. The k-means algorithm can thus group employees into different clusters by nearest mean.
The most distinct characteristic of data mining is that it deals with very large and complex data sets (gigabytes or even terabytes). The data sets to be mined often contain millions of objects described by tens, hundreds or even thousands of various types of attributes or variables (interval, ratio, binary, ordinal, nominal, etc.). This requires the data mining operations and algorithms to be scalable and capable of dealing with different types of attributes.
However, most algorithms currently used in data mining do not scale well when applied to very large data sets, because they were initially developed for applications other than data mining that involve small data sets. In terms of clustering, we are interested in algorithms that can efficiently cluster large data sets containing both numeric and categorical values, because such data sets are frequently encountered in data mining applications; many existing methods can handle large data sets efficiently but are limited to numeric attributes. K-means is an unsupervised analysis method that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean.
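As a concrete illustration of this definition, the following is a minimal pure-Python k-means sketch; the point values, the choice of k, and the fixed random seed are illustrative assumptions, not data from this paper:

```python
import random

def k_means(points, k, iters=100, seed=0):
    """Minimal k-means: assign each point to its nearest mean, then
    recompute each mean as its cluster's centroid, until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each observation joins the cluster whose mean is nearest.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Update step: each mean moves to the centroid of its cluster
        # (an empty cluster keeps its old mean).
        new = [tuple(sum(m[d] for m in ms) / len(ms) for d in range(len(ms[0])))
               if ms else centroids[c]
               for c, ms in enumerate(clusters)]
        if new == centroids:  # means no longer move: converged
            break
        centroids = new
    return centroids, clusters

# Two well-separated groups of hypothetical "employee feature vectors"
# split cleanly into two clusters of three points each:
cents, groups = k_means([(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)], k=2)
```

Because the two groups are well separated, any choice of initial centroids converges to the same partition here; real employee data would not be this clean.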
The term "k-means" was first used by James MacQueen in 1967. The standard algorithm was first proposed by Stuart Lloyd in 1957 as a technique for pulse-code modulation at Bell Labs, though it was not published outside the company until 1982. A more efficient version was proposed and published in Fortran by Hartigan and Wong in 1975/1979.
More recent scalable clustering algorithms include CLARANS, DBSCAN (Ester et al., 1996) and BIRCH (Zhang et al., 1996). These algorithms are often revisions of existing clustering methods. By using carefully designed search methods (e.g., randomised search in CLARANS), organising structures (e.g., the CF tree in BIRCH) and indices (e.g., the R*-tree in DBSCAN), they have shown significant performance improvements in clustering very large data sets. However, these algorithms still target numeric data and cannot be used to solve massive categorical data clustering problems.
This work examines the need for research on applying k-means to this problem; the algorithms, objectives and methodologies are stated broadly in this research. This paper explains knowledge extraction with k-means in a broader way. The previous system had no clear idea of the clusters, of key management, or of the origin of the seeds (employee details). The key is essential: it is the mean of the clusters, and using it we can easily form the clusters.
Need for the proposed work:
Searching for relevant records, or similar-data search, is one of the most popular database functions for obtaining knowledge. Certain similar records should fall into one category or form one cluster. Query redirection is a good approach for retrieving data from different databases on different servers.
K-means suits the right manpower to the right job:
The k-means algorithm is well known for its efficiency in clustering large data sets. However, because it works only on numeric values, it cannot be used directly to cluster real-world data containing categorical values. Here k-means is used to categorise the employee data set into different top-level clusters: whole employees, section employees, permanent employees, and CAPs (Company Apprentice Programmer trainees). Using these clusters, the machine can run without stopping: every production system has to be operated a full day without interruption. The section employees are permanent employees of the company who are well trained to operate the system. Sometimes there is a lack of trained people, or they go on holiday; a CAP should then fill that gap in place of a trained employee. K-means clustering is used to make this allotment.
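Since k-means needs numeric input, one common workaround (an assumption sketched here, not a method described in this paper) is to one-hot encode each employee's categorical machine skills into a binary vector before clustering; the skill and employee names below are illustrative:

```python
# Fixed vocabulary of machine skills (illustrative names).
SKILLS = ["lathe", "press", "cnc"]

def encode(employee_skills):
    """Turn a set of skill names into a binary feature vector
    suitable as a k-means observation."""
    return tuple(1.0 if s in employee_skills else 0.0 for s in SKILLS)

workers = {
    "perm_1": {"lathe", "cnc"},
    "perm_2": {"lathe"},
    "cap_1": {"press"},
}
vectors = {name: encode(sk) for name, sk in workers.items()}
# vectors["perm_1"] → (1.0, 0.0, 1.0)
```

Each employee then becomes a point in a numeric space, so the nearest-mean assignment applies; richer categorical encodings (e.g., k-modes) are an alternative when skills are many.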
[Figure: flowchart of the system. The company employee database feeds the section employees and permanent system operators, who keep the machine in progress; if permanent employees (PES) are not available, the k-means algorithm assigns CAPs; if CAPs are not available, TOT.]
Figure shows the progress of the system.
Extraction of the right employee for the right job using the k-means clustering algorithm.
In this approach we define k sets, one for each cluster, with k ≤ n. The next step is to organise the data into appropriate data sets and associate each item with its nearest set, then recalculate the k new sets. This loop is repeated until the right solution is obtained.
1. Place the initial actual number of employees as one cluster under one mean, the k-mean.
2. Place the permanent employees in another cluster using the same k-mean as in the previous step; that is, they have the same or a near-identical mean. Name this cluster the P-set.
3. Place the CAPs (Company Apprentice Programmers) in a new cluster with different data; name it the C-set. Then re-form the CAP cluster against the permanent-employee k-mean: the manpower who know the particular job are picked out of the cluster and form a new cluster.
4. A loop is generated over the clusters.
5. Using first-in-first-out, the P-set data are allotted first, meaning the permanent employees are assigned to the job. When the actual number cannot be satisfied, the next CAPs are allotted.
6. This procedure is repeated until the clusters converge.
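Steps 5 and 6 above can be sketched as a first-in-first-out allotment; the pool names and the required headcount are illustrative assumptions, not data from this paper:

```python
from collections import deque

# Hypothetical employee pools after clustering.
p_set = deque(["perm_1", "perm_2", "perm_3"])  # permanent operators (P-set)
c_set = deque(["cap_1", "cap_2"])              # CAP trainees (C-set)

def allot(need, permanent, caps):
    """Allot operators first-in-first-out: permanent employees first,
    then CAPs fill any remaining gap."""
    allotted = []
    while need and permanent:
        allotted.append(permanent.popleft())
        need -= 1
    while need and caps:
        allotted.append(caps.popleft())
        need -= 1
    return allotted, need  # need > 0 means the shift is still short-staffed

# A shift needing 4 operators takes all 3 permanent employees plus 1 CAP.
roster, shortfall = allot(4, p_set, c_set)
# roster → ['perm_1', 'perm_2', 'perm_3', 'cap_1'], shortfall → 0
```

A `deque` gives O(1) `popleft`, which matches the FIFO discipline the steps describe.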
CAP set ∈ P-set, then merge or join
|CAPS| ∪ |Regular|
CAPs_i < Regular_j
Explanation of the Algorithm:
Here we first form a set or cluster around a particular point, the k-mean. First, group all employees of the section. Then form another cluster for the permanent employees, for example grouping the permanent employees who belong to a particular section by how efficiently they can run a particular program on the machine. Next, form a cluster for the CAPs, and re-form the CAP group using the same k-mean as for the permanent employees, i.e., those who know how to run a program on a particular machine. Join or merge this group into the P-set, and continue the loop according to the actual need of the section. Using first-in-first-out, the permanent employees are allotted to the job first; then the CAPs are allotted.

The following table describes the overall need of the proposed work. First we form a rough cluster, or an approximate number of seeds (how many people are needed for the job). Then we divide them into clusters for the job. There may be a gap between the need and the regular-employee cluster, so CAPs are needed to fill the gap. The cluster is then re-formed together with the CAPs cluster: find the centroid of the cluster, re-form the cluster until it matches the need of the plan, and repeat the process until convergence.
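The centroid mentioned above is just the component-wise mean of the cluster's feature vectors; a small sketch (the vectors are illustrative, not employee data from the paper):

```python
def centroid(cluster):
    """Component-wise mean of a cluster of equal-length numeric vectors:
    this is the k-mean that a re-formed cluster is matched against."""
    dim = len(cluster[0])
    n = len(cluster)
    return tuple(sum(v[d] for v in cluster) / n for d in range(dim))

# e.g. three operators' 2-feature vectors
centroid([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)])  # → (3.0, 4.0)
```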
[Table: FIG - MFG manpower status @ 11-Jul-15, after 14 days.]
K-means is one of the simplest algorithms, and this paper tries to address a correspondingly simple problem: manpower allotment within the industrial sector. The number of clusters can be specified as an input to the algorithm, so the procedure is straightforward within a single sector. For large-scale employee selection, however, this approach is not well suited, because there is no single right number of clusters: the cluster mean can change with employee selection criteria such as experience and references, so employees cannot always be fitted into a particular cluster.
- Anderberg, M. R. Conceptual Problems in Cluster Analysis. 1973; 10-24.
- Ball, G. H., Hall, D. J. A clustering technique for summarizing multivariate data. 1967; 153-155.
- Bezdek, J. C. Objective Function Clustering. 1981; 43-93.
- Bobrowski, L., Bezdek, J. C. c-means clustering with the l1 and l∞ norms. 1991; 545-554.
- Cormack, R. M. A Review of Classification. 1971.
- Dubes, R. C. How many clusters are best? An experiment. 1987; 645-663.
- Dubes, R., Jain, A. K. Validity studies in clustering methodologies. 1979; 235-254.
- Xu, X., Ester, M., Kriegel, H.-P., Sander, J. A distribution-based clustering algorithm for mining in large spatial databases.
- Waddington, D. Industrial Chemistry, by R. W. Thomas and P. Farago. Heinemann Educational Books Ltd., 1974.
- Fisher, D. H. Knowledge acquisition via incremental conceptual clustering. 1987; 139-172.
- Gowda, K. C., Diday, E. Symbolic clustering using a new dissimilarity measure. 1991; 567-578.
- Gower, J. C. A General Coefficient of Similarity and Some of Its Properties. 1971.
- Li, J., Gao, X., Jiao, L. A CSA-based clustering algorithm for large data sets with mixed numeric and categorical values.
- Jollois, F.-X., Nadif, M. Clustering Large Categorical Data. 2002; 257-263.
- Perna, J. Message from the General Manager, Data Management Solutions, IBM Software Group. 2002; 1.
- Sarle, W. S. Algorithms for Clustering Data. 1990; 227-229.
- Kaufman, L., Rousseeuw, P. J. Finding Groups in Data. 1990.
- Klösgen, W. Knowledge discovery in databases and data mining. 1996; 623-632.
- Ziegel, E. R., Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R. Advances in Knowledge Discovery and Data Mining. 1998.
- Kodratoff, Y., Tecuci, G. Learning based on conceptual distance. 1988; 897-909.
- Lebowitz, M. Experiments with incremental concept formation: UNIMEM. 1987; 103-138.
- Yeh, H. C. Some Properties of the Homogeneous Multivariate Pareto (IV) Distribution. 1994; 46-53.