
BSCS Honors Program

CS-402

GIFT University Gujranwala

Course: Data Mining


Resource Person: Nadeem Qaisar Mehmood

14-November-2012

(Fall 2012)

ASSIGNMENT 1 (Data)

Total Points: 30 Submission Due: Monday 19th November, 2012

Instructions: Please Read Carefully!



This is a group assignment. A group may have at most 3 members.

Each group member must pass the viva for this assignment to get marks. The viva will be conducted after the assignment is submitted. Write in your own words. You may help each other and consult the reference books, but marking will be done only after the viva. You will be notified by email when to come for the viva.
Submit this assignment as a single .zip file containing all the source files of your implementation. The zip file must be named CS402-AS01-(ROLLNUMBER1)(ROLLNUMBER2).zip and nothing else!

The assignment is to be submitted electronically via email to nadeemqaisar@gift.edu.pk by Monday 19th November, 2012.
a. The subject of the email should be: CS402-AS01-(ROLLNUMBER1) (ROLLNUMBER2) and nothing else!
b. Attach the zip file to the email.
c. Keep the body of the email empty.
d. Send a copy of your email to your other group members.
There will be a 30% penalty for late submissions. No assignment will be accepted after Monday 20th November, 2012 16:00 hrs.

NOTE 01: Any unauthorized alteration of the data set or a badly compiled report will result in a straightforward zero for the assignment.
NOTE 02: You must pass the subsequent viva of this assignment to receive any marks for it.

Data Preparation
Introduction to the data: The data consist of evaluations of teaching performance over three regular semesters and two summer semesters of teaching assistant (TA) assignments.

You have to explore the data set provided with this assignment and apply the following approaches:
Data Selection
Preprocessing
o Remove noise, outliers, missing values
o Select features, reduce dimensions
Also try to use the following summary statistics to describe the data:
o Mean, median, standard deviation, measures of mean and variation, standard distribution curve, correlation, proximity measurement, frequency curve, percentile
Also discuss how this data can be used for data mining and for what kinds of problems it can be used to perform modeling.
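
The preprocessing and summary-statistics steps listed above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a prescribed method: the column meanings (class size, evaluation score) and the sample rows are made up for the example and are not taken from the provided data set, and the 1.5 x IQR rule is just one common convention for flagging outliers.

```python
import math
import statistics


def drop_missing(rows):
    """Keep only rows in which every value is present (not None)."""
    return [row for row in rows if all(v is not None for v in row)]


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "q1": q1,  # 25th percentile
        "q3": q3,  # 75th percentile
    }


def iqr_outliers(values):
    """Flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (one common rule)."""
    stats = summarize(values)
    iqr = stats["q3"] - stats["q1"]
    low = stats["q1"] - 1.5 * iqr
    high = stats["q3"] + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up rows for illustration only: (class_size, score).
rows = [(19, 3), (17, 3), (49, 2), (None, 1), (33, 2), (55, 1), (20, 3)]
clean = drop_missing(rows)
sizes = [r[0] for r in clean]
scores = [r[1] for r in clean]

print(summarize(sizes))
print(iqr_outliers(sizes))
print(pearson(sizes, scores))
```

The same quantities are available in Excel through built-in functions such as AVERAGE, MEDIAN, STDEV, PERCENTILE and CORREL, so the sketch is only one of several ways to carry out the exploration.
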
Some Information about the Data
Make sure that you understand the data and that the report reflects correct and concise information about it. The report must be a 3-to-4 page MS Word document that describes the characteristics of the data.
Tools: You are allowed to use any statistical tool. However, since the assignment is meant to revise the basic concepts that lead to data exploration and information discovery, you are encouraged to use Excel for such exploration.

END OF ASSIGNMENT
