Data Mining Project Part 2

Continue your data mining adventure by building some classification models.

1. Continue with your original dataset. Make any changes that were suggested, and either link to the original document or add to it. All information from previous portions of the project should remain accessible.

2. Using technology, classify your data on one of your categorical variables.

       (a) Use a simple decision tree to predict one categorical variable from the others. Create a visualization of the decision tree. Make sure to produce a confusion matrix.

       (b) Repeat your decision tree, but use a cross-validation technique to test the accuracy. Examine variable importance and be certain to comment on the most important variables.

       (c) Examine the importance of each feature using a chi-square statistic or gain ratio. Create a visualization. Does this agree with what your decision trees showed?
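
If you happen to be working in Python, items 2(a) through 2(c) could be approached roughly as sketched below. This is only one possible workflow, not a required one. It assumes scikit-learn is installed and uses the bundled iris data purely as a stand-in for your own dataset. The depth limit, 70/30 split, and five folds are illustrative choices. Only the chi-square option from 2(c) is shown; gain ratio is not covered.

```python
# Sketch only: the iris data is a placeholder for your own dataset.
from sklearn.datasets import load_iris
from sklearn.feature_selection import chi2
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
X, y = data.data, data.target
feature_names = list(data.feature_names)

# 2(a): fit a small tree on a training split, then evaluate on held-out rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
print(export_text(tree, feature_names=feature_names))  # text view of the tree

cm = confusion_matrix(y_test, tree.predict(X_test))
print(cm)  # rows = actual class, columns = predicted class

# 2(b): 5-fold cross-validated accuracy, plus impurity-based importances
# taken from the tree fitted above.
cv_scores = cross_val_score(
    DecisionTreeClassifier(max_depth=3, random_state=0), X, y, cv=5
)
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))
importances = dict(zip(feature_names, tree.feature_importances_))

# 2(c): chi-square score of each feature against the class label.
# chi2 requires non-negative feature values.
chi2_scores, p_values = chi2(X, y)
ranked = sorted(zip(feature_names, chi2_scores), key=lambda pair: -pair[1])
for name, score in ranked:
    print(f"{name}: chi2={score:.1f}, tree importance={importances[name]:.3f}")
```

For a graphical tree rather than the text view, `sklearn.tree.plot_tree` draws the fitted tree with matplotlib. scikit-learn's `chi2` is intended for non-negative features such as counts or indicator variables, so a continuous feature is usually better binned or encoded before scoring.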

3. Write your report!

       (a) Include all items requested above. Include graphs and text about each.

       (b) Discuss the cross-validation process chosen. Discuss whether each model is overfit and how you might tell.

       (c) Discuss the confusion matrix and what it might mean for making certain predictions in your project.
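
One way to gather evidence for the overfitting and confusion matrix discussions in 3(b) and 3(c) is sketched below. It again assumes Python with scikit-learn and uses iris as placeholder data. The depths tried and the choice of five folds are arbitrary. A training accuracy well above the validation accuracy is a common sign of overfitting. Per-class recall and precision show which classes the model predicts least reliably.

```python
# Sketch only: the iris data is a placeholder for your own dataset.
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict, cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 3(b): compare training accuracy with validation accuracy as the tree grows.
gaps = {}
for depth in (1, 2, 3, 5, None):  # None lets the tree grow without a depth limit
    result = cross_validate(
        DecisionTreeClassifier(max_depth=depth, random_state=0),
        X, y, cv=5, return_train_score=True,
    )
    train_acc = result["train_score"].mean()
    valid_acc = result["test_score"].mean()
    gaps[depth] = train_acc - valid_acc
    print(f"max_depth={depth}: train={train_acc:.3f}, "
          f"validation={valid_acc:.3f}, gap={gaps[depth]:.3f}")

# 3(c): confusion matrix built from out-of-fold predictions.
pred = cross_val_predict(
    DecisionTreeClassifier(max_depth=3, random_state=0), X, y, cv=5
)
cm = confusion_matrix(y, pred)  # rows = actual class, columns = predicted class

# Recall: share of each actual class that was predicted correctly.
recall = cm.diagonal() / cm.sum(axis=1)
# Precision: share of each predicted class that was correct
# (undefined for a class the model never predicts).
precision = cm.diagonal() / cm.sum(axis=0)
print("recall per class:", recall.round(3))
print("precision per class:", precision.round(3))
```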

The report will be graded by the following criteria:

• Statistical analysis – 30 points. The statistical tests are all provided.

• Graphical representations – 30 points. The requested graphical displays are made and included in the report.

• Continuation – 15 points. The report is a continuation of the previous report. This may take the form of links to Part 1 or additions to it. In either case, the introduction to your data should be available and any necessary fixes made.

• Interpretations – 15 points. The results of the statistical analysis are clearly explained and interpreted in the context of the problem. The conclusions accurately reflect the analysis and are well supported.

• Writing quality – 10 points. The paper is readable and clearly written. There are few, if any, grammatical or spelling errors, and they do not interfere with the clarity of the paper. The numbering in this document is not used in the report in any way.
