EBUS537 Data Mining and Machine Learning Assignment 1 | University of Liverpool

Category: Assignment
Subject: Computer Science
University: University of Liverpool
Module Title: EBUS537 Data Mining and Machine Learning

Assignment Requirements:

Questions

You are given a training dataset and a testing dataset, which will be provided in electronic form in Canvas (“myCarTrainDataset_2024.csv” and “myCarTestDataset_2024.csv”). For simplicity, neither dataset requires data cleaning. Each data object is a record of a car.

The attributes of the cars in the datasets are described below:

“price”: purchasing price
“doors”: number of doors
“persons”: car capacity to accommodate persons
“boot”: the size of luggage boot
“accept”: car acceptability. This attribute is treated as the class label.
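
As a rough illustration of how records with these five attributes might be inspected, the sketch below counts class labels overall and per attribute value. It uses only the Python standard library. The sample rows, the category names (such as "low" or "small") and the class labels "acc"/"unacc" are invented for illustration; the real files may use different values. The helper names `load_rows`, `class_counts` and `crosstab` are likewise hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical rows for illustration only; the real values live in
# myCarTrainDataset_2024.csv and may use different category names.
SAMPLE_CSV = """price,doors,persons,boot,accept
low,2,2,small,unacc
low,4,4,big,acc
med,4,4,med,acc
high,2,2,small,unacc
high,4,more,big,acc
med,2,2,big,unacc
"""

def load_rows(handle):
    """Read car records from an open CSV file into a list of dicts."""
    return list(csv.DictReader(handle))

def class_counts(rows, label="accept"):
    """Count how many records fall under each class label."""
    return Counter(row[label] for row in rows)

def crosstab(rows, attribute, label="accept"):
    """Count class labels separately for each value of one attribute."""
    table = defaultdict(Counter)
    for row in rows:
        table[row[attribute]][row[label]] += 1
    return {value: dict(counts) for value, counts in table.items()}

rows = load_rows(io.StringIO(SAMPLE_CSV))
print(class_counts(rows))
for attribute in ("price", "doors", "persons", "boot"):
    print(attribute, crosstab(rows, attribute))
```

To run this against the actual training file, `load_rows` could be given a handle from `open("myCarTrainDataset_2024.csv", newline="")` instead of the in-memory sample, assuming the file has a header row with these column names.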

You are required to:

Using the training dataset, conduct relevant data exploration to obtain initial insights into the dataset, with appropriate explanations and interpretations.
Using the training dataset, apply Hunt’s Algorithm together with the Greedy Strategy, using the Gini impurity measure, to build a fully-grown decision tree that predicts whether a car is acceptable or not. If an attribute has multiple values, you are required to use a multiway split (do not use a binary split). Each leaf node should be declared as a single class label by applying the voting system explained during the lectures (do not use probabilities/fractions). The selection of each splitting attribute should be explicitly explained and justified using the calculated results for the entire tree. Sample calculation processes and explanations should be provided as appropriate.
After building the fully-grown decision tree in the previous step, post-prune any sub-tree whose leaf nodes all have the same class label, if applicable. Test the post-pruned decision tree using the test dataset and produce the confusion matrix. Interpret the obtained results in the case context.
Beyond the above context, identify a case study of applying decision tree-based classification methods in practice. Discuss the identified case study in relation to the CRISP-DM model. Support your arguments with relevant references.
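
The calculations these tasks call for can be sketched in a few lines of Python. The sketch below is not the lecture's reference method. It shows the standard Gini impurity, 1 - sum(p_i^2), the weighted Gini of a multiway split, a greedy choice of the lowest-Gini attribute, majority voting for a leaf, and a confusion matrix as counts of (actual, predicted) pairs. The toy rows, the class labels "acc"/"unacc" and all function names are assumptions made for illustration. The tie-breaking rule in `majority_label` (first label seen wins) is also an assumption and may differ from the voting system taught in the lectures.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_i^2)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def multiway_split_gini(rows, attribute, label="accept"):
    """Weighted Gini after splitting into one child per attribute value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[label])
    total = len(rows)
    return sum(len(g) / total * gini(g) for g in groups.values())

def best_attribute(rows, attributes, label="accept"):
    """Greedy step: pick the attribute with the lowest weighted Gini."""
    return min(attributes, key=lambda a: multiway_split_gini(rows, a, label))

def majority_label(labels):
    """Leaf label by voting; ties go to the label seen first (assumption)."""
    return Counter(labels).most_common(1)[0][0]

def confusion_matrix(actual, predicted):
    """Count each (actual, predicted) pair."""
    return Counter(zip(actual, predicted))

# Hypothetical toy records; not taken from the assignment datasets.
toy_rows = [
    {"price": "low", "persons": "2", "accept": "unacc"},
    {"price": "low", "persons": "4", "accept": "acc"},
    {"price": "med", "persons": "4", "accept": "acc"},
    {"price": "high", "persons": "2", "accept": "unacc"},
    {"price": "high", "persons": "4", "accept": "unacc"},
    {"price": "med", "persons": "4", "accept": "acc"},
]

for attribute in ("price", "persons"):
    print(attribute, round(multiway_split_gini(toy_rows, attribute), 4))
print("split on:", best_attribute(toy_rows, ["price", "persons"]))
print(confusion_matrix(["acc", "acc", "unacc"], ["acc", "unacc", "unacc"]))
```

In a full tree, `best_attribute` would be applied recursively to each child's subset of records until a node is pure or no attributes remain, at which point `majority_label` would assign the leaf's class.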
