Description
Purpose
This task provides you with opportunities to learn supervised machine learning and Python skills (GLO1 & ULO1) and apply your digital literacy to research and develop a machine learning solution (GLO3, GLO5, and ULO2). By completing this task, you will gain knowledge and skills in selecting and applying one or more appropriate supervised machine learning algorithm(s) to develop and evaluate a machine learning solution and present and interpret the outcomes to business clients.
Context/Scenario
VicCrashAnalytics is a fictitious data consulting firm that provides analytics services to governments and other organizations in Australia. The Assignment 1 project involves a consulting contract for the Victorian government’s Department of Transport (DOT). The client wants to understand the factors that contribute to blackspots (also known as accident hotspots). This information will be used to develop effective education campaigns, propose legislative reforms, and potentially design and implement other interventions. You have been provided with a dataset containing information about blackspots, the demographics of the surrounding road segments, and their characteristics. Specifically, the client’s objective is to gain insights from the provided data and predict the risk of blackspots.
The following files are provided:
• Blackspot.csv
• Metadata.csv
You are required to explore this dataset and to develop and test one or more machine learning models using Python. You are also required to report your findings to Mr. Michael Howards, Transport Analytics Manager, VicCrashAnalytics.
Challenge: You have also been provided with a second dataset without labels, Blackspot_Competition.csv. You are invited to deploy your model and apply it to this second dataset. The model with the best performance will win a small prize!
The dataset used in this assignment was developed by Asel Mendis by integrating crash data from the Department of Transport with demographic data from the Australian Bureau of Statistics (ABS). It has since undergone further pre-processing and resampling by the unit team specifically for the purpose of learning. It is therefore important to note that the dataset may not fully represent real-world scenarios, and your insights and conclusions must be justified based on the provided dataset. Your ability to effectively process, analyse, and model the data and to interpret the outcomes will be evaluated as part of the assessment.
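As a starting point only, the workflow described above (train on the labelled data, evaluate on a hold-out split, then apply the model to the unlabelled competition file) might be sketched as follows. This is a minimal sketch under stated assumptions, not a prescribed solution: the label column name `Blackspot` is hypothetical (check Metadata.csv for the real name), pandas and scikit-learn are assumed to be installed, the random forest is just one possible supervised algorithm, and the data are assumed to have no missing values.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hypothetical label column name; confirm the real one in Metadata.csv.
TARGET = "Blackspot"


def train_and_evaluate(df, target=TARGET, seed=42):
    """Fit a baseline classifier and report its hold-out performance."""
    # One-hot encode categorical columns so the model receives numeric inputs.
    X = pd.get_dummies(df.drop(columns=[target]))
    y = df[target]
    # Stratify so both splits keep the same class proportions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    report = classification_report(y_test, model.predict(X_test))
    return model, list(X.columns), report


def predict_unlabeled(model, feature_columns, df_new):
    """Apply the fitted model to a dataset that has no label column."""
    # Align columns with the training features; any training column that is
    # absent from the new data is filled with zeros.
    X_new = pd.get_dummies(df_new).reindex(columns=feature_columns, fill_value=0)
    return model.predict(X_new)
```

A typical use would be to pass `pd.read_csv("Blackspot.csv")` to `train_and_evaluate`, print the returned report, and then call `predict_unlabeled` on `pd.read_csv("Blackspot_Competition.csv")`. Exploration, feature selection, model comparison, and interpretation for the client are deliberately left out of this sketch.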