ENG335 Prepare a new dataset by excluding the “date and day” and “year” attributes: Machine Learning Assignment, SUSS, Singapore

Question 1
Download the Lyft Inc dataset from Kaggle (https://www.kaggle.com/datasets/dermisfit/lyft-inc-dataset). Understand the dataset by performing exploratory analysis. Prepare a new dataset by excluding the “date and day” and “year” attributes. You also need to drop TWO (02) more attributes; if you do not exclude these two attributes, you will get a perfect/ideal estimator. Design a linear regression model to estimate the bike demand using only the FOUR (04) best attributes from the newly constructed dataset. Discuss your results and the relevant metrics. If you include all the features of the new dataset, does that give a better model? Would you use the model that employs all the features to predict the bike demand?
(20 marks)
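
One possible way to shortlist the FOUR (04) best attributes is to rank each candidate by the absolute value of its Pearson correlation with the demand column. The sketch below uses only the Python standard library and synthetic stand-in data. The column names `feat_0` to `feat_5` and the way the target is generated are assumptions made for illustration, not properties of the Kaggle dataset, and correlation ranking is only one of several reasonable selection criteria.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def top_k_features(columns, target, k=4):
    """Return the k column names with the largest |correlation| to the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic stand-in data (assumption): six numeric attributes, 500 rows.
rng = random.Random(0)
n_rows = 500
columns = {
    f"feat_{i}": [rng.gauss(0, 1) for _ in range(n_rows)] for i in range(6)
}

# Hypothetical target: driven by the first four attributes plus a little noise,
# so feat_4 and feat_5 carry no information about it.
target = [
    sum(columns[f"feat_{i}"][row] for i in range(4)) + rng.gauss(0, 0.1)
    for row in range(n_rows)
]

best_four = top_k_features(columns, target, k=4)
print(best_four)
```

With the real data, `columns` would hold the attributes that remain after the exclusions, and `target` would be the demand column. The same ranking can then be compared against a model fitted on all remaining features.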

Question 2
Load the Wrestling World Tournament dataset from Kaggle (https://www.kaggle.com/datasets/julienjta/wrestling-world-tournament). The objective is to detect the gender of the wrestler given the other parameters. Perform exploratory data analysis. Analyze and drop the appropriate features and suitably encode the categorical features. Design a simple neural network classifier with ONE (01) hidden layer. Construct the Naïve Bayes classifier for the above problem. Adjust the parameters of the neural network algorithm such that it has the same or better performance than the Naïve Bayes classifier.
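
The Naïve Bayes classifier sets the baseline accuracy that the neural network has to match. As a rough illustration of what that baseline computes, here is a minimal Gaussian Naïve Bayes sketch in plain Python. The two numeric features and the class labels 0 and 1 are synthetic assumptions, not columns from the wrestling dataset, and a library implementation would normally be used in practice.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the log prior and per-feature mean/variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        feature_columns = list(zip(*members))
        means = [sum(col) / n for col in feature_columns]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(feature_columns, means)
        ]
        model[label] = (math.log(n / len(rows)), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)


def make_samples(rng, n, centre_a, centre_b, label):
    """Synthetic samples (assumption): two Gaussian features per class."""
    return [([rng.gauss(centre_a, 5), rng.gauss(centre_b, 5)], label) for _ in range(n)]


rng = random.Random(1)
train = make_samples(rng, 200, 165, 65, 0) + make_samples(rng, 200, 185, 95, 1)
test = make_samples(rng, 100, 165, 65, 0) + make_samples(rng, 100, 185, 95, 1)

model = fit_gaussian_nb([x for x, _ in train], [y for _, y in train])
correct = sum(predict_gaussian_nb(model, x) == y for x, y in test)
baseline_accuracy = correct / len(test)
print(f"Naive Bayes baseline accuracy: {baseline_accuracy:.3f}")
```

The value of `baseline_accuracy` obtained on the real test split is the figure to beat when tuning the hidden-layer size, learning rate and number of epochs of the neural network.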

Question 3
Download the Sloan Digital Sky Survey DR16 dataset available on Kaggle (https://www.kaggle.com/datasets/muhakabartay/sloan-digital-sky-survey-dr16). Prepare the dataset by dropping the features [‘objid’, ‘run’, ‘rerun’, ‘camcol’, ‘plate’, ‘field’, ‘mjd’, ‘fiberid’, ‘specobjid’, ‘redshift’] and perform exploratory data analysis. Propose optimal values for the depth and the number of trees in the random forest.
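
One common way to propose values for the depth and the number of trees is an exhaustive grid search over candidate pairs. The sketch below is library-agnostic: `score_fn` is a hypothetical callable that, in a real run, would train a random forest with the given settings and return a validation or cross-validated score. The `toy_score` function is only a placeholder so that the example executes; its values are not results from the SDSS data.

```python
from itertools import product


def grid_search(score_fn, depths, tree_counts):
    """Score every (depth, number of trees) pair and return the best one."""
    results = {
        (depth, n_trees): score_fn(depth, n_trees)
        for depth, n_trees in product(depths, tree_counts)
    }
    best = max(results, key=results.get)
    return best, results


def toy_score(max_depth, n_trees):
    """Placeholder score (assumption): peaks at depth 8 and 200 trees by construction."""
    return 1.0 - 0.01 * abs(max_depth - 8) - 0.0005 * abs(n_trees - 200)


best_params, all_scores = grid_search(
    toy_score,
    depths=[2, 4, 8, 16],
    tree_counts=[50, 100, 200, 400],
)
print("best (depth, trees):", best_params)
```

Keeping the full `all_scores` table, rather than only the winner, makes it easier to see whether the score has flattened out, in which case a shallower or smaller forest may be the more defensible proposal.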

Question 4
Use the cat vs rabbit dataset available on Kaggle (https://www.kaggle.com/datasets/muniryadi/cat-vs-rabbit). You can use example code (from Kaggle or other resources) to download and load the data properly into the programming environment. Perform exploratory data analysis and show a random sample of SIX (06) images each for the cat and the rabbit. Design a CNN with TWO (02) convolutional layers and THREE (03) dense layers (including the final output layer). Employ ‘tanh’ activation and MaxPooling. Keep 18% of the training dataset for validation and use at least 10 epochs. Note: Use the data in the train-cat-rabbit folder to create your training and validation datasets. Use the data in val-cat-rabbit as your test dataset to rate the performance of the algorithm.
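
The 18% validation hold-out and the six-image preview per class can both be prepared from a list of file paths before any image is loaded. The sketch below shows one way to do this with the standard library. The file names and the `cat`/`rabbit` subfolder layout are hypothetical placeholders, and the seed value is an arbitrary choice.

```python
import random


def split_train_validation(paths, validation_fraction=0.18, seed=42):
    """Shuffle the paths and hold out a fraction of them for validation."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def sample_per_class(paths_by_class, k=6, seed=42):
    """Pick k distinct random paths from each class for a visual preview."""
    rng = random.Random(seed)
    return {label: rng.sample(paths, k) for label, paths in paths_by_class.items()}


# Hypothetical file listing (assumption): 100 images per class.
paths_by_class = {
    "cat": [f"train-cat-rabbit/cat/cat_{i}.jpg" for i in range(100)],
    "rabbit": [f"train-cat-rabbit/rabbit/rabbit_{i}.jpg" for i in range(100)],
}

all_paths = paths_by_class["cat"] + paths_by_class["rabbit"]
train_paths, val_paths = split_train_validation(all_paths)
preview = sample_per_class(paths_by_class)

print(len(train_paths), "training images,", len(val_paths), "validation images")
```

Deep learning frameworks usually offer their own validation-split options. An explicit split like this one mainly helps when the same partition has to be reused across several experiments.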

Question 5
Select any stock listed on the Singapore Exchange (SGX). Using Yahoo Finance, download the daily stock data (Open, High, Low, Close, Adj Close, Volume). Download the data such that 8 years of data up to the last working day of December 2021 can be used for training, and the data from the first working day of 2022 to the last working day of 2022 can be used as test data. Use the previous 52 days of stock information (High and Volume) to predict the next day's stock price (High). Design an LSTM network to do the predictions. You are required to use an LSTM with a cell state of at least 100 dimensions and do at least 50 epochs of training. Rate the performance of the LSTM model and provide the necessary plots.
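
Before an LSTM can be trained, the daily series has to be cut into overlapping windows: each input holds 52 consecutive days of (High, Volume) and the target is the High of the following day. A minimal sketch of that windowing step in plain Python follows. The price and volume series are made-up stand-ins rather than downloaded data, and in practice the values would typically be scaled first.

```python
def make_windows(highs, volumes, lookback=52):
    """Build (input window, next-day High) pairs from two aligned daily series."""
    inputs, targets = [], []
    for end in range(lookback, len(highs)):
        # Days end-lookback .. end-1 form the input; day `end` is the target.
        window = [[highs[t], volumes[t]] for t in range(end - lookback, end)]
        inputs.append(window)
        targets.append(highs[end])
    return inputs, targets


# Made-up stand-in series (assumption): 300 trading days.
highs = [10.0 + 0.01 * day for day in range(300)]
volumes = [1000.0 + day for day in range(300)]

X, y = make_windows(highs, volumes)
print(len(X), "windows of shape", (len(X[0]), len(X[0][0])))
```

Each element of `X` has shape (52, 2), which matches the (timesteps, features) layout that LSTM layers commonly expect. To obtain a prediction for the first working day of 2022, the test windows need the last 52 training days as context.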
