Project 2 – Multiclass Classification

In this project you will explore some techniques for solving the supervised
learning task of multiclass classification. It is important to realize that
understanding an algorithm or technique requires understanding how it
behaves under a variety of circumstances. You will go through the process
of choosing and exploring two multiclass classification datasets, tuning
the algorithms you have learned about, writing a thorough analysis of your
findings, and presenting your findings. The most crucial part of this
assignment is the analysis and your ability to explain and justify your
results.

I. Choosing Datasets

The first task in this assignment is choosing two interesting multiclass
classification datasets. That is, the variable you’re trying to predict must
be categorical with at least 3 possible values. The features can be of any
type, and it is recommended that you choose datasets with diverse feature
sets. I don’t care where you get the data from. You can download some,
take some from your own research, or make some up on your own. What I
do care about is that the datasets must be interesting. They should
contain a decent number of features and a sufficiently large number of
examples. Do not choose an “easy” dataset, but don’t go crazy trying to
find the perfect one either. Your two datasets should also differ in some
way so that you can compare and contrast your results between the
two. You should also follow standard machine learning practice by
splitting each dataset into training and testing sets, and only touching the
testing set at the very end when you are ready to report results (cross
validation is highly recommended).
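
The split-then-cross-validate practice described above can be sketched as follows. This is a minimal illustration, not a required approach: it assumes scikit-learn is installed, and the bundled 3-class wine dataset, the 80/20 split, and the logistic regression model are stand-ins chosen only for the example.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): the bundled 3-class wine dataset.
X, y = load_wine(return_X_y=True)

# Hold out a test set once; stratify keeps class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is re-fit on each training fold only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# All model selection uses the training portion, via 5-fold cross validation.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# Only at the very end: fit on the full training set and score the test set once.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Putting the scaler inside the pipeline is a deliberate choice here: it keeps information from the validation folds and the test set out of the preprocessing step.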

II. Coding (10%)

After choosing your datasets you will now be tasked with writing code to apply
the machine learning algorithms you have learned about. Your code must be
written in Python, but you may use any libraries that have already implemented
the machine learning algorithms (e.g., scikit-learn). You are not expected to code
the algorithms from scratch, and in fact I would highly discourage it. What you
may not do is copy code from the internet. Below are the algorithms that you are
required to “implement”:

• Naive Bayes

• Support Vector Machine (SVM)

• Logistic Regression

• Neural Network
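
As a rough sketch of what applying these four algorithms through a library might look like, the example below uses scikit-learn. The specific estimator classes, hyperparameter values, and the wine dataset are assumptions made for illustration; the assignment does not prescribe them.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): the bundled 3-class wine dataset.
X, y = load_wine(return_X_y=True)

# One candidate estimator per required algorithm; settings are illustrative only.
classifiers = {
    "Naive Bayes": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Neural Network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    ),
}

# Mean 5-fold cross-validated accuracy for each algorithm.
results = {}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    results[name] = scores.mean()
    print(f"{name:20s} accuracy = {results[name]:.3f}")
```

Gaussian Naive Bayes is left unscaled because it models each feature's variance per class, while the other three are wrapped with a scaler since they are sensitive to feature scale.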

Your code does not have to be pretty or well written. However, it must be written
in Python and I must be able to run one script (main.py) that will produce all the
results and figures in your report.

III. Report (80%)

You will then produce a report describing and analyzing your methods and
results. Here you will describe the datasets you have chosen and why they are
interesting. You will then provide an analysis on how the different machine
learning algorithms performed on each dataset. The report must be limited to 10
pages maximum. Plots and figures are highly recommended. It is up to you
how you wish to demonstrate your understanding of the machine learning
algorithms you have explored, but below I have listed some potential ideas for
analysis and items you may wish to include in the report.

• A description of your two datasets and why you feel that they are interesting.

• Hypotheses on how you believe the learning algorithms will perform on each
dataset and why.

• How you dealt with different feature types, missing data, and different
scalings in your datasets

• Training and testing error rates you obtained for your various learning
algorithms (some sort of cross validation is highly recommended)

• The effect of hyperparameters on performance

• Comparing and contrasting results between datasets

• Comparing and contrasting results between learning algorithms

• Training and testing error rates as a function of training dataset size

• Timing analysis of how long it takes to train/test each algorithm

• Conclusions

• Ideas for future analyses

• What you may have done differently

• References
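
Several of the analysis ideas above (error as a function of training set size, the effect of a hyperparameter, and timing) can be sketched with a few library calls. The example below is one possible approach under stated assumptions: scikit-learn and NumPy are available, the wine dataset is a stand-in, and the SVM regularization parameter C is an arbitrary choice of hyperparameter to vary.

```python
import time

import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import learning_curve, validation_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data and model (assumptions for illustration).
X, y = load_wine(return_X_y=True)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Error rate versus training set size, averaged over 5 folds.
sizes, train_scores, val_scores = learning_curve(
    model, X, y, train_sizes=np.linspace(0.3, 1.0, 4),
    cv=5, shuffle=True, random_state=0,
)
size_train_error = 1.0 - train_scores.mean(axis=1)
size_val_error = 1.0 - val_scores.mean(axis=1)
for n, tr, va in zip(sizes, size_train_error, size_val_error):
    print(f"n={n:4d}  train error={tr:.3f}  validation error={va:.3f}")

# Error rate versus one hyperparameter (here the SVM's C, on a log scale).
c_values = [0.01, 0.1, 1.0, 10.0, 100.0]
c_train_scores, c_val_scores = validation_curve(
    model, X, y, param_name="svc__C", param_range=c_values, cv=5
)
c_val_error = 1.0 - c_val_scores.mean(axis=1)
for c, err in zip(c_values, c_val_error):
    print(f"C={c:7.2f}  validation error={err:.3f}")

# Wall-clock training and prediction time for a single fit.
start = time.perf_counter()
model.fit(X, y)
fit_seconds = time.perf_counter() - start
start = time.perf_counter()
model.predict(X)
predict_seconds = time.perf_counter() - start
print(f"fit: {fit_seconds:.4f}s  predict: {predict_seconds:.4f}s")
```

The resulting arrays can be passed to any plotting library to produce the figures; a single timing measurement is noisy, so repeating it and reporting an average would be more convincing in a report.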

You are NOT being graded on how well the algorithms perform on your datasets.
What is most important is WHY. You should be explaining and justifying all of
your figures and results, and demonstrating that you understand the intricate
details of the machine learning process and the machine learning algorithms you
are using.

IV. Presentation (10%)

Finally, you will give a presentation of your results lasting at most 7 minutes
(you will be cut off exactly at the 7 minute mark). In this presentation you will
describe your datasets, your methods, and any interesting results you found!

What to turn in?

Below is a list of items you will be required to turn in via Canvas. Please make
sure all documents are named as described below.

• report.pdf – Your maximum 10 page report in PDF format. Do not use a very
small or very large font. No specific formatting is required but use common sense.

• presentation.pptx or presentation.key – Your presentation slides in either a
PowerPoint or Keynote document.

• code.zip – A zip file with all of the code you have written. Within the folder
there should be a file called README.txt that contains instructions on how to run
your code, and a Python file called main.py that will produce all figures and
plots in your report/presentation. I should be able to reproduce your results
easily.

• data.zip – A zip file that contains the two datasets you have chosen.

Grading

You are being scored on your analysis more than anything else. Roughly
speaking, implementing everything and getting it to run is worth very little for
this assignment. Of course though, analysis without proof of working code
makes the analysis suspect. The key thing is that your explanations should be
both thorough and concise, and your analysis should prove to me you have a
deep understanding of the machine learning process and the machine learning
algorithms you are using.
