Project 2 – Multiclass Classification

In this project you will explore some techniques for solving the supervised
learning task of multiclass classification. It is important to realize that
understanding an algorithm or technique requires understanding how it
behaves under a variety of circumstances. You will go through the process
of choosing and exploring two multiclass classification datasets, tuning
the algorithms you have learned about, writing a thorough analysis of your
findings, and presenting your findings. The most crucial part of this
assignment is the analysis and your ability to explain and justify your
results.

I. Choosing Datasets

The first task in this assignment is choosing two interesting multiclass
classification datasets. That is, the variable you’re trying to predict must
be categorical with at least 3 possible values. The features can be of any
type, and it is recommended that you choose datasets with diverse feature
sets. I don’t care where you get the data from. You can download some,
take some from your own research, or make some up on your own. What I
do care about is that the datasets must be interesting. They should
contain a decent number of features and a sufficiently large number of
examples. Do not choose an “easy” dataset, but don’t go crazy trying to
find the perfect one either. Your two datasets should also differ in some
way so that you can compare and contrast your results between the
two. You should also follow standard machine learning practice by
splitting each dataset into training and testing sets, and only touching the
testing set at the very end when you are ready to report results. (Cross
validation is highly recommended.)
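The split-then-cross-validate practice described above can be sketched roughly as follows. This is a minimal illustration, not a required structure: it assumes scikit-learn is installed, and it uses the built-in wine dataset purely as a stand-in (it is far too easy to qualify as one of your two datasets). The 80/20 split, the 5 folds, and the choice of logistic regression are all arbitrary assumptions.

```python
# Sketch: hold out a test set once, then do all tuning with cross validation
# on the training portion only.
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (assumption): 178 examples, 13 features, 3 classes.
X, y = load_wine(return_X_y=True)

# Stratify so each class keeps roughly the same proportion in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline, so it is refit on each training fold
# and never sees the corresponding validation fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Model selection uses the training portion only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))

# Touch the test set once, at the very end, to report the final number.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print("Test accuracy: %.3f" % test_accuracy)
```

Keeping the scaler inside the pipeline is one way to avoid leaking information from validation folds into preprocessing; other arrangements are possible as long as the test set stays untouched until the end.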

II. Coding (10%)

After choosing your datasets you will now be tasked with writing code to apply
the machine learning algorithms you have learned about. Your code must be
written in Python, but you may use any libraries that have already implemented
the machine learning algorithms (e.g., scikit-learn). You are not expected to code
the algorithms from scratch, and in fact I would highly discourage it. What you
may not do is copy code from the internet. Below are the algorithms that you are
required to “implement”:

• Naive Bayes

• Support Vector Machine (SVM)

• Logistic Regression

• Neural Network
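As a rough illustration of what “implementing” the four algorithms with a library can look like, the sketch below builds one scikit-learn estimator per algorithm and scores each with 5-fold cross validation. The specific estimator classes (GaussianNB, SVC, LogisticRegression, MLPClassifier), their settings, and the stand-in wine dataset are assumptions chosen for brevity, not required choices.

```python
# Sketch: one library estimator per required algorithm, compared by CV accuracy.
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset (assumption). In a real project, pass only the training
# split here and keep the test split for the final report.
X, y = load_wine(return_X_y=True)

# Scale-sensitive models get a StandardScaler; Gaussian Naive Bayes does not need one.
models = {
    "Naive Bayes": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Neural Network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    ),
}

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    results[name] = scores.mean()
    print("%-20s mean CV accuracy = %.3f" % (name, results[name]))
```

A dictionary of named models keeps the comparison loop identical for every algorithm, which makes it easier to add timing or extra metrics later.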

Your code does not have to be pretty or well written. However, it must be written
in Python, and I must be able to run one script (main.py) that will produce all the
results and figures in your report.

III. Report (80%)

You will then produce a report describing and analyzing your methods and
results. Here you will describe the datasets you have chosen and why they are
interesting. You will then provide an analysis on how the different machine
learning algorithms performed on each dataset. The report must be limited to 10
pages maximum. Plots and figures are highly recommended. It is up to you
how you wish to demonstrate your understanding of the machine learning
algorithms you have explored, but below I have listed some potential ideas for
analysis and items you may wish to include in the report.

• A description of your two datasets and why you feel that they are interesting.

• Hypotheses on how you believe the learning algorithms will perform on each
dataset and why.

• How you dealt with different features in your datasets (e.g., missing data,
different scalings)

• Training and testing error rates you obtained for your various learning
algorithms (some sort of cross validation is highly recommended)

• The effect of hyperparameters on performance

• Comparing and contrasting results between datasets

• Comparing and contrasting results between learning algorithms

• Training and testing error rates as a function of training dataset size

• Timing analysis of how long it takes to train/test each algorithm

• Conclusions

• Ideas for future analyses

• What you may have done differently

• References
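Several of the ideas above (the effect of a hyperparameter, error rates as a function of training set size, and timing) can be measured with a few library calls. The sketch below is one possible approach, assuming scikit-learn and NumPy: it uses an RBF SVM, the built-in wine dataset as a stand-in, an arbitrary grid of C values, and 5-fold cross validation on the training split. None of these choices come from the assignment itself.

```python
# Sketch: validation curve over one hyperparameter, plus a learning curve
# with fit times, both computed on the training split only.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import learning_curve, train_test_split, validation_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset (assumption); the held-out test split is not used here.
X, y = load_wine(return_X_y=True)
X_train, _, y_train, _ = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Effect of a hyperparameter: error rate for each value of the SVM penalty C.
# "svc__C" addresses the C parameter of the SVC step inside the pipeline.
C_values = [0.01, 0.1, 1.0, 10.0, 100.0]
train_acc, val_acc = validation_curve(
    svm, X_train, y_train, param_name="svc__C", param_range=C_values, cv=5
)
for C, tr_err, va_err in zip(
    C_values, 1 - train_acc.mean(axis=1), 1 - val_acc.mean(axis=1)
):
    print("C=%-6g train error=%.3f validation error=%.3f" % (C, tr_err, va_err))

# Error rate and fit time as a function of training set size.
sizes, lc_train, lc_val, fit_times, _ = learning_curve(
    svm,
    X_train,
    y_train,
    train_sizes=np.linspace(0.3, 1.0, 4),
    cv=5,
    shuffle=True,
    random_state=0,
    return_times=True,
)
for n, tr_err, va_err, t in zip(
    sizes, 1 - lc_train.mean(axis=1), 1 - lc_val.mean(axis=1), fit_times.mean(axis=1)
):
    print("n=%-4d train error=%.3f validation error=%.3f fit time=%.4fs" % (n, tr_err, va_err, t))
```

The arrays returned here have one row per hyperparameter value (or training size) and one column per fold, so they can be averaged for a table or passed to a plotting library for the figures in the report.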

You are NOT being graded on how well the algorithms perform on your datasets.
What is most important is WHY. You should be explaining and justifying all of
your figures and results, and demonstrating that you understand the intricate
details of the machine learning process and the machine learning algorithms you
are using.

IV. Presentation (10%)

Finally, you will give a presentation of your results lasting at most 7 minutes
(you will be cut off exactly at the 7 minute mark). In this presentation you will
describe your datasets, your methods, and any interesting results you found!

What to turn in?

Below is a list of items you will be required to turn in via Canvas. Please make
sure all documents are named as described below.

• report.pdf – Your maximum 10 page report in pdf format. Do not use super
tiny or large font. No specific formatting is required but use common sense.

• presentation.pptx or presentation.key – Your presentation slides either in a
PowerPoint or Keynote document.

• code.zip – A zip file with all of the code you have written. Within the folder
there should be a file called README.txt that contains instructions on how to run
your code, and a Python file called main.py that will produce all figures and
plots in your report/presentation. I should be able to reproduce your results
easily.

• data.zip – A zip file that contains the two datasets you have chosen.

Grading

You are being scored on your analysis more than anything else. Roughly
speaking, implementing everything and getting it to run is worth very little for
this assignment. Of course though, analysis without proof of working code
makes the analysis suspect. The key thing is that your explanations should be
both thorough and concise, and your analysis should prove to me you have a
deep understanding of the machine learning process and the machine learning
algorithms you are using.
