Statistical Machine Learning
CS 4440
Anushka Gupta, Jian Hua, Ethan Jen
Outline
- Background:
  - What is Machine Learning?
  - Application in Industry
  - Timeline of Development
  - Types of Learning
  - Some General Machine Learning Principles
  - Challenges to Machine Learning

- Products:
  - Commercial: Microsoft Azure
Background Information
What is Machine Learning
Machine learning is the subfield of computer science that gives computers the
ability to learn without being explicitly programmed (Arthur Samuel, 1959)
What is Machine Learning
CMU computer science professor Tom M. Mitchell provided a widely quoted,
more formal definition:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Application in Industry
Manufacturing

Predictive maintenance or condition monitoring

Warranty reserve estimation

Propensity to buy

Demand forecasting

Process optimization

Telematics
Application in Industry
Retail

Predictive inventory planning

Recommendation engines

Upsell and cross-channel marketing

Market segmentation and targeting


Application in Industry
Healthcare and Life Science

Alerts and diagnostics from real-time patient data

Disease identification and risk stratification

Patient triage optimization

Proactive health management

Healthcare provider sentiment analysis


Application in Industry
Financial services

Risk analytics and regulation

Customer segmentation

Sales and marketing campaign management

Creditworthiness evaluation
Application in Industry
Travel and hospitality

Aircraft scheduling

Dynamic pricing

Social media: consumer feedback and interaction analysis

Customer complaint resolution


Products that Use Machine Learning
Google search

Amazon Recommendations

Siri

Self-driving cars


Companies involved
Google

Amazon

Microsoft

Uber

Apple

Tesla

Yahoo

Other major tech companies


Timeline of Development
1950: The Turing Test

1952: The first computer learning program

1957: The first neural network for computers

1967: The nearest neighbor algorithm


Timeline of Development
1981: Explanation-based learning

1997: Deep Blue

2010: Microsoft Kinect


Timeline of Development
2011: IBM Watson beats human champions at Jeopardy!

2014: Facebook DeepFace

2015: Amazon launches its own machine learning platform

2016: Google AlphaGo


Types of Learning
1) Supervised Learning

2) Unsupervised Learning

3) Reinforcement Learning
Supervised Learning
Goal:
Learn to predict the output from the input data

Data:
predictors and result (x and y)

Types of problems:
Classification, Regression

Types of algorithms:
Naive Bayes Classifier

Decision Trees
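
As a rough illustration of supervised learning, the sketch below trains a small Naive Bayes classifier on labelled (x, y) pairs and predicts the label of a new input. The toy weather data, the function names, and the use of Laplace smoothing are assumptions made for this example; they do not come from the slides.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> number of occurrences
    value_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for x, y in zip(rows, labels):
        for i, v in enumerate(x):
            value_counts[y][i][v] += 1
            feature_values[i].add(v)
    return class_counts, value_counts, feature_values


def predict(model, x):
    """Return the class with the highest log-posterior score for input x."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, v in enumerate(x):
            # Laplace (add-one) smoothing avoids log(0) for unseen values
            count = value_counts[label][i][v]
            score += math.log((count + 1) / (n + len(feature_values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (outlook, windy) -> play outside?
X = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
     ("rainy", "no"), ("sunny", "no"), ("rainy", "yes")]
y = ["yes", "yes", "no", "no", "yes", "no"]

model = train_naive_bayes(X, y)
prediction = predict(model, ("sunny", "yes"))
print(prediction)
```

Because the output here is a category, this is a classification problem; predicting a number instead would make it regression.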
Unsupervised Learning
Goal:
Discover an underlying structure/description of the data

Data:
only have input data (x)

Types of problems:
Clustering, Association

Types of algorithms:
K-means

Apriori algorithm
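
A minimal sketch of clustering with K-means, assuming 2-D points and a deterministic initialisation (the first k points become the starting centroids). Real implementations usually pick starting centroids more carefully; the data and function names are made up for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    # Assumption: use the first k points as initial centroids
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return centroids, clusters


# Hypothetical unlabelled data: only inputs (x), no outputs (y)
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0),
          (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

No labels are supplied; the two groups emerge only from the distances between the points.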
Reinforcement Learning
Goal:
Learn to make decisions: pick the best action in the current state

Data:
Actions, states, rewards

Types of problems:
Game AIs

Types of algorithms:
Value iteration and Q-learning, built on Markov Decision Processes (MDPs)
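
To make the states, actions, and rewards concrete, here is a small sketch of value iteration on a hypothetical four-state corridor MDP. The environment, the discount factor of 0.9, and all names are assumptions chosen for illustration only.

```python
GAMMA = 0.9                      # discount factor (assumed)
N_STATES = 4                     # states 0..3; state 3 is the terminal goal
ACTIONS = {"left": -1, "right": +1}


def step(state, action):
    """Deterministic transition: returns (next_state, reward)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def value_iteration(sweeps=50):
    """Repeatedly back up each state's value from its best action."""
    values = [0.0] * N_STATES    # the terminal state's value stays 0
    for _ in range(sweeps):
        for s in range(N_STATES - 1):
            outcomes = [step(s, a) for a in ACTIONS]
            values[s] = max(r + GAMMA * values[s2] for s2, r in outcomes)
    return values


def greedy_policy(values):
    """In each state, pick the action with the best one-step lookahead."""
    def lookahead(s, a):
        s2, r = step(s, a)
        return r + GAMMA * values[s2]
    return {s: max(ACTIONS, key=lambda a: lookahead(s, a))
            for s in range(N_STATES - 1)}


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

The resulting policy moves right in every non-terminal state, since that is the only way to reach the reward.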
General Machine Learning Principles
- Goal: build a model from a dataset that can predict new, unseen instances

- Training and Testing Sets
  - Rule of thumb: train on 70% of the data, test on 30% of the data

- Cross-validation
  - Used for model selection
  - Divide the training set into k folds; in each of k iterations, train on k-1 folds and evaluate on the remaining fold
  - Advantages: lets you both train and validate on your training data; the performance estimate is averaged over the k folds
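
The 70/30 split and k-fold cross-validation can be sketched as follows. This assumes the data has already been shuffled; the helper names and the trivial "predict the mean" model are hypothetical and only serve the illustration.

```python
def train_test_split(data, train_fraction=0.7):
    """Split already-shuffled data into a training and a testing part."""
    cut = round(len(data) * train_fraction)
    return data[:cut], data[cut:]


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) once per fold."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0)
             for i in range(k)]
    start = 0
    for size in sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def cross_validate(data, k, fit, score):
    """Train on k-1 folds, score on the held-out fold, average the k scores."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(data), k):
        model = fit([data[i] for i in train_idx])
        scores.append(score(model, [data[i] for i in val_idx]))
    return sum(scores) / k


def fit_mean(rows):
    """Toy 'model': always predict the mean of the training values."""
    return sum(rows) / len(rows)


def mean_abs_error(model, rows):
    return sum(abs(r - model) for r in rows) / len(rows)


data = [float(v) for v in range(10)]
train, test = train_test_split(data)
folds = list(k_fold_indices(len(train), k=3))
cv_error = cross_validate(train, 3, fit_mean, mean_abs_error)
print(len(train), len(test), cv_error)
```

Every training example lands in the validation fold exactly once, which is why the averaged score uses all of the training data without touching the test set.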
General Machine Learning Principles

| Eager Learners | Lazy Learners |
|---|---|
| Ex: Neural Network | Ex: k Nearest Neighbors |
| Learns as data comes in | Learns only when queried |
| Saves functions | Saves individual data points |
| Space efficient | Not space efficient |
| Slow learning, faster prediction | Fast learning, slow prediction |
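
The contrast between the two kinds of learner can be sketched in code. The lazy learner below is k nearest neighbors; for the eager learner this sketch swaps in a nearest-centroid classifier rather than a neural network, purely to keep the example short. Class names and data are assumptions.

```python
from collections import Counter


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class LazyKNN:
    """Lazy: fit() only stores the data; all work happens in predict()."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.data = list(zip(X, y))       # keeps every individual data point
        return self

    def predict(self, x):
        nearest = sorted(self.data, key=lambda pair: sq_dist(pair[0], x))
        votes = Counter(label for _, label in nearest[: self.k])
        return votes.most_common(1)[0][0]


class EagerCentroid:
    """Eager: fit() condenses the data into one centroid per class."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.centroids = {label: tuple(s / counts[label] for s in acc)
                          for label, acc in sums.items()}
        return self

    def predict(self, x):
        return min(self.centroids,
                   key=lambda label: sq_dist(self.centroids[label], x))


# Hypothetical labelled data
X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
y = ["a", "a", "a", "b", "b", "b"]

knn = LazyKNN(k=3).fit(X, y)
centroid = EagerCentroid().fit(X, y)
print(knn.predict((2, 2)), centroid.predict((2, 2)))
print(len(knn.data), len(centroid.centroids))
```

The lazy model stores all six points and scans them on every query, while the eager model keeps only two centroids after training.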
Challenges to Machine Learning
- Overfitting:
  - Trusting the data too much and developing a model that fits your training set but cannot predict new values in the test set
  - Usually identified by high training accuracy but low testing accuracy
  - Algorithm-specific techniques (for example regularization, pruning, or early stopping) help prevent overfitting

- Curse of Dimensionality:
  - Searching in spaces with higher dimensions is much harder
  - Predictive power decreases in higher-dimensional spaces
  - More data is needed to train a model for higher dimensions
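
A small arithmetic sketch of why higher dimensions need more data, assuming inputs spread uniformly over a unit hypercube. The function names are made up for this example.

```python
def cells_in_grid(bins_per_axis, dimensions):
    """Number of grid cells when every axis is cut into the same number of bins."""
    return bins_per_axis ** dimensions


def edge_to_capture(fraction, dimensions):
    """Edge length of a sub-cube that contains the given fraction
    of uniformly distributed points in a unit hypercube."""
    return fraction ** (1.0 / dimensions)


for d in (1, 2, 3, 10):
    print(d, cells_in_grid(10, d), round(edge_to_capture(0.10, d), 3))
```

With 10 bins per axis, the number of cells to cover grows from 10 in one dimension to 10 billion in ten. In ten dimensions, a neighbourhood holding just 10% of the data must span roughly 79% of each axis, so "nearby" points are no longer very near.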


Example of Overfitting
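
One way to see overfitting in code is to compare a model that memorises the training set with a simpler one. The sketch below uses made-up 1-D data with one deliberately mislabelled training point; the 1-nearest-neighbor memoriser and the single-threshold "stump" are illustrative choices, not the example from the original slide.

```python
def nearest_neighbor_predict(train, x):
    """1-NN: copy the label of the closest training point (pure memorisation)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def fit_stump(train):
    """Pick the threshold t with the fewest training errors
    for the rule: predict 1 if x >= t, else 0."""
    def errors(t):
        return sum((1 if x >= t else 0) != y for x, y in train)
    return min((x for x, _ in train), key=errors)


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


# Hypothetical data: the true label is 1 when x >= 5.
# The training point (2.0, 1) is label noise.
train = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 0), (4.0, 0),
         (5.0, 1), (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
test = [(0.4, 0), (1.8, 0), (2.2, 0), (3.6, 0),
        (4.4, 0), (5.6, 1), (7.3, 1), (8.8, 1)]

threshold = fit_stump(train)


def memoriser(x):
    return nearest_neighbor_predict(train, x)


def stump(x):
    return 1 if x >= threshold else 0


knn_train, knn_test = accuracy(memoriser, train), accuracy(memoriser, test)
stump_train, stump_test = accuracy(stump, train), accuracy(stump, test)
print("1-NN  train/test:", knn_train, knn_test)
print("stump train/test:", stump_train, stump_test)
```

The memoriser scores perfectly on the training set but worse on the test set because it reproduces the noisy label, which is the high-training, low-testing accuracy pattern that signals overfitting. The simpler stump misses the noisy training point yet generalises better.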
Products
Microsoft Azure Machine Learning
- Microsoft's cloud computing platform with a built-in machine learning library, part of the Cortana Intelligence Suite

- Backed by the Microsoft Azure cloud databases

- Drag-and-drop application of machine learning algorithms

- Use the WorkBench UI for experiments, or use the API to build other applications

- Collaboration through a shared WorkBench or Jupyter Notebooks

- Integration of Python and R scripts for scalability and customization of the algorithms
WEKA
- Waikato Environment for Knowledge Analysis (WEKA), developed at the University of Waikato in New Zealand

- Written in Java, so it runs easily on most devices

- Similar purpose to Azure on a smaller scale:
  - Facilitates ML in an easy-to-use app
  - Similar workflow: import, preprocess, and then build the model

- Free, open source software with UI or command-line usage

- Preprocessing filters to resample data or create discrete values


TensorFlow
- Open source neural network library initially created by the Google Brain Team

- Neural networks can generalize to model most classification problems

- Can be used from C, C++ and Python applications

- TensorFlow interface is in Python

- APIs range from high-level APIs for beginners to lower-level APIs for fine-tuned ML research

- Computation is a graph whose nodes are operations, connected by tensors (n-dimensional arrays)
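
The graph idea can be illustrated without any library. The sketch below is plain Python and is not the TensorFlow API: the `Node` class, its operation names, and the nested-list "tensors" are all hypothetical, meant only to show operations as nodes whose inputs and outputs are arrays.

```python
class Node:
    """One operation in a dataflow graph. Its inputs are other nodes;
    the values flowing between nodes are 2-D arrays (nested lists)."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def run(self):
        """Evaluate this node by first evaluating the nodes it depends on."""
        if self.op == "const":
            return self.value
        args = [node.run() for node in self.inputs]
        if self.op == "add":
            a, b = args
            return [[x + y for x, y in zip(row_a, row_b)]
                    for row_a, row_b in zip(a, b)]
        if self.op == "matmul":
            a, b = args
            return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                    for row in a]
        raise ValueError(f"unknown operation: {self.op}")


# Build the graph first; nothing is computed until run() is called.
a = Node("const", value=[[1, 2], [3, 4]])
b = Node("const", value=[[0, 1], [1, 0]])
product = Node("matmul", inputs=(a, b))
total = Node("add", inputs=(product, a))

result = total.run()
print(result)
```

Separating graph construction from execution is the design point here: the same graph description can be inspected, optimised, or run on different hardware before any numbers are computed.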
Neural Network

Source: https://i.stack.imgur.com/1bCQl.png
| Properties | Microsoft Azure | TensorFlow | WEKA |
|---|---|---|---|
| Types of Algorithms | Classification, Regression, Clustering, Computer Vision, Text Analytics | Neural Network | Data preprocessing, Clustering, Classification, Regression, Feature selection |
| Data Sources Supported | Azure Cloud Storage, Hadoop, Manual Entry | Desktop upload | SQL Databases, Desktop upload |
Research
Where Machine Learning is Headed
Machine learning is used in a great number of applications, from self-driving cars to communicating with humans

Chat bots

Security

Recommendations

How do we efficiently learn in settings where exploration is required?

How can we do effective offline evaluation of algorithms?

How can we be both efficient in sample complexity and computational complexity?

How can we learn from lots of data?

How can we learn to index efficiently?


Machine Learning Research
Pedestrian Detection for Autonomous Vehicles
k-Nearest Neighbours

Naive Bayes classifier

Support Vector Machine

The cloud of points generated by the sensor is processed to detect pedestrians by selecting cubic shapes and applying machine vision and machine learning algorithms to the XY, XZ, and YZ projections of the points contained in the cube

Human Perception of Images
Subliminal priming of perception of images
Machine Learning Research (Cont.)
Neuroimaging for Drug Discovery and Development
Machine learning enables predictions at the individual level based on the distributed effects
across the whole brain

Disease detection
Machine Learning Research (Cont.)
Financial
Algorithmic trading

High Frequency Trading

Quantopian

Two Sigma

Citadel

Use technical indicators

Loan/Insurance underwriting

Machine Perception
WEKA Demo
Sources
Agost, 5 examples of predictive analytics in the travel industry, 2016, available at http://www.amadeus.com/blog/07/04/5-examples-predictive-analytics-travel-industry/

Buggey, T. (2007, Summer). Storyboard for Ivan's morning routine. Diagram. Journal of Positive Behavior Interventions, 9(3), 151. Retrieved December 14, 2007, from Academic Search Premier database.

Costa, L., Gago, M. F., Yelshyna, D., Ferreira, J., David Silva, H., Rocha, L., & ... Bicho, E. (2016). Application of Machine Learning in Postural Control Kinematics for the Diagnosis of Alzheimer's Disease. Computational Intelligence & Neuroscience, 1-15. doi:10.1155/2016/3891253

Columbus, 10 Ways Machine Learning Is Revolutionizing Manufacturing, 2016, available at http://www.forbes.com/sites/louiscolumbus/2016/06/26/10-ways-machine-learning-is-revolutionizing-manufacturing/#33f053862d7f

Columbus, Machine Learning Is Redefining the Enterprise in 2016, 2016, available at http://www.business2community.com/business-innovation/machine-learning-redefining-enterprise-2016-01569528#ZJmwLJcjKHZbLaXg.97

Doyle, O., Mehta, M., & Brammer, M. (2015). The role of machine learning in neuroimaging for drug discovery and development. Psychopharmacology, 232(21/22), 4179-4189. doi:10.1007/s00213-015-3968-0

Google, TensorFlow: Basic Usage, 2016, available at https://www.tensorflow.org/versions/r0.10/get_started/basic_usage


Google, TensorFlow: Tensor Rank, Shapes and Types, 2016, available at https://www.tensorflow.org/versions/r0.10/resources/dims_types

Google, TensorFlow: Reading Data, 2016, available at https://www.tensorflow.org/programmers_guide/reading_data

IBM, Deep Blue, 1997, available at http://www-03.ibm.com/ibm/history/ibm100/us/en/icons/deepblue/

Machine Learning and Algorithms; Agile Development. (2012). Communications of the ACM, 55(8), 10-11. doi:10.1145/2240236.2240239

Marr, How Machine Learning, Big Data And AI Are Changing Healthcare, 2016, available at http://www.forbes.com/sites/bernardmarr/2016/09/23/how-machine-learning-big-data-and-ai-are-changing-healthcare-forever/#16a3c8654f49

Marr, Short History of Machine Learning -- Every Manager Should Read, 2016, available at http://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#3f2162be323f

Microsoft, Azure Machine Learning, 2017, available at https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-faq

Microsoft, Overview of Azure Machine Learning, 2016, available at https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-studio-overview-diagram

Microsoft, Introduction to Azure Machine Learning in the Cloud, 2017, available at https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-what-is-machine-learning

Mohan, D. M., Kumar, P., Mahmood, F., Wong, K. F., Agrawal, A., Elgendi, M., & ... Chan, A. D. (2016). Effect of Subliminal Lexical Priming on the Subjective Perception of Images: A Machine Learning Approach. Plos ONE, 11(2), 1-22. doi:10.1371/journal.pone.0148332

Navarro, P. J., Fernández, C., Borraz, R., & Alonso, D. (2017). A Machine Learning Approach to Pedestrian Detection for Autonomous Vehicles Using High-Definition 3D Range Data. Sensors (14248220), 17(1), 1-20. doi:10.3390/s17010018

Shish, Big Data & Machine Learning Scenarios for Retail, 2015, available at https://blogs.msdn.microsoft.com/shishirs/2015/01/26/big-data-machine-learning-scenarios-for-retail/

WEKA, The_WEKA_Workbench.pdf, 2016, available at http://www.cs.waikato.ac.nz/ml/weka/Witten_et_al_2016_appendix.pdf

Zhang, X., Mahoor, M., & Mavadati, S. (2015). Facial expression recognition using $l_p$-norm MKL multiclass-SVM. Machine Vision & Applications, 26(4), 467-483. doi:10.1007/s00138-015-0677-y
