
Installing TensorFlow
---------------------

Open the Anaconda prompt, then:

a) conda create -n tensorflow python=3.5

b) activate tensorflow

c) (tensorflow)C:> pip install --ignore-installed --upgrade tensorflow
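
To confirm the installation, one quick check (a sketch; it assumes the tensorflow
environment is still active) is to import TensorFlow and print its version:

import tensorflow as tf
print(tf.__version__)  # should print the installed version without errors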

-----------------------------------------------------------------------------------

Deep learning is a subfield of machine learning, which itself is a subfield of AI,
which itself is a subfield of computer science.

A discipline can be a subfield of more than one more general topic: for example,
machine learning can also be thought to be a subfield of statistics.

------------------------------------------------------------------

Colab (Colaboratory) runs on the Google Cloud platform, so we don't need to
download anything; we just need a web browser to run it.

It is very similar to a Jupyter notebook.

In addition to Python, you can create TensorFlow models in JavaScript using
TensorFlow.js. TensorFlow also has other language bindings, with various degrees of
support, including Swift, R, and Julia. However, Python and JavaScript are
currently the most complete language implementations.

Note: (Udacity) All usage of Colab in this course is completely free of charge.
Even GPU usage is provided free of charge for some hours of usage every day.

Note: Colab connects your notebook to a cloud-based runtime, which means you can
run it in any browser, with no installation needed on your machine.

-----------------------------------------------------------------------------------

Training our first model, which will be able to convert degrees Celsius to
degrees Fahrenheit:
-----------------------------------------------------------------------------------

Some Machine Learning terminology


Feature: The input(s) to our model. In this case, a single value: the degrees in
Celsius.

Labels: The output our model predicts. In this case, a single value: the degrees
in Fahrenheit.

Example: A pair of inputs/outputs used during training. In our case, a pair of
values from celsius_q and fahrenheit_a at a specific index, such as (22, 72).
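
For reference, celsius_q and fahrenheit_a are NumPy arrays that hold the training
pairs, one Celsius value and its Fahrenheit equivalent per index. A minimal sketch
of how they might be defined (treat the exact values as an assumption):

import numpy as np

celsius_q    = np.array([-40, -10,  0,  8, 15, 22,  38], dtype=float)
fahrenheit_a = np.array([-40,  14, 32, 46, 59, 72, 100], dtype=float)
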
Create the model:
----------------
Next, create the model. We will use the simplest possible model we can, a Dense
network. Since the problem is straightforward, this network will require only a
single layer, with a single neuron.

Build a layer
We'll call the layer l0 and create it by instantiating tf.keras.layers.Dense with
the following configuration:

-> input_shape=[1]: This specifies that the input to this layer is a single value.
That is, the shape is a one-dimensional array with one member. Since this is the
first (and only) layer, that input shape is the input shape of the entire model.
The single value is a floating point number, representing degrees Celsius.

-> units=1: This specifies the number of neurons in the layer. The number of
neurons defines how many internal variables the layer has to try to learn how to
solve the problem (more later). Since this is the final layer, it is also the size
of the model's output: a single float value representing degrees Fahrenheit. (In a
multi-layered network, the size and shape of the layer would need to match the
input_shape of the next layer.)

l0 = tf.keras.layers.Dense(units=1, input_shape=[1])

Assemble layers into the model:
------------------------------

Once layers are defined, they need to be assembled into a model. The Sequential
model definition takes a list of layers as an argument, specifying the calculation
order from the input to the output.

This model has just a single layer, l0.

model = tf.keras.Sequential([l0])

Note
----

You will often see the layers defined inside the model definition, rather than
beforehand:

model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1])
])

Compile the model, with loss and optimizer functions:
----------------------------------------------------

Before training, the model has to be compiled. When compiled for training, the
model is given:

-> Loss function: A way of measuring how far off predictions are from the desired
outcome. (The measured difference is called the "loss".)

-> Optimizer function: A way of adjusting internal values in order to reduce the
loss.

model.compile(loss='mean_squared_error',
              optimizer=tf.keras.optimizers.Adam(0.1))

These are used during training (model.fit(), below) to first calculate the loss at
each point, and then improve it. In fact, the act of calculating the current loss
of a model and then improving it is precisely what training is.

During training, the optimizer function is used to calculate adjustments to the
model's internal variables. The goal is to adjust the internal variables until the
model (which is really a math function) mirrors the actual equation for converting
Celsius to Fahrenheit.

TensorFlow uses numerical analysis to perform this tuning, and all this complexity
is hidden from you, so we will not go into the details here. What is useful to know
about these parameters is:

The loss function (mean squared error) and the optimizer (Adam) used here are
standard for simple models like this one, but many others are available. It is not
important to know how these specific functions work at this point.

One part of the optimizer you may need to think about when building your own models
is the learning rate (0.1 in the code above). This is the step size taken when
adjusting values in the model. If the value is too small, it will take too many
iterations to train the model. Too large, and accuracy goes down. Finding a good
value often involves some trial and error, but it usually lies between 0.001 (the
default) and 0.1.
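
As a rough illustration (not part of the course code), you could compare a few
learning rates by rebuilding and refitting the same one-neuron model; the helper
function and the candidate rates below are assumptions for demonstration:

def final_loss(learning_rate):
    # Build and train a fresh single-neuron model with the given learning rate.
    m = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1])])
    m.compile(loss='mean_squared_error',
              optimizer=tf.keras.optimizers.Adam(learning_rate))
    h = m.fit(celsius_q, fahrenheit_a, epochs=500, verbose=False)
    return h.history['loss'][-1]

for lr in [0.001, 0.01, 0.1]:
    print(lr, final_loss(lr))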

Train the model:
---------------

Train the model by calling the fit method.

During training, the model takes in Celsius values, performs a calculation using
the current internal variables (called "weights") and outputs values which are
meant to be the Fahrenheit equivalent. Since the weights are initially set
randomly, the output will not be close to the correct value. The difference between
the actual output and the desired output is calculated using the loss function, and
the optimizer function directs how the weights should be adjusted.

This cycle of calculate, compare, adjust is controlled by the fit method. The first
argument is the inputs, the second argument is the desired outputs. The epochs
argument specifies how many times this cycle should be run, and the verbose
argument controls how much output the method produces.

history = model.fit(celsius_q, fahrenheit_a, epochs=500, verbose=False)

Later we will go into more details on what actually happens here and how a Dense
layer actually works internally.

Display training statistics:
---------------------------

The fit method returns a history object. We can use this object to plot how the
loss of our model goes down after each training epoch. A high loss means that the
Fahrenheit degrees the model predicts are far from the corresponding values in
fahrenheit_a.

We'll use Matplotlib to visualize this (you could use another tool). As you can
see, our model improves very quickly at first, and then shows steady, slow
improvement until it is very near "perfect" towards the end.

import matplotlib.pyplot as plt

plt.xlabel('Epoch Number')
plt.ylabel('Loss Magnitude')
plt.plot(history.history['loss'])
plt.show()

Use the model to predict values:
-------------------------------

Now you have a model that has been trained to learn the relationship between
celsius_q and fahrenheit_a. You can use the predict method to have it calculate the
Fahrenheit degrees for a previously unseen Celsius value.

So, for example, if the Celsius value is 100, what do you think the Fahrenheit
result will be?

print(model.predict([100.0]))

[[211.33778]]

The correct answer is 100 * 1.8 + 32 = 212, so our model is doing really well.
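
As a quick sanity check, you can compute the exact conversion yourself:

print(100.0 * 1.8 + 32)  # 212.0, versus the model's ~211.34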

To review:
----------

We created a model with a Dense layer.

We trained it with 3500 examples (7 pairs, over 500 epochs).

Our model tuned the variables (weights) in the Dense layer until it was able to
return the correct Fahrenheit value for any Celsius value. (Remember, 100 Celsius
was not part of our training data.)

Looking at the layer weights:
----------------------------

Finally, let's print the internal variables of the Dense layer. Since the model has
effectively learned f = 1.8*c + 32, the two variables should come out close to 1.8
(the weight) and 32 (the bias).

print("These are the layer variables: {}".format(l0.get_weights()))

Recap:
-----

Congratulations! You just trained your first machine learning model. We saw that by
training the model with input data and the corresponding output, the model learned
to multiply the input by 1.8 and then add 32 to get the correct result.

This was really impressive considering that we only needed a few lines of code:

l0 = tf.keras.layers.Dense(units=1, input_shape=[1])
model = tf.keras.Sequential([l0])
model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.1))
history = model.fit(celsius_q, fahrenheit_a, epochs=500, verbose=False)
model.predict([100.0])

This example is the general plan for any machine learning program. You will use
the same structure to create and train your neural network, and use it to make
predictions.

The Training Process
--------------------

The training process (happening in model.fit(...)) is really about tuning the
internal variables of the network to the best possible values, so that they can
map the input to the output. This is achieved through an optimization process
called Gradient Descent, which uses numerical analysis to find the best possible
values for the internal variables of the model.

To do machine learning, you don't really need to understand these details. But for
the curious: gradient descent iteratively adjusts parameters, nudging them in the
correct direction a bit at a time until they reach the best values. In this case
"best values" means that nudging them any more would make the model perform worse.
The function that measures how good or bad the model is during each iteration is
called the "loss function", and the goal of each nudge is to "minimize the loss
function."

The training process starts with a forward pass, where the input data is fed to the
neural network (see Courses\Udacity-IntroToMachineLearning\Fig.1). Then the model
applies its internal math on the input and internal variables to predict an answer
("Model Predicts a Value" in Fig. 1).

In our example, the input was the degrees in Celsius, and the model predicted the
corresponding degrees in Fahrenheit.

Once a value is predicted, the difference between that predicted value and the
correct value is calculated. This difference is called the loss, and it's a measure
of how well the model performed the mapping task. The value of the loss is
calculated using a loss function, which we specified with the loss parameter when
calling model.compile().

After the loss is calculated, the internal variables (weights and biases) of all
the layers of the neural network are adjusted, so as to minimize this loss � that
is, to make the output value closer to the correct value (see Courses\Udacity-
IntroToMachineLearning\Fig2_Backpropagation.png).

This optimization process is called "Gradient Descent". The specific algorithm used
to calculate the new value of each internal variable is specified by the optimizer
parameter when calling model.compile(...). In this example we used the Adam
optimizer.
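
To make this cycle concrete, here is a minimal sketch of gradient descent on the
same Celsius-to-Fahrenheit problem in plain NumPy. This is an illustration, not
what Keras does internally; the learning rate and epoch count are assumptions:

import numpy as np

w, b = 0.0, 0.0        # internal variables, starting from arbitrary values
learning_rate = 0.001  # step size, kept small because inputs are not normalized

for epoch in range(10000):
    # Forward pass: predict Fahrenheit from Celsius with the current variables.
    pred = w * celsius_q + b
    # Loss direction: difference between predictions and the correct values.
    error = pred - fahrenheit_a
    # Backward pass: gradients of the mean squared error with respect to w and b.
    grad_w = np.mean(2 * error * celsius_q)
    grad_b = np.mean(2 * error)
    # Nudge each variable a small step against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)  # should approach 1.8 and 32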

By now you should know what the following terms are:

Feature: The input(s) to our model
Examples: An input/output pair used for training
Labels: The output of the model
Layer: A collection of nodes connected together within a neural network.
Model: The representation of your neural network
Dense and Fully Connected (FC): Each node in one layer is connected to each node in
the previous layer.
Weights and biases: The internal variables of model
Loss: The discrepancy between the desired output and the actual output
MSE: Mean squared error, a type of loss function that counts a small number of
large discrepancies as worse than a large number of small ones.
Gradient Descent: An algorithm that changes the internal variables a bit at a time
to gradually reduce the loss function.
Optimizer: A specific implementation of the gradient descent algorithm. (There are
many algorithms for this. In this course we will only use the "Adam" optimizer,
which stands for Adaptive Moment Estimation. It is considered a best-practice
optimizer.)
Learning rate: The "step size" for loss improvement during gradient descent.
Batch: The set of examples used during training of the neural network
Epoch: A full pass over the entire training dataset
Forward pass: The computation of output values from input
Backward pass (backpropagation): The calculation of internal variable adjustments
according to the optimizer algorithm, starting from the output layer and working
back through each layer to the input.

-----------------------------------------------------------------------------------
The Rectified Linear Unit (ReLU)
--------------------------------
In this lesson we talked about ReLU and how it gives our Dense layer more power.
ReLU stands for Rectified Linear Unit and it is a mathematical function.

As we can see, the ReLU function gives an output of 0 if the input is negative or
zero, and if the input is positive, the output equals the input.
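
In code, ReLU is just a one-line function. A minimal NumPy sketch:

import numpy as np

def relu(x):
    # 0 for negative (or zero) inputs, the input itself otherwise.
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]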

ReLU gives the network the ability to solve nonlinear problems.

Converting Celsius to Fahrenheit is a linear problem because f = 1.8*c + 32 has the
same form as the equation for a line, y = m*x + b. But most problems we want to
solve are nonlinear. In these cases, adding ReLU to our Dense layers can help solve
the problem.

ReLU is a type of activation function. There are several of these functions (ReLU,
Sigmoid, tanh, ELU), but ReLU is used most commonly and serves as a good default.
To build and use models that include ReLU, you don't have to understand its
internals.

Let's review some of the new terms that were introduced in this lesson:

Flattening: The process of converting a 2D image into a 1D vector.
ReLU: An activation function that allows a model to solve nonlinear problems.
Softmax: A function that provides probabilities for each possible output class.
Classification: A machine learning model used for distinguishing among two or more
output categories.
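
As a small illustration of softmax (a sketch, not course code), turning raw scores
into probabilities that sum to 1:

import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability, then normalize the exponentials.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # roughly [0.66 0.24 0.10]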

-----------------------------------------------------------------------------------
Regression vs. classification
-----------------------------
(https://developers.google.com/machine-learning/crash-course/framing/ml-terminology)

A regression model predicts continuous values. For example, regression models make
predictions that answer questions like the following:

What is the value of a house in California?

What is the probability that a user will click on this ad?

A classification model predicts discrete values. For example, classification models
make predictions that answer questions like the following:

Is a given email message spam or not spam?

Is this an image of a dog, a cat, or a hamster?

Descending into ML: Linear Regression
------------------------------------
(https://developers.google.com/machine-learning/crash-course/descending-into-ml/linear-regression)

It has long been known that crickets (an insect species) chirp more frequently on
hotter days than on cooler days. For decades, professional and amateur scientists
have cataloged data on chirps-per-minute and temperature. As a birthday gift, your
Aunt Ruth gives you her cricket database and asks you to learn a model that
predicts temperature from chirps. Using this data, you want to explore this
relationship.

First, examine your data by plotting it:
(Courses\Udacity-IntroToMachineLearning\CricketChirpsPerMinute.svg)

As expected, the plot shows the temperature rising with the number of chirps. Is
this relationship between chirps and temperature linear? Yes, you could draw a
single straight line like the following to approximate this relationship:
(Courses\Udacity-IntroToMachineLearning\LinearRelationship.svg)

True, the line doesn't pass through every dot, but the line does clearly show the
relationship between chirps and temperature. Using the equation for a line, you
could write down this relationship as follows:

y = mx + b

where:

y is the temperature in Celsius, the value we're trying to predict.
m is the slope of the line.
x is the number of chirps per minute, the value of our input feature.
b is the y-intercept.

By convention in machine learning, you'll write the equation for a model slightly
differently:

y' = b + w1*x1

where:

y' is the predicted label (a desired output).
b is the bias (the y-intercept), sometimes referred to as w0.
w1 is the weight of feature 1. Weight is the same concept as the "slope" m in the
traditional equation of a line.
x1 is a feature (a known input).

To infer (predict) the temperature y' for a new chirps-per-minute value x1, just
substitute the x1 value into this model.

Although this model uses only one feature, a more sophisticated model might rely on
multiple features, each having a separate weight (w1, w2, etc.). For example, a
model that relies on three features might look as follows:

y' = b + w1*x1 + w2*x2 + w3*x3
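
As a tiny sketch of this inference step (the weight and bias below are made-up
numbers for illustration, not fitted values):

def predict_temperature(chirps_per_minute, w1=0.3, b=2.0):
    # y' = b + w1*x1, with hypothetical values for w1 and b.
    return b + w1 * chirps_per_minute

print(predict_temperature(30))  # 11.0 with the assumed weight and bias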


------------------------------------------------------------------------

Type of ML Algorithms
---------------------
(https://www.codementor.io/dhananjaykumar/ml-with-python-part-1-z02wtrpxb)
Supervised Learning:
-------------------

Supervised learning is when the model is trained on a labelled dataset, one that
contains both input and output parameters. Supervised machine learning includes two
major processes: classification and regression.

Classification is the process where incoming data is labeled based on past data
samples, and the algorithm is trained to recognize certain types of objects and
categorize them accordingly. Examples:

- Fraud Detection
- Email Spam Detection
- Image Classification

Regression is the process of identifying patterns and calculating predictions of
continuous outcomes; the system has to understand numbers, their values, and their
grouping. Examples:

- Weather Forecasting
- Risk Assessment
- Score Prediction

Popular Supervised Learning Algorithms:

- Linear Regression
- Logistic Regression
- Decision Tree
- Random Forest
- KNN (K Nearest Neighbor)
- Support Vector Machine
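
As a quick sketch of the supervised workflow with one of these algorithms (using
scikit-learn as an illustrative assumption; it is not covered in these notes),
fitting a linear regression on a toy labelled dataset:

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy labelled dataset: inputs X and known outputs y, where y = 2*x + 1.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)  # close to [2.] and 1.0
print(model.predict([[5.0]]))         # close to [11.]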

Unsupervised Learning:
---------------------

In the case of unsupervised machine learning algorithms, the desired results are
unknown and yet to be defined. Unsupervised learning problems are further grouped
into clustering and association problems.
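
A minimal clustering sketch (again using scikit-learn as an assumption), grouping
unlabelled points into two clusters with k-means:

import numpy as np
from sklearn.cluster import KMeans

# Unlabelled data: two loose groups of points, with no output parameters.
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
              [8.0, 8.0], [8.5, 7.8], [7.8, 8.3]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # e.g. [0 0 0 1 1 1] (cluster ids may swap)
print(kmeans.cluster_centers_)  # one center near each group of points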

-----------------------------------------------------------------------------------

How Google does Machine Learning - COURSERA (started on 17th Oct; need to finish
the first assignment asap)
---------------------------------------------------
Python for Data Science and AI (by IBM) (Coursera - started on 20th Nov)
---------------------------------------

Did you know? IBM Watson Studio lets you build and deploy an AI solution, using the
best of open source and IBM software and giving your team a single environment to
work in.

https://cloud.ibm.com/catalog/services/watson-studio?context=wdp&apps=all&cm_mmc=Email_Outbound-_-Developer_Ed%2BTech-_-WW_WW-_-Campaign-Cognitive%2BClass%2BLab%2Binfobox%2BWatsonStudio&cm_mmca1=000026UJ&cm_mmca2=10006555&cm_mmca3=M12345678
