{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Keras\n", "\n", "Keras is a high-level neural networks API, written in Python, developed with a focus on enabling fast experimentation. Keras offers a consistent and simple API, which minimizes the number of user actions required for common use cases, and provides clear and actionable feedback upon user error.\n", "\n", "Keras is capable of running on top of several deep learning backends such as TensorFlow, CNTK, or Theano. This capability allows Keras models to be portable across all these backends.\n", "\n", "Keras is one of the deep learning frameworks most widely used by researchers, and is now part of the official TensorFlow high-level API as tf.keras.\n", "\n", "Keras models can be trained on CPUs, Xeon Phi, Google TPUs and any GPU or OpenCL-enabled GPU-like device.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Building a Model with Keras\n", "\n", "The core data structure of Keras is the Model, which is basically a container of one or more Layers.\n", "\n", "There are two main types of models available in Keras: the Sequential model and the Model class, the latter used to create advanced models.\n", "\n", "The simplest type of model is the Sequential model, which is a linear stack of layers. Each layer is added to the model using the .add() method of the Sequential model object.\n", "\n", "The model needs to know what input shape it should expect. The first layer in a Sequential model (and only the first) needs to receive information about its input shape, specified through the input_shape argument. The following layers infer their input shape automatically from the output shape of the preceding layer." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "from tensorflow import keras\n", "\n", "from tensorflow.keras.models import Sequential\n", "from tensorflow.keras.layers import Dense\n", "\n", "model = Sequential()\n", "# Add a densely-connected layer with 32 units that expects 16 input features:\n", "model.add(Dense(32, input_shape=(16,)))\n", "# Add another layer with 16 units, each connected to the 32 outputs of the previous layer\n", "model.add(Dense(16))\n", "# Last layer with 8 units, each connected to the 16 outputs of the previous layer\n", "model.add(Dense(8, activation='softmax'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The activation argument specifies the activation function for the current layer. By default, no activation is applied.\n", "\n", "The softmax activation function is commonly used in the last layer of a model to turn its outputs into a probability distribution, for example to select the most probable item among a set of classes in a classification problem.\n", "\n", "Keras provides many types of layers and activation function implementations, which we are going to explore later in this course." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compile the model\n", "\n", "After the model is constructed, we have to configure its learning process by calling the compile method. 
The compile phase is required to configure the following elements of the model:\n", "- optimizer: this object specifies the optimization algorithm which adapts the weights of the layers during the training procedure;\n", "- loss: this object specifies the function to minimize during the optimization;\n", "- metrics: [optional] a list of metrics used to judge the performance of the model and to monitor the training." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Configure the model for mean-squared error regression.\n", "model.compile(optimizer='sgd',       # stochastic gradient descent\n", "              loss='mse',            # mean squared error\n", "              metrics=['accuracy'])  # an optional list of metrics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model Training Process\n", "\n", "Once the model is compiled, we can check its status using the summary method and get useful information on the model composition, layer connections and number of parameters." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "dense (Dense) (None, 32) 544 \n", "_________________________________________________________________\n", "dense_1 (Dense) (None, 16) 528 \n", "_________________________________________________________________\n", "dense_2 (Dense) (None, 8) 136 \n", "=================================================================\n", "Total params: 1,208\n", "Trainable params: 1,208\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "model.summary()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now it's time to train the model against a set of training data and to monitor the optimization process and its convergence using the reported loss and accuracy measures." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train on 1000 samples, validate on 100 samples\n", "Epoch 1/10\n", "1000/1000 [==============================] - 1s 737us/step - loss: 0.2284 - acc: 0.1210 - val_loss: 0.2143 - val_acc: 0.0900\n", "Epoch 2/10\n", "1000/1000 [==============================] - 0s 37us/step - loss: 0.2284 - acc: 0.1200 - val_loss: 0.2143 - val_acc: 0.0900\n", "Epoch 3/10\n", "1000/1000 [==============================] - 0s 39us/step - loss: 0.2284 - acc: 0.1190 - val_loss: 0.2143 - val_acc: 0.0900\n", "Epoch 4/10\n", "1000/1000 [==============================] - 0s 39us/step - loss: 0.2284 - acc: 0.1180 - val_loss: 0.2142 - val_acc: 0.0900\n", "Epoch 5/10\n", "1000/1000 [==============================] - 0s 43us/step - loss: 0.2283 - acc: 0.1180 - val_loss: 0.2142 - val_acc: 0.0900\n", "Epoch 6/10\n", "1000/1000 [==============================] - 0s 37us/step - loss: 0.2283 - acc: 0.1190 - val_loss: 0.2142 - val_acc: 0.0900\n", "Epoch 7/10\n", "1000/1000 [==============================] - 0s 47us/step - loss: 0.2283 - acc: 0.1200 - val_loss: 0.2142 - val_acc: 0.0900\n", "Epoch 8/10\n", "1000/1000 [==============================] - 0s 35us/step - loss: 0.2283 - acc: 0.1200 - val_loss: 0.2141 - val_acc: 0.0900\n", "Epoch 9/10\n", "1000/1000 [==============================] - 0s 37us/step - loss: 0.2283 - acc: 0.1200 - val_loss: 0.2141 - val_acc: 0.0900\n", "Epoch 10/10\n", "1000/1000 [==============================] - 0s 45us/step - loss: 0.2283 - acc: 0.1200 - val_loss: 0.2141 - val_acc: 0.0800\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "\n", "# generate synthetic training dataset\n", "x_train = np.random.random((1000, 16))\n", "y_train = np.random.random((1000, 8))\n", "\n", "# generate synthetic validation data\n", "x_valid = 
np.random.random((100, 16))\n", "y_valid = np.random.random((100, 8))\n", "\n", "# fit the model using the training dataset\n", "# for 10 epochs with a batch size of 32\n", "# report training progress against validation data\n", "model.fit(x=x_train, y=y_train,\n", "          batch_size=32, epochs=10,\n", "          validation_data=(x_valid, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The .fit method takes four important arguments:\n", "- x, y: the training inputs and the corresponding target values.\n", "- batch_size: the model slices the data into smaller batches and iterates over these batches during training. This integer specifies the size of each batch.\n", "- epochs: an epoch is one iteration over the entire input data (done in smaller batches).\n", "- validation_data: [optional] validation data against which to compute the loss and metrics in inference mode at the end of each epoch.\n", "\n", "A trained model contains fitted weights for each layer. We can inspect the weights of each layer using the get_weights method, which for a Dense layer returns a list of two arrays: the first holds the weights applied to the layer's inputs, the second holds the layer's bias." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "layer nodes weights: (16, 32)\n", "layer bias weights: (32,)\n", "layer nodes weights: (32, 16)\n", "layer bias weights: (16,)\n", "layer nodes weights: (16, 8)\n", "layer bias weights: (8,)\n" ] } ], "source": [ "for layer in model.layers:\n", "    w = layer.get_weights()\n", "    print(\"layer nodes weights: \", w[0].shape)\n", "    print(\"layer bias weights: \", w[1].shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model Evaluation and Prediction\n", "Once the training process has completed, you can evaluate the model on a separate test dataset. 
The evaluate method returns the loss value and, if the model was compiled with a metrics argument, the metric values for the model in test mode.\n", "\n", "When evaluating a model, the samples in a batch are processed independently, in parallel, so the larger the batch, the sooner the evaluation task will generally complete, as long as the batch fits in memory." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100/100 [==============================] - 0s 42us/step\n" ] }, { "data": { "text/plain": [ "[0.2140801239013672, 0.08]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.evaluate(x_valid, y_valid, batch_size=32)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The predict method generates output predictions from an input dataset provided to the model.\n", "\n", "When running a prediction, the samples in a batch are processed independently, in parallel, so the larger the batch, the sooner the prediction task will generally complete, as long as the batch fits in memory." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Input dataset shape: (100, 16)\n", "Predicted results shape: (100, 8)\n" ] } ], "source": [ "print(\"Input dataset shape: \", x_valid.shape)\n", "# this model maps a 16-dimensional input to an 8-dimensional output\n", "y_predicted = model.predict(x_valid, batch_size=128)\n", "print(\"Predicted results shape: \", y_predicted.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save and Restore a Model\n", "A trained model can be saved to a file for later retrieval. This allows you to checkpoint a model and resume training later without rebuilding and retraining it from scratch.\n", "\n", "Files are saved in HDF5 format, including all weight values, the model's configuration and even the optimizer's configuration." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "import os\n", "save_model_path = 'saved/intro_model'\n", "os.makedirs('saved', exist_ok=True)  # make sure the target directory exists\n", "model.save(filepath=save_model_path, include_optimizer=True)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "model = tf.keras.models.load_model(filepath=save_model_path)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }