rohit apte

I subscribe to multiple publications including Wall Street Journal, Financial Times, CNN, MSNBC, etc. The problem with having so much information at your fingertips is that I usually go to the front page of each website and follow a few links. I almost always miss relevant articles because the links are usually not easy to follow. I wanted to aggregate information so that articles on related topics were grouped together. For that I had to

Scan and download all new articles on each site.
Cluster the articles automatically based on keywords.

I wrote a webcrawler in Python that indexes each site and pulls in new articles I have not downloaded before. There is plenty of literature out there on scraping websites. The book I found useful was Web Scraping with Python (sold on Amazon).

Once we have the articles we preprocess the data by

Converting everything to lowercase
Add part of speech tagging for each word. This classifies the word as a Noun, Verb, Adverb, etc.
Lemmatize the word, i.e. group together inflected forms of a word into a single item. For example

am, are, is ⇒ be

having, have, had ⇒have

car, cars ⇒car

Run TfIdf on the lemmatized documents. TfIdf stands for Term Frequency, Inverse Document Frequency. In a nutshell, a word is more important (and has higher weight) if it occurs more often in a document (the Term Frequency part), except if it occurs commonly in other documents in our corpus (the Inverse Document Frequency part).
TfIdf gives us a matrix of important words. We can then cluster the words using a common clustering algorithm like KMeans.

Here are what some of the clusters look like for all the WSJ articles from 30th April 2017 (along with the first 10 article titles in each cluster).

states, plan, house, cost, governments

Fugitive Mexican Ex-Governor Tomás Yarrington Had State Security While on the Run
Australia Considers Cross-Continent Pipeline to Beat Gas Shortages
GOP Health-Care Push Falls Short Again
America’s Most Anti-Reform Institution? The Media
Fugitive Mexican Governor Arrested in Guatemala
Pentagon Investigates Whether Army Rangers in Afghanistan Were Killed by Friendly Fire
New Plan, Same Hurdle in GOP’s Quest to Gut Obamacare
The Resurgent Threat of al Qaeda
Saudi Arabia Reinstates Perks for State Employees as Finances Improve
Trump Unveils Broad Tax-Cut Plan

growth, economy, economic, quarter, rose

Mexican Economy Maintains Growth in First Quarter
Outlook for Kenyan Economy Dimmed by Severe Drought
Economic Growth Lags Behind Rising Confidence Data
Economists See Growth Climbing in 2017, 2018, Then Dissipating
Economy Needs Consumers to Shop Again
U.K. Economy Slows Sharply Ahead of Election, Brexit Talks
From Diapers to Soda, Big Brands Feel Pinch as Consumers Pull Back
Stars Align for Emmanuel Macron—and France
Consumer Sentiment Remains High Despite GDP Report
South Korea’s Economy Grew 0.9% in First Quarter

trump, u.s., officials, president, administration

Pentagon Opens Probe Into Michael Flynn’s Foreign Payments
Two U.S. Service Members Killed in Afghanistan
Immigrant Crackdown Worries Food and Construction Industries
U.S. Launches Cruise Missiles at Syrian Air Base in Response to Chemical Attack
At NRA Meeting, Trump Warns of Challengers in 2020
Trump Backers in Phoenix Region Are Fine With His Learning Curve
Relative of Imprisoned Iranian-Americans Appeals to Trump for Help
Trump Issues New Warning to North Korea
U.S. Presses China on North Korea After Failed Missile Test
Trump’s Bipartisan War Coalition

share, billion, quarter, sales, company

What’s Keeping GM Going Strong Probably Won’t Last
Cardinal Health’s $6.1 Billion Deal for Some Medtronic Operations Raises Debt Concerns
UnitedHealth Profits Rise as it Exits Health-Care Exchanges
Lockheed Martin Hit By Middle East Charges
Johnson & Johnson Lifts Forecast on Actelion Tie-Up
Alphabet and Amazon Extend an Earnings Boom
How an ETF Gold Rush Rattled Mining Stocks
Ski-Park Operator Intrawest to Go Private in Latest Resort Deal
S&P’s Warning: Here Are 10 Public Retailers Most in Danger of Default
Advertising’s Biggest Threat Isn’t Digital Disruption

u.s., trade, company, trump, billion

Mexico Registers Small Trade Deficit in March
Coming to America: How Immigration Policy Has Changed the U.S.
Boeing Files Petition With Commerce Dept. Over Bombardier
Trump Administration Mulls More Trade Actions, Commerce Secretary Says
U.S. Hoteliers Go on Charm Offensive Amid Concerns Over Trump Policies
Today’s Top Supply Chain and Logistics News From WSJ
Dear Canada: It’s Not Personal, It’s Just Trade
Today’s Top Supply Chain and Logistics News From WSJ
Venezuela Creditor Seeks Asset Freeze on U.S. Refiner Citgo
Seoul Plays Down Possibility of Pre-Emptive U.S. Strike on North Korea

CIFAR-10 is an established computer vision dataset used for image recognition. Its a subset of 80 million tiny images collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. This is the link to the website.

The CIFAR-10 dataset consists of 60,000 32×32 color images of 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.

As a first post, I wanted to write a deep learning algorithm to identify images in the CIFAR-10 database. This topic has been very widely covered – in fact Google’s Tensorflow tutorials cover this very well. However, they do a few things that made it difficult for me to follow their code

They split the code across multiple files making it difficult to follow.
They use a binary version of the file and a file stream to feed Tensorflow.

I downloaded the python version of the data and loaded all the variables into memory. There is some image manipulation done in the tensorflow tutorial that I recreated in the numpy arrays directly and we will discuss it below.

Prerequisites for this tutorial:
Other than Python (obviously!)

numpy
pickle
sklearn
tensorflow

For TensorFlow I strongly recommend the GPU version if you have the set-up for it. The code takes 6 hours on my dual GTX Titan X machine and running it on a CPU will probably take days or weeks!
Assuming you have everything working, lets get started!

Start with our import statements

import numpy as np
import pickle
from sklearn import preprocessing
import random
import math
import os
from six.moves import urllib
import tarfile
import tensorflow as tf

import numpy as np

import pickle

from sklearn import preprocessing

import random

import math

import os

from six.moves import urllib

import tarfile

import tensorflow as tf

Declare some global variables we will use. In our code we are using GradientDescentOptimizer with learning rate decay. I have tested the same code with the AdamOptimizer. Adam runs faster but gives slightly worse results. If you do decide to use the AdamOptimizer, drop the learning rate to 0.0001. This is the link to the paper on Adam optimization.

NUM_FILE_BATCHES=5
LEARNING_RATE = 0.1
LEARNING_RATE_DECAY=0.1
NUM_GENS_TO_WAIT=250.0
TRAINING_ITERATIONS = 30000
DROPOUT = 0.5
BATCH_SIZE = 500
IMAGE_TO_DISPLAY = 10

NUM_FILE_BATCHES=5

LEARNING_RATE = 0.1

LEARNING_RATE_DECAY=0.1

NUM_GENS_TO_WAIT=250.0

TRAINING_ITERATIONS = 30000

DROPOUT = 0.5

BATCH_SIZE = 500

IMAGE_TO_DISPLAY = 10

Create data directory and download data if it doesn’t exist – this code will not run if we have already downloaded the data.

data_dir='data'
if not os.path.exists(data_dir):
  os.makedirs(data_dir)
cifar10_url='https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'

data_file=os.path.join(data_dir, 'cifar-10-binary.tar.gz')
if os.path.isfile(data_file):
  pass
else:
  def progress(block_num, block_size, total_size):
    progress_info = [cifar10_url, float(block_num * block_size) / float(total_size) * 100.0]
    print('\r Downloading {} - {:.2f}%'.format(*progress_info), end="")
  filepath, _ = urllib.request.urlretrieve(cifar10_url, data_file, progress)
  tarfile.open(filepath, 'r:gz').extractall(data_dir)

data_dir='data'

if not os.path.exists(data_dir):

os.makedirs(data_dir)

cifar10_url='https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'

data_file=os.path.join(data_dir, 'cifar-10-binary.tar.gz')

if os.path.isfile(data_file):

pass

else:

def progress(block_num, block_size, total_size):

progress_info = [cifar10_url, float(block_num * block_size) / float(total_size) * 100.0]

print('\r Downloading {} - {:.2f}%'.format(*progress_info), end="")

filepath, _ = urllib.request.urlretrieve(cifar10_url, data_file, progress)

tarfile.open(filepath, 'r:gz').extractall(data_dir)

Load data into numpy arrays. The code below loads the labels from the batches.meta file, and the training and test data. The training data is split across 5 files. We also one hot encode the labels.

with open('data/cifar-10-batches-py/batches.meta',mode='rb') as file:
  batch=pickle.load(file,encoding='latin1')
label_names=batch['label_names']


def load_cifar10data(filename):
  with open(filename,mode='rb') as file:
    batch=pickle.load(file,encoding='latin1')
    features=batch['data'].reshape((len(batch['data']),3,32,32)).transpose(0,2,3,1)
    labels=batch['labels']
    return features,labels

x_train=np.zeros(shape=(0,32,32,3))
train_labels=[]
for i in range(1,NUM_FILE_BATCHES+1):
  ft,lb=load_cifar10data('data/cifar-10-batches-py/data_batch_'+str(i))
  x_train=np.vstack((x_train,ft))
  train_labels.extend(lb)


unique_labels=list(set(train_labels))
lb=preprocessing.LabelBinarizer()
lb.fit(unique_labels)
y_train=lb.transform(train_labels)

x_test_data,test_labels=load_cifar10data('data/cifar-10-batches-py/test_batch')
y_test=lb.transform(test_labels)

with open('data/cifar-10-batches-py/batches.meta',mode='rb') as file:

batch=pickle.load(file,encoding='latin1')

label_names=batch['label_names']

def load_cifar10data(filename):

with open(filename,mode='rb') as file:

batch=pickle.load(file,encoding='latin1')

features=batch['data'].reshape((len(batch['data']),3,32,32)).transpose(0,2,3,1)

labels=batch['labels']

return features,labels

x_train=np.zeros(shape=(0,32,32,3))

train_labels=[]

for i in range(1,NUM_FILE_BATCHES+1):

ft,lb=load_cifar10data('data/cifar-10-batches-py/data_batch_'+str(i))

x_train=np.vstack((x_train,ft))

train_labels.extend(lb)

unique_labels=list(set(train_labels))

lb=preprocessing.LabelBinarizer()

lb.fit(unique_labels)

y_train=lb.transform(train_labels)

x_test_data,test_labels=load_cifar10data('data/cifar-10-batches-py/test_batch')

y_test=lb.transform(test_labels)

Having more training data can improve our algorithms. Since we are confined to 50,000 training images (5,000 for each category) we can “manufacture” more images using small image manipulations. We do 3 transformations – flip the image horizontally, randomly adjust the brightness and randomly adjust the contrast. We also normalize the data. Note that there are different ways to do this, but standardization works best for image. However rescaling can be an option as well.

def updateImage(x_train_data,distort=True):
  x_temp=x_train_data.copy()
  x_output=np.zeros(shape=(0,32,32,3))
  for i in range(0,x_temp.shape[0]):
    temp=x_temp[i]
    if distort:
      if random.random()>0.5:
        temp=np.fliplr(temp)
      brightness=random.randint(-63,63)
      temp=temp+brightness
      contrast=random.uniform(0.2,1.8)
      temp=temp*contrast
    mean=np.mean(temp)
    stddev=np.std(temp)
    temp=(temp-mean)/stddev
    temp=np.expand_dims(temp,axis=0)
    x_output=np.append(x_output,temp,axis=0)
  return x_output
#update test data since we don't have to apply distortions
#for training data for each batch we will randomly distory before normalizing
x_test=updateImage(x_test_data,False)

def updateImage(x_train_data,distort=True):

x_temp=x_train_data.copy()

x_output=np.zeros(shape=(0,32,32,3))

for i in range(0,x_temp.shape[0]):

temp=x_temp[i]

if distort:

if random.random()>0.5:

temp=np.fliplr(temp)

brightness=random.randint(-63,63)

temp=temp+brightness

contrast=random.uniform(0.2,1.8)

temp=temp*contrast

mean=np.mean(temp)

stddev=np.std(temp)

temp=(temp-mean)/stddev

temp=np.expand_dims(temp,axis=0)

x_output=np.append(x_output,temp,axis=0)

return x_output

#update test data since we don't have to apply distortions

#for training data for each batch we will randomly distory before normalizing

x_test=updateImage(x_test_data,False)

Now comes the fun part. This is what our network looks like.

Lets define the various layers of the network. The last line of code (logits=tf.identity(final_output,name=’logits’)) is done in case you want to view the model in TensorBoard.

def truncated_normal_var(name, shape, dtype):
  return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.truncated_normal_initializer(stddev=0.05)))
def zero_var(name, shape, dtype):
  return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.constant_initializer(0.0)))

x=tf.placeholder(tf.float32,shape=[None,x_train.shape[1],x_train.shape[2],x_train.shape[3]],name='x')
labels=tf.placeholder(tf.float32,shape=[None,y_train.shape[1]],name='labels')
keep_prob=tf.placeholder(tf.float32,name='keep_prob')

with tf.variable_scope('conv1') as scope:
  conv1_kernel=truncated_normal_var(name='conv1_kernel',shape=[5,5,3,64],dtype=tf.float32)
  strides=[1,1,1,1]
  conv1=tf.nn.conv2d(x,conv1_kernel,strides,padding='SAME')
  conv1_bias=zero_var(name='conv1_bias',shape=[64],dtype=tf.float32)
  conv1_add_bias=tf.nn.bias_add(conv1,conv1_bias)
  relu_conv1=tf.nn.relu(conv1_add_bias)

pool_size=[1,3,3,1]
strides=[1,2,2,1]
pool1=tf.nn.max_pool(relu_conv1,ksize=pool_size,strides=strides,padding='SAME',name='pool_layer1')
norm1=tf.nn.lrn(pool1,depth_radius=4,bias=1.0,alpha=0.001/9.0,beta=0.75,name='norm1')

with tf.variable_scope('conv2') as scope:
  conv2_kernel=truncated_normal_var(name='conv2_kernel',shape=[5,5,64,64],dtype=tf.float32)
  strides=[1,1,1,1]
  conv2=tf.nn.conv2d(norm1,conv2_kernel,strides,padding='SAME')
  conv2_bias=zero_var(name='conv2_bias',shape=[64],dtype=tf.float32)
  conv2_add_bias=tf.nn.bias_add(conv2,conv2_bias)
  relu_conv2=tf.nn.relu(conv2_add_bias)

pool_size=[1,3,3,1]
strides=[1,2,2,1]
pool2=tf.nn.max_pool(relu_conv2,ksize=pool_size,strides=strides,padding='SAME',name='pool_layer2')
norm2=tf.nn.lrn(pool2,depth_radius=4,bias=1.0,alpha=0.001/9.0,beta=0.75,name='norm2')

reshaped_output=tf.reshape(norm2, [-1, 8*8*64])
reshaped_dim=reshaped_output.get_shape()[1].value

with tf.variable_scope('full1') as scope:
  full_weight1=truncated_normal_var(name='full_mult1',shape=[reshaped_dim,1024],dtype=tf.float32)
  full_bias1=zero_var(name='full_bias1',shape=[1024],dtype=tf.float32)
  full_layer1=tf.nn.relu(tf.add(tf.matmul(reshaped_output,full_weight1),full_bias1))
  full_layer1=tf.nn.dropout(full_layer1,keep_prob)

with tf.variable_scope('full2') as scope:
  full_weight2=truncated_normal_var(name='full_mult2',shape=[1024, 256],dtype=tf.float32)
  full_bias2=zero_var(name='full_bias2',shape=[256],dtype=tf.float32)
  full_layer2=tf.nn.relu(tf.add(tf.matmul(full_layer1,full_weight2),full_bias2))
  full_layer2=tf.nn.dropout(full_layer2,keep_prob)

with tf.variable_scope('full3') as scope:
  full_weight3=truncated_normal_var(name='full_mult3',shape=[256,IMAGE_TO_DISPLAY],dtype=tf.float32)
  full_bias3=zero_var(name='full_bias3',shape=[IMAGE_TO_DISPLAY],dtype=tf.float32)
  final_output=tf.add(tf.matmul(full_layer2,full_weight3),full_bias3,name='final_output')

logits=tf.identity(final_output,name='logits')

def truncated_normal_var(name, shape, dtype):

return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.truncated_normal_initializer(stddev=0.05)))

def zero_var(name, shape, dtype):

return(tf.get_variable(name=name, shape=shape, dtype=dtype, initializer=tf.constant_initializer(0.0)))

x=tf.placeholder(tf.float32,shape=[None,x_train.shape[1],x_train.shape[2],x_train.shape[3]],name='x')

labels=tf.placeholder(tf.float32,shape=[None,y_train.shape[1]],name='labels')

keep_prob=tf.placeholder(tf.float32,name='keep_prob')

with tf.variable_scope('conv1') as scope:

conv1_kernel=truncated_normal_var(name='conv1_kernel',shape=[5,5,3,64],dtype=tf.float32)

strides=[1,1,1,1]

conv1=tf.nn.conv2d(x,conv1_kernel,strides,padding='SAME')

conv1_bias=zero_var(name='conv1_bias',shape=[64],dtype=tf.float32)

conv1_add_bias=tf.nn.bias_add(conv1,conv1_bias)

relu_conv1=tf.nn.relu(conv1_add_bias)

pool_size=[1,3,3,1]

strides=[1,2,2,1]

pool1=tf.nn.max_pool(relu_conv1,ksize=pool_size,strides=strides,padding='SAME',name='pool_layer1')

norm1=tf.nn.lrn(pool1,depth_radius=4,bias=1.0,alpha=0.001/9.0,beta=0.75,name='norm1')

with tf.variable_scope('conv2') as scope:

conv2_kernel=truncated_normal_var(name='conv2_kernel',shape=[5,5,64,64],dtype=tf.float32)

strides=[1,1,1,1]

conv2=tf.nn.conv2d(norm1,conv2_kernel,strides,padding='SAME')

conv2_bias=zero_var(name='conv2_bias',shape=[64],dtype=tf.float32)

conv2_add_bias=tf.nn.bias_add(conv2,conv2_bias)

relu_conv2=tf.nn.relu(conv2_add_bias)

pool_size=[1,3,3,1]

strides=[1,2,2,1]

pool2=tf.nn.max_pool(relu_conv2,ksize=pool_size,strides=strides,padding='SAME',name='pool_layer2')

norm2=tf.nn.lrn(pool2,depth_radius=4,bias=1.0,alpha=0.001/9.0,beta=0.75,name='norm2')

reshaped_output=tf.reshape(norm2, [-1, 8*8*64])

reshaped_dim=reshaped_output.get_shape()[1].value

with tf.variable_scope('full1') as scope:

full_weight1=truncated_normal_var(name='full_mult1',shape=[reshaped_dim,1024],dtype=tf.float32)

full_bias1=zero_var(name='full_bias1',shape=[1024],dtype=tf.float32)

full_layer1=tf.nn.relu(tf.add(tf.matmul(reshaped_output,full_weight1),full_bias1))

full_layer1=tf.nn.dropout(full_layer1,keep_prob)

with tf.variable_scope('full2') as scope:

full_weight2=truncated_normal_var(name='full_mult2',shape=[1024, 256],dtype=tf.float32)

full_bias2=zero_var(name='full_bias2',shape=[256],dtype=tf.float32)

full_layer2=tf.nn.relu(tf.add(tf.matmul(full_layer1,full_weight2),full_bias2))

full_layer2=tf.nn.dropout(full_layer2,keep_prob)

with tf.variable_scope('full3') as scope:

full_weight3=truncated_normal_var(name='full_mult3',shape=[256,IMAGE_TO_DISPLAY],dtype=tf.float32)

full_bias3=zero_var(name='full_bias3',shape=[IMAGE_TO_DISPLAY],dtype=tf.float32)

final_output=tf.add(tf.matmul(full_layer2,full_weight3),full_bias3,name='final_output')

logits=tf.identity(final_output,name='logits')

Now we define our cross entropy and optimization function. If you want to use the AdamOptomizer, uncomment that line, comment the generation_run, model_learning_rate and train_step lines and adjust the learning rate to something lower like 0.0001. Otherwise the model will not converge.

cross_entropy=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits,labels=labels),name='cross_entropy')
#train_step=tf.train.AdamOptimizer(LEARNING_RATE).minimize(cross_entropy)
generation_run = tf.Variable(0, trainable=False,name='generation_run')
model_learning_rate=tf.train.exponential_decay(LEARNING_RATE,generation_run,NUM_GENS_TO_WAIT,LEARNING_RATE_DECAY,staircase=True,name='model_learning_rate')
train_step=tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)
correct_prediction=tf.equal(tf.argmax(final_output,1),tf.argmax(labels,1),name='correct_prediction')
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32),name='accuracy')

cross_entropy=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits,labels=labels),name='cross_entropy')

#train_step=tf.train.AdamOptimizer(LEARNING_RATE).minimize(cross_entropy)

generation_run = tf.Variable(0, trainable=False,name='generation_run')

model_learning_rate=tf.train.exponential_decay(LEARNING_RATE,generation_run,NUM_GENS_TO_WAIT,LEARNING_RATE_DECAY,staircase=True,name='model_learning_rate')

train_step=tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)

correct_prediction=tf.equal(tf.argmax(final_output,1),tf.argmax(labels,1),name='correct_prediction')

accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32),name='accuracy')

Now we define some functions to run through our batch. For large networks memory tends to be a big constraint. We run through our training data in batches. One epoch is one run through our complete training set (in multiple batches). After each epoch we randomly shuffle our data. This helps improve how our algorithm learns. We run through each batch of data and train our algorithm. We also check for accuracy every 1st, 2nd,…,10th, 20th,…, 100th,… step. Lastly we calculate the final accuracy of the model and save it so we can use the calculated weights on test data without having to re-run it.

epochs_completed=0
index_in_epoch = 0
num_examples=x_train.shape[0]

init=tf.global_variables_initializer()
with tf.Session() as sess:
  sess.run(init)

  def next_batch(batch_size):
    global x_train
    global y_train
    global index_in_epoch
    global epochs_completed
    start = index_in_epoch
    index_in_epoch += batch_size

    if index_in_epoch > num_examples:
      # finished epoch
      epochs_completed += 1
      # shuffle the data
      perm = np.arange(num_examples)
      np.random.shuffle(perm)
      x_train=x_train[perm]
      y_train=y_train[perm]
      # start next epoch
      start = 0
      index_in_epoch = batch_size
      assert batch_size <= num_examples end = index_in_epoch #return x_train[start:end], y_train[start:end] x_output=updateImage(x_train[start:end],True) return x_output,y_train[start:end] # visualisation variables train_accuracies = [] validation_accuracies = [] x_range = [] display_step=1 for i in range(TRAINING_ITERATIONS): #get new batch batch_xs, batch_ys = next_batch(BATCH_SIZE) # check progress on every 1st,2nd,...,10th,20th,...,100th... step if i%display_step == 0 or (i+1) == TRAINING_ITERATIONS: train_accuracy = accuracy.eval(feed_dict={x:batch_xs,labels: batch_ys,keep_prob: 1.0}) validation_accuracy=0.0 for j in range(0,x_test.shape[0]//BATCH_SIZE): validation_accuracy+=accuracy.eval(feed_dict={ x: x_test[j*BATCH_SIZE : (j+1)*BATCH_SIZE],labels: y_test[j*BATCH_SIZE : (j+1)*BATCH_SIZE],keep_prob: 1.0}) validation_accuracy/=(j+1.0) print('training_accuracy / validation_accuracy => %.2f / %.2f for step %d'%(train_accuracy, validation_accuracy, i))
      validation_accuracies.append(validation_accuracy)
      train_accuracies.append(train_accuracy)
      x_range.append(i)
      # increase display_step
      if i%(display_step*10) == 0 and i:
        display_step *= 10
    # train on batch
    sess.run(train_step, feed_dict={x: batch_xs, labels: batch_ys, keep_prob: DROPOUT})

  validation_accuracy=0.0
  for j in range(0,x_test.shape[0]//BATCH_SIZE):
    validation_accuracy+=accuracy.eval(feed_dict={ x: x_test[j*BATCH_SIZE : (j+1)*BATCH_SIZE],labels: y_test[j*BATCH_SIZE : (j+1)*BATCH_SIZE],keep_prob: 1.0})
  validation_accuracy/=(j+1.0)
  print('validation_accuracy => %.4f'%validation_accuracy)
  saver=tf.train.Saver()
  save_path=saver.save(sess,'./CIFAR10_model')
  sess.close()

epochs_completed=0

index_in_epoch = 0

num_examples=x_train.shape[0]

init=tf.global_variables_initializer()

with tf.Session() as sess:

sess.run(init)

def next_batch(batch_size):

global x_train

global y_train

global index_in_epoch

global epochs_completed

start = index_in_epoch

index_in_epoch += batch_size

if index_in_epoch > num_examples:

# finished epoch

epochs_completed += 1

# shuffle the data

perm = np.arange(num_examples)

np.random.shuffle(perm)

x_train=x_train[perm]

y_train=y_train[perm]

# start next epoch

start = 0

index_in_epoch = batch_size

assert batch_size <= num_examples end = index_in_epoch #return x_train[start:end], y_train[start:end] x_output=updateImage(x_train[start:end],True) return x_output,y_train[start:end] # visualisation variables train_accuracies = [] validation_accuracies = [] x_range = [] display_step=1 for i in range(TRAINING_ITERATIONS): #get new batch batch_xs, batch_ys = next_batch(BATCH_SIZE) # check progress on every 1st,2nd,...,10th,20th,...,100th... step if i%display_step == 0 or (i+1) == TRAINING_ITERATIONS: train_accuracy = accuracy.eval(feed_dict={x:batch_xs,labels: batch_ys,keep_prob: 1.0}) validation_accuracy=0.0 for j in range(0,x_test.shape[0]//BATCH_SIZE): validation_accuracy+=accuracy.eval(feed_dict={ x: x_test[j*BATCH_SIZE : (j+1)*BATCH_SIZE],labels: y_test[j*BATCH_SIZE : (j+1)*BATCH_SIZE],keep_prob: 1.0}) validation_accuracy/=(j+1.0) print('training_accuracy / validation_accuracy => %.2f / %.2f for step %d'%(train_accuracy, validation_accuracy, i))

validation_accuracies.append(validation_accuracy)

train_accuracies.append(train_accuracy)

x_range.append(i)

# increase display_step

if i%(display_step*10) == 0 and i:

display_step *= 10

# train on batch

sess.run(train_step, feed_dict={x: batch_xs, labels: batch_ys, keep_prob: DROPOUT})

validation_accuracy=0.0

for j in range(0,x_test.shape[0]//BATCH_SIZE):

validation_accuracy+=accuracy.eval(feed_dict={ x: x_test[j*BATCH_SIZE : (j+1)*BATCH_SIZE],labels: y_test[j*BATCH_SIZE : (j+1)*BATCH_SIZE],keep_prob: 1.0})

validation_accuracy/=(j+1.0)

print('validation_accuracy => %.4f'%validation_accuracy)

saver=tf.train.Saver()

save_path=saver.save(sess,'./CIFAR10_model')

sess.close()

The model gives around 81% accuracy on the test set. I have an iPython notebook on my GitHub site that lets you load the saved model and run it on random samples on the test set. It outputs the image vs the softmax probabilities of the top n predictions.

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Convolution	Matrix	Image 1	Image 2
Original Image
Average
Sharpen
Edge Detection 1
Edge Detection 2
Edge Detection 3
Edge Detection 4
Box Blur
Gaussian Blur
Gradient Detection 1
Gradient Detection 2

Experimenting with Javascript

Clustering News Articles

Convolutions

Definition

Image recognition on the CIFAR-10 dataset using deep learning