What is Microsoft Cognitive Toolkit?
Microsoft Cognitive Toolkit (CNTK), originally known as Computational Network Toolkit, is a free, user-friendly, open-source, commercial-grade toolkit that allows us to train deep learning algorithms to learn like the human brain. It enables us to build well-known deep learning systems such as feed-forward neural networks for time-series prediction and convolutional neural network (CNN) image classifiers.
Features
- CNTK includes highly optimized built-in components capable of handling multidimensional dense or sparse data from Python, C++, or BrainScript (see the sketch after this list).
- We can use CNN, FNN, RNN, batch normalization, and sequence-to-sequence models with attention.
- It allows us to add new user-defined core components on the GPU from Python.
- It also supports automatic hyperparameter tuning.
- We can use reinforcement learning, generative adversarial networks (GANs), and supervised and unsupervised learning.
- For large datasets, CNTK has optimized built-in readers.
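To make the component-based style concrete, here is a minimal sketch of composing built-in layers into a small feed-forward classifier with the CNTK 2.x Python API. The layer sizes and dimensions are illustrative assumptions, not taken from this article:

```python
import cntk as C

# Illustrative dimensions (assumed): 784 input features, 10 classes.
features = C.input_variable(784)
labels = C.input_variable(10)

# Compose built-in layer components into a feed-forward network.
model = C.layers.Sequential([
    C.layers.Dense(200, activation=C.relu),
    C.layers.Dense(10)  # raw scores; softmax is applied inside the loss
])(features)

loss = C.cross_entropy_with_softmax(model, labels)
```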
Advantages of Microsoft Cognitive Toolkit
- It can train deep learning models with state-of-the-art accuracy.
- It features a very robust C++ API as well as Python APIs at both high and low levels, built around a functional programming paradigm.
- It scales across thousands of GPUs without difficulty.
- Its support for C#/.NET and Java inference makes it simple to embed CNTK evaluation into user programs.
- It is easy to extend from Python with custom learners and layers (a sketch follows this list).
- Its effective built-in data readers also support distributed learning.
- Since Microsoft’s internal product groups use the same toolkit, there is no compromise on quality.
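As one example of that extensibility, Python functions built from CNTK primitives can be turned into reusable pieces with the @Function decorator and plugged into built-in layers. This is a sketch only; the activation function below is an illustrative assumption, not part of the toolkit:

```python
import cntk as C

# A hypothetical user-defined activation composed from CNTK primitives.
@C.Function
def swish_like(x):
    return C.element_times(x, C.sigmoid(x))

# Plug the custom function into a built-in layer as its activation.
layer = C.layers.Dense(50, activation=swish_like)
```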
Tutorial for working with Microsoft Cognitive Toolkit:
- Architecture of Microsoft Cognitive Toolkit
- Microsoft Cognitive Toolkit – Sequence Classification
Tensors
The core concept CNTK operates on is the tensor. CNTK inputs, outputs, and parameters are organized as tensors, which can be thought of as generalized matrices. Every tensor has a rank (illustrated in the sketch after this list):
- Tensor of rank 0 is a scalar.
- Tensor of rank 1 is a vector.
- Tensor of rank 2 is a matrix.
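A minimal sketch of these ranks using CNTK constants; the values themselves are illustrative:

```python
import numpy as np
import cntk as C

scalar = C.constant(value=3.0)                              # rank 0, shape ()
vector = C.constant(value=np.arange(4.0, dtype=np.float32)) # rank 1, shape (4,)
matrix = C.constant(value=np.ones((2, 3)))                  # rank 2, shape (2, 3)

print(scalar.shape, vector.shape, matrix.shape)             # () (4,) (2, 3)
```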
Static and Dynamic Axes
Static axes, as the name suggests, keep the same length for the entire life of the network. Dynamic axes, by contrast, can vary in length from instance to instance; in practice, the sequence lengths in a minibatch are typically not known until the minibatch is presented. Dynamic axes are like static axes in that they also define a meaningful grouping of the tensor’s numbers.
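A small sketch of the difference; the feature dimension of 2000 is an assumed example, chosen to match the classifier later in this article:

```python
import cntk as C

# Static axis: the feature dimension is fixed at 2000 for the whole network.
static_var = C.input_variable(shape=2000)

# Dynamic axis: the sequence length can differ per sample, so it is not
# part of the shape; CNTK tracks it as a dynamic axis instead.
seq_var = C.sequence.input_variable(shape=2000)

print(static_var.dynamic_axes)  # the batch axis only
print(seq_var.dynamic_axes)     # the batch axis plus a sequence axis
```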
Using sequences in the Microsoft Cognitive Toolkit
Long Short-Term Memory (LSTM) Networks
Long short-term memory (LSTM) networks were introduced by Hochreiter & Schmidhuber to solve the problem of getting a basic recurrent layer to retain information over long time spans. An LSTM architecture consists of input neurons, memory cells, and output neurons. To counteract the vanishing gradient problem, it uses an explicit memory cell (which stores previous values) and the following gates (the standard gate equations are sketched after this list):
- Forget gate: instructs the memory cell to erase its previous values. The cell retains those values until the forget gate tells it to discard them.
- Input gate: adds new information to the cell.
- Output gate: decides when to pass the vectors from the cell on to the next hidden state.
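For reference, the standard LSTM gate equations in the usual notation (the notation is an assumption here, not something this article defines):

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```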
Working with sequences in CNTK is quite simple, as the following example shows:
```python
import sys
import os
from cntk import Trainer, Axis
from cntk.io import MinibatchSource, CTFDeserializer, StreamDef, StreamDefs, \
    INFINITELY_REPEAT
from cntk.learners import sgd, learning_parameter_schedule_per_sample
from cntk import input_variable, cross_entropy_with_softmax, \
    classification_error, sequence
from cntk.logging import ProgressPrinter
from cntk.layers import Sequential, Embedding, Recurrence, LSTM, Dense

# Build a minibatch source that deserializes CTF-format text data into
# sparse feature sequences and dense one-hot labels.
def create_reader(path, is_training, input_dim, label_dim):
    return MinibatchSource(CTFDeserializer(path, StreamDefs(
        features=StreamDef(field='x', shape=input_dim, is_sparse=True),
        labels=StreamDef(field='y', shape=label_dim, is_sparse=False)
    )), randomize=is_training,
        max_sweeps=INFINITELY_REPEAT if is_training else 1)

# Define the model: embed each token, run the embeddings through an LSTM,
# keep only the last hidden state of the sequence, and classify it.
def LSTM_sequence_classifier_net(input, num_output_classes, embedding_dim,
                                 LSTM_dim, cell_dim):
    lstm_classifier = Sequential([Embedding(embedding_dim),
                                  Recurrence(LSTM(LSTM_dim, cell_dim)),
                                  sequence.last,
                                  Dense(num_output_classes)])
    return lstm_classifier(input)

def train_sequence_classifier():
    input_dim = 2000
    cell_dim = 25
    hidden_dim = 25
    embedding_dim = 50
    num_output_classes = 5

    # Input variables: features carry a dynamic sequence axis, labels do not.
    features = sequence.input_variable(shape=input_dim, is_sparse=True)
    label = input_variable(num_output_classes)

    classifier_output = LSTM_sequence_classifier_net(
        features, num_output_classes, embedding_dim, hidden_dim, cell_dim)

    # Loss and evaluation metric.
    ce = cross_entropy_with_softmax(classifier_output, label)
    pe = classification_error(classifier_output, label)

    rel_path = ("../../../Tests/EndToEndTests/Text/" +
                "SequenceClassification/Data/Train.ctf")
    path = os.path.join(os.path.dirname(os.path.abspath(__file__)), rel_path)

    reader = create_reader(path, True, input_dim, num_output_classes)

    # Map the reader's streams to the network's input variables.
    input_map = {
        features: reader.streams.features,
        label: reader.streams.labels
    }

    lr_per_sample = learning_parameter_schedule_per_sample(0.0005)
    progress_printer = ProgressPrinter(0)

    # Train with plain SGD, printing progress as we go.
    trainer = Trainer(classifier_output, (ce, pe),
                      sgd(classifier_output.parameters, lr=lr_per_sample),
                      progress_printer)

    minibatch_size = 200
    for i in range(255):
        mb = reader.next_minibatch(minibatch_size, input_map=input_map)
        trainer.train_minibatch(mb)

    evaluation_average = float(trainer.previous_minibatch_evaluation_average)
    loss_average = float(trainer.previous_minibatch_loss_average)
    return evaluation_average, loss_average

if __name__ == '__main__':
    error, _ = train_sequence_classifier()
    print(" error: %f" % error)
```
Output:
```
 average      since    average      since      examples
    loss       last     metric       last
 ------------------------------------------------------
     1.61       1.61      0.886      0.886         44
     1.61        1.6      0.714      0.629        133
      1.6       1.59       0.56      0.448        316
     1.57       1.55      0.479       0.41        682
     1.53        1.5      0.464      0.449       1379
     1.46        1.4      0.453      0.441       2813
     1.37       1.28       0.45      0.447       5679
      1.3       1.23      0.448      0.447      11365
 error: 0.333333
```