A guide to GluonNLP: Deep Learning framework for NLP

GluonNLP is a deep learning-based toolkit for natural language processing. It includes cutting-edge pre-trained models, training scripts, and training logs to help with rapid prototyping and reproducible research.

Natural language processing (NLP) is one of the most explored and currently trending topics in machine learning. NLP addresses everyday digital needs such as smart assistants, language translation, and text prediction. Among the various libraries used in this field, this post discusses GluonNLP, a deep learning-based toolkit for natural language processing. The toolkit includes cutting-edge pre-trained models, training scripts, and training logs to help with rapid prototyping and reproducible research. It also offers modular APIs with flexible building blocks for easy customization. The following are the major points that we are going to discuss in this post.

Table of contents

  1. The GluonNLP
  2. Features of the library
  3. Generating text sequence with GluonNLP

Let’s first understand the library structure.

The GluonNLP

Deep learning has spurred rapid progress in artificial intelligence research, resulting in remarkable progress on long-standing problems in a wide range of natural language processing areas. Deep learning frameworks such as Apache MXNet, PyTorch, TensorFlow, Caffe, and Theano make this possible.

These frameworks have been crucial in spreading ideas in the field. In particular, imperative tools, which were perhaps popularized by Chainer, are straightforward to develop, learn, read, and debug. Such benefits have hastened the adoption of imperative programming interfaces.

Jian Guo et al. developed the GluonNLP toolkit for deep learning in natural language processing using MXNet’s imperative Gluon API. GluonNLP provides modular APIs that allow customization by reusing efficient building blocks; pre-trained state-of-the-art models, training scripts, and training logs that enable fast prototyping and promote reproducible research; and models that can be deployed in a wide variety of programming languages, including C++, Clojure, Java, Julia, Perl, Python, R, and Scala.

Features of the library

Here we’ll discuss the major highlights of this library. 

Modular API

Users may tailor their model design, training, and inference by reusing efficient components across various models with GluonNLP’s modular APIs. Data processing tools, models with individual components, initialization procedures, and loss functions are examples of common components.

Take the data API of GluonNLP as an example of how the modular API supports efficient implementation. It is used to build efficient data pipelines from data provided by users. In natural language processing tasks, inputs frequently come in different shapes, such as sentences of different lengths. As a result, the data API includes a set of utilities for sampling inputs and converting them into mini-batches that can be computed efficiently.
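
To make the idea of turning variable-length inputs into mini-batches concrete, here is a minimal pure-Python sketch. It is not GluonNLP's implementation; the helper names pad_batch and bucket_by_length are hypothetical and only illustrate the two steps: grouping sentences of similar length, then padding each group to a common length.

```python
def pad_batch(sequences, pad_value=0):
    """Pad variable-length token-id lists to the longest length in the batch.

    Returns the padded rows and the original (valid) lengths.
    """
    max_len = max(len(seq) for seq in sequences)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    valid_lengths = [len(seq) for seq in sequences]
    return padded, valid_lengths


def bucket_by_length(sequences, batch_size):
    """Sort by length so that each mini-batch needs as little padding as possible."""
    ordered = sorted(sequences, key=len)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Hypothetical token ids for four sentences of different lengths.
sentences = [[4, 9, 2], [7], [3, 8, 5, 1, 6], [2, 2]]
batches = [pad_batch(batch) for batch in bucket_by_length(sentences, batch_size=2)]
```

Keeping the valid lengths next to the padded rows lets later steps ignore the padding positions, which is also why the generation helper later in this post slices each sample by its valid length.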

Pre-trained models

Building on such modular APIs, GluonCV/NLP provides pre-trained state-of-the-art models, training scripts, and training logs via the model zoo, enabling fast prototyping and encouraging reproducible research. The model zoo supplies over 200 models, covering natural language processing tasks such as word embedding, language modelling, machine translation, sentiment analysis, natural language inference, dependency parsing, and question answering.

Generating text sequence with GluonNLP

In this section, we use the library's API to sample and generate a text sequence from a pre-trained language model. Using a language model, we can sample sequences according to the likelihood the model assigns to them, for a given vocabulary size and sequence length.

Given the context from previous time steps, a language model predicts the likelihood of each word occurring at each time step. GluonNLP provides two samplers for generating from a language model: BeamSearchSampler and SequenceSampler. We will use SequenceSampler.
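
The following toy sketch illustrates what "predicting the likelihood of each word at each time step" means in practice. It uses a hand-written bigram table instead of a trained model, so every probability in it is a made-up assumption, and the function names are hypothetical. It shows two things: scoring a sequence with the chain rule, and sampling a sequence one word at a time.

```python
import math
import random

# Hypothetical next-word probabilities P(next | previous); not from a trained model.
BIGRAM = {
    "<bos>": {"i": 1.0},
    "i": {"love": 0.6, "swim": 0.4},
    "love": {"to": 1.0},
    "to": {"swim": 1.0},
    "swim": {"<eos>": 1.0},
}


def sequence_log_prob(tokens):
    """Chain rule: sum log P(w_t | w_{t-1}) over the whole sequence."""
    prev, total = "<bos>", 0.0
    for tok in tokens:
        total += math.log(BIGRAM[prev][tok])
        prev = tok
    return total


def sample_sequence(rng, max_length=10):
    """Draw one word at a time from the model until <eos> or max_length."""
    prev, out = "<bos>", []
    for _ in range(max_length):
        words = list(BIGRAM[prev])
        weights = [BIGRAM[prev][w] for w in words]
        nxt = rng.choices(words, weights=weights, k=1)[0]
        if nxt == "<eos>":
            break
        out.append(nxt)
        prev = nxt
    return out


rng = random.Random(0)
sample = sample_sequence(rng)
score = sequence_log_prob(["i", "love", "to", "swim", "<eos>"])
```

A real language model replaces the lookup table with a network that conditions on the full history through its hidden state, but the sampling loop has the same shape.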

Let’s now quickly install the dependencies.  

# install dependencies
!pip install gluonnlp 
!pip install mxnet 

To begin, load an AWD LSTM language model, a state-of-the-art pre-trained RNN language model from which we will sample sequences.

# loading the pre-trained model
import mxnet as mx
import gluonnlp as nlp
 
ctx = mx.cpu()
lm_model, vocab = nlp.model.get_model(name='awd_lstm_lm_1150',
                                      dataset_name='wikitext-2',
                                      pretrained=True,
                                      ctx=ctx)

Beam search requires a scorer function to rank candidate sequences. For this we can use BeamSearchScorer, which implements the scoring function with a length penalty. Note that the SequenceSampler constructed later in this post is not passed the scorer.

# scorer
scorer = nlp.model.BeamSearchScorer(alpha=0, K=5, from_logits=False)
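
To show what a length penalty does to a score, here is a small sketch. It assumes the GNMT-style formula, in which the summed log-probability is divided by ((K + length) / (K + 1)) raised to the power alpha; whether this matches the library's internals exactly is an assumption, and the function names below are hypothetical rather than part of GluonNLP.

```python
def length_penalty(length, alpha=0.0, K=5):
    """GNMT-style penalty (assumed formula): ((K + length) / (K + 1)) ** alpha."""
    return ((K + length) ** alpha) / ((K + 1) ** alpha)


def penalized_score(sum_log_probs, length, alpha=0.0, K=5):
    """Divide the summed log-probability by the length penalty."""
    return sum_log_probs / length_penalty(length, alpha, K)


# With alpha=0 (as in the scorer above) the penalty is 1, so scores are unchanged.
unpenalized = penalized_score(-6.0, length=7, alpha=0.0, K=5)
# With alpha=1 a longer sequence has its (negative) score pulled towards zero.
penalized = penalized_score(-6.0, length=7, alpha=1.0, K=5)
```

Under this formula, a larger alpha favours longer sequences, because log-probabilities are negative and dividing by a penalty greater than 1 raises the score.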

Next, we need to define a decoder based on the pre-trained language model.

#decoder
class LMDecoder(object):
    def __init__(self, model):
        self._model = model
    def __call__(self, inputs, states):
        outputs, states = self._model(mx.nd.expand_dims(inputs, axis=0), states)
        return outputs[0], states
    def state_info(self, *arg, **kwargs):
        return self._model.state_info(*arg, **kwargs)
decoder = LMDecoder(lm_model)

Now that we have a scorer and a decoder, we’re ready to construct a sampler. The example code below shows how to make a sequence sampler. We’ll make a sampler with 5 beams and a maximum sample length of 100, and set a temperature of 0.97 to control the softmax activation.

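# create sampler
# Assumption: eos_id is not defined anywhere in the original snippet. Here the
# period token is treated as the end-of-sequence marker.
eos_id = vocab['.']
seq_sampler = nlp.model.SequenceSampler(beam_size=5,
                                        decoder=decoder,
                                        eos_id=eos_id,
                                        max_length=100,
                                        temperature=0.97)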
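
The temperature argument can be understood with a short standalone sketch. It assumes the usual definition, where the logits are divided by the temperature before the softmax; the function name and the example logits are hypothetical and not taken from GluonNLP.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate words
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
```

A temperature below 1 concentrates probability on the most likely words, while a temperature above 1 flattens the distribution and makes samples more varied. A value of 0.97 is therefore only slightly sharper than the unmodified softmax.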

Next, we’ll produce sentences that begin with “I love to swim”. We feed the language model [‘I’, ‘love’, ‘to’] to retrieve the starting states and set the initial input to be the word ‘swim’.

# generate samples
bos = 'I love to swim'.split()
bos_ids = [vocab[ele] for ele in bos]
begin_states = lm_model.begin_state(batch_size=1, ctx=ctx)
if len(bos_ids) > 1:
    _, begin_states = lm_model(mx.nd.expand_dims(mx.nd.array(bos_ids[:-1]), axis=1),
                               begin_states)
inputs = mx.nd.full(shape=(1,), ctx=ctx, val=bos_ids[-1])

All of this can be wrapped in a helper function, so that we can generate sequences with a single line.

# helper function
def generate_sequences(sampler, inputs, begin_states, num_print_outcomes):
 
    samples, scores, valid_lengths = sampler(inputs, begin_states)
    samples = samples[0].asnumpy()
    scores = scores[0].asnumpy()
    valid_lengths = valid_lengths[0].asnumpy()
    print('Generation Result:')
 
    for i in range(num_print_outcomes):
        sentence = bos[:-1]
 
        for ele in samples[i][:valid_lengths[i]]:
            sentence.append(vocab.idx_to_token[ele])
 
        print([' '.join(sentence), scores[i]])

Now we can generate the sequences.

generate_sequences(seq_sampler, inputs, begin_states, 5)

The function prints each generated sequence together with its score.

The generated text is quite suitable as a continuation of our original sentence.

Final words

In this post, we have discussed GluonNLP, a deep learning-based library that addresses various NLP tasks such as sentiment analysis, word embeddings, and sequence generation. We can experiment with various applications of natural language processing by leveraging its modular APIs and pre-trained models.

Vijaysinh Lendave

Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.