Data and Tasks jar for Sequence Labeling: Recurrent Neural Networks (RNNs)
In the last article, we discussed the Data and Tasks jar for sequence classification problems. In this article, we look at the Data and Tasks jar for sequence labeling problems.
Data and Tasks for Sequence Labeling
Let’s first discuss the objective of sequence labeling: for every word in the input sentence, the model predicts an output.
Say the input consists of a number of sentences. After tokenization, its tabular representation has one sentence per row, with each word as a separate entry.
For each word of each sentence (row), there is a corresponding output.
For example, in the first sentence, the first word “The” is a determiner, the second word “first” is an adjective, the third word “half” is a noun, and so on.
Every word in the input has its own true output, so there is a “1:1 mapping” between input words and outputs. Since an input sentence can have a variable number of words, this 1:1 mapping implies that “the output would also be of variable length”.
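To make the 1:1 mapping concrete, here is a minimal sketch in Python. The variable names, the tag abbreviations (DET, ADJ, NOUN, VERB) and the second sentence are assumptions made for illustration. Only the first three words and their labels come from the example above, and that sentence is cut to three words here.

```python
# Hypothetical toy dataset: each row pairs a tokenized sentence with its labels.
# Row 1 uses the first three words of the example sentence (truncated here);
# row 2 is invented purely to show a sentence of a different length.
dataset = [
    (["The", "first", "half"], ["DET", "ADJ", "NOUN"]),
    (["Dogs", "bark"], ["NOUN", "VERB"]),
]


def check_alignment(rows):
    """Verify that every word has exactly one label; return each row's length."""
    lengths = []
    for tokens, tags in rows:
        if len(tokens) != len(tags):
            raise ValueError(
                f"{len(tokens)} words but {len(tags)} labels for {tokens}"
            )
        lengths.append(len(tokens))
    return lengths


lengths = check_alignment(dataset)

# Pair each word with its label to see the 1:1 mapping row by row.
pairs = [list(zip(tokens, tags)) for tokens, tags in dataset]
for row in pairs:
    print(row)
```

The input rows have different lengths, and because each word carries exactly one label, the output rows have the same differing lengths.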
The “input and output need to be converted to numbers”, as the model takes in…