Part-of-speech tagging and dependency grammar | POS tags

Objective

  • Part-of-speech tagging and dependency parsing are widely used techniques in text processing.
  • Understand part-of-speech tags and grammars, along with their respective use cases in natural language processing.

Introduction

Natural language processing (NLP) is a branch of machine learning that deals with how machines understand human languages. Text is one of the most widely available forms of data for NLP tasks.

To work with text data, it is important to transform raw text into a form that machine learning algorithms can understand and use. This is called text preprocessing. There are several techniques for text preprocessing, such as stemming, lemmatization, POS tagging and dependency parsing.

In this article, we are going to discuss properties related to the structure of text data. We will talk about part-of-speech tags and two kinds of grammar, and look at how they work.

Part-of-speech tags

Part-of-speech tags are properties of words that define their main context, function and usage in a sentence. Some of the most commonly used part-of-speech tags are:

Nouns: define any object or entity.

Verbs: define some action.

Adjectives and adverbs: act as modifiers, quantifiers or intensifiers in a sentence.

In a sentence, each word is associated with an appropriate part-of-speech tag. For instance, consider the sentence below.

[Figure: example sentence annotated with part-of-speech tags]

In this sentence, each word is associated with a part-of-speech tag that defines its function. Here, David has an NNP tag, which means it is a proper noun. Has and bought are verbs, which indicate actions. Laptop, store and Apple are nouns. New is an adjective whose function is to modify the noun laptop.

Part-of-speech tags are determined by how a word relates to the other words in the sentence.

We can apply machine learning models and rule-based models to obtain the part-of-speech tag of a word. The most commonly used part-of-speech tag annotations come from the Penn Treebank corpus, which defines a total of 48 POS tags according to usage.

[Figure: Penn Treebank part-of-speech tags]
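To make the rule-based approach a little more concrete, here is a minimal sketch in plain Python. The lexicon, the suffix rules and the function names are illustrative assumptions made for this example; only the tag names (DT, NNP, VBD and so on) follow the Penn Treebank convention. A real tagger would use a trained model or a much larger rule set.

```python
# Toy rule-based POS tagger. The lexicon and suffix rules below are
# illustrative assumptions, not taken from the Penn Treebank or any library.
LEXICON = {
    "the": "DT", "a": "DT", "an": "DT",
    "in": "IN", "from": "IN", "on": "IN",
    "has": "VBZ", "is": "VBZ",
    "new": "JJ",
}


def tag_word(word, is_first):
    """Return a Penn Treebank style tag for one word using simple rules."""
    lower = word.lower()
    if lower in LEXICON:
        return LEXICON[lower]      # known function words and common words
    if word[0].isupper() and not is_first:
        return "NNP"               # capitalized mid-sentence: proper noun
    if lower.endswith("ly"):
        return "RB"                # adverb
    if lower.endswith("ing"):
        return "VBG"               # gerund / present participle
    if lower.endswith("ed"):
        return "VBD"               # past tense verb
    if lower.endswith("s"):
        return "NNS"               # plural noun
    return "NN"                    # default: singular noun


def tag_sentence(sentence):
    """Tag every whitespace-separated word in the sentence."""
    words = sentence.split()
    return [(word, tag_word(word, index == 0)) for index, word in enumerate(words)]


print(tag_sentence("The dogs barked loudly in London"))
```

Rules like these break quickly on real text (for example, "bus" would be tagged as a plural noun), which is one reason statistical and neural taggers are preferred in practice.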

POS tag use cases

Part-of-speech tags have a large number of applications and are used in a variety of tasks, such as:

  • Text cleaning
  • Feature engineering tasks
  • Word sense disambiguation

For instance, consider these sentences.

[Figure: two example sentences that use the word "book"]

Both sentences use the keyword book, but in the first sentence it is used as a verb, while in the second it is used as a noun.
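The idea that the surrounding words decide the tag can be sketched with a single context rule. The two sentences, the determiner list and the function name `tag_book` are hypothetical and were chosen for this illustration; they are not the sentences from the figure.

```python
# Hypothetical set of words that usually precede a noun.
DETERMINERS = {"a", "an", "the", "this", "that", "my", "your"}


def tag_book(sentence):
    """Guess whether 'book' is a noun (NN) or a verb (VB) from the previous word."""
    words = [word.strip(".,!?").lower() for word in sentence.split()]
    index = words.index("book")
    previous = words[index - 1] if index > 0 else None
    if previous in DETERMINERS:
        return "NN"   # "this book", "the book": noun reading
    return "VB"       # otherwise assume the verb reading


print(tag_book("Please book my flight to Delhi"))
print(tag_book("I am reading this book"))
```

A single rule is obviously not enough for general text, but it shows why a tagger has to look at the neighbours of a word rather than at the word alone.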

Constituency grammar

Now let's talk about grammar.

The first type of grammar is constituency grammar. Any word, group of words or phrase can be called a constituent. The goal of constituency grammar is to organize a sentence into its constituents using their properties. These properties are generally derived from part-of-speech tags and from identifying noun or verb phrases.

For instance, a constituency grammar can define that any sentence is organized into three constituents: a subject, a context and an object. These constituents can take different values and, consequently, generate different sentences.

[Figure: example of a sentence organized into constituents]

Another way of looking at constituency grammar is to define the constituents in terms of their part-of-speech tags. The sequence of tags then describes the grammatical structure of the sentence. This corresponds to the same sentence, "Dogs bark in the park."

[Figure: the same sentence described by its part-of-speech tags]
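As a rough illustration of grouping words into constituents by their tags, the sketch below chunks the sentence "Dogs bark in the park" into flat phrases. The tags are assigned by hand, and the tag-to-phrase mapping and the `chunk` function are assumptions made for this example. It produces flat chunks only, whereas a full constituency parse would nest the final noun phrase inside the prepositional phrase.

```python
# Hand-assigned Penn Treebank style tags for the sentence from the text.
TAGGED = [("Dogs", "NNS"), ("bark", "VBP"), ("in", "IN"), ("the", "DT"), ("park", "NN")]

# Hypothetical mapping from a POS tag to the phrase type it belongs to.
PHRASE_OF_TAG = {
    "DT": "NP", "JJ": "NP", "NN": "NP", "NNS": "NP", "NNP": "NP",
    "VB": "VP", "VBP": "VP", "VBZ": "VP", "VBD": "VP",
    "IN": "PP",
}


def chunk(tagged):
    """Group consecutive tokens whose tags map to the same phrase type."""
    chunks = []
    for word, tag in tagged:
        label = PHRASE_OF_TAG.get(tag, "O")   # "O" marks tokens outside any phrase
        if chunks and chunks[-1][0] == label:
            chunks[-1][1].append(word)        # extend the current chunk
        else:
            chunks.append((label, [word]))    # start a new chunk
    return chunks


for label, words in chunk(TAGGED):
    print(label, " ".join(words))
```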

Dependency grammar

The second type of grammar is dependency grammar, which states that the words in a sentence depend on other words in the sentence.

For instance, the previous sentence mentioned a barking dog, and dog is modified by barking because an adjective modifier dependency exists between the two.

Dependency grammar organizes the words in a sentence according to their dependency. One of the words in the sentence acts as the root and all the other words are directly or indirectly linked to the root through its dependencies. These dependencies represent the relationship between the words in a sentence.

Dependency grammar is used to understand the structure and semantic dependencies between words. Let's consider an example.

[Figure: example sentence]

The dependency tree for this sentence looks like this.

[Figure: dependency tree of the example sentence]

In this tree, the root word is "community", which has NN as its part-of-speech tag. Every other word in the tree is connected to the root, directly or indirectly, through a dependency relation such as direct object, subject or modifier.

These relations define the role and function of each word in the sentence and how the words connect to each other. Each dependency can be represented as a triplet containing a relation, a governor and a dependent. This means that a dependent is connected to its governor by a relation, for example a subject or an object relation.

In the last example, DataPeaker is the subject, or governor, and "the largest data science community" is the dependent, or object.

[Figure: dependencies of the example sentence]
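The triplet representation and the idea of a single root can be sketched in a few lines of Python. The annotation below is written by hand for the earlier sentence "Dogs bark in the park", using relation names in the style of Universal Dependencies; the relation labels, the `ROOT` marker and the helper names are assumptions made for this example, not the output of a parser. The sketch also assumes that every word occurs only once in the sentence.

```python
from collections import namedtuple

# One dependency: the dependent is attached to the governor by the relation.
Dependency = namedtuple("Dependency", ["relation", "governor", "dependent"])

# Hand-written annotation (an assumption for illustration, not parser output).
DEPENDENCIES = [
    Dependency("root", "ROOT", "bark"),
    Dependency("nsubj", "bark", "Dogs"),
    Dependency("obl", "bark", "park"),
    Dependency("case", "park", "in"),
    Dependency("det", "park", "the"),
]


def find_root(dependencies):
    """Return the word that is attached to the artificial ROOT node."""
    for dep in dependencies:
        if dep.relation == "root":
            return dep.dependent
    raise ValueError("no root relation found")


def path_to_root(word, dependencies):
    """Follow governors upward from a word until the root word is reached."""
    governor_of = {dep.dependent: dep.governor for dep in dependencies}
    path = [word]
    while governor_of[path[-1]] != "ROOT":
        path.append(governor_of[path[-1]])
    return path


print(find_root(DEPENDENCIES))
print(path_to_root("the", DEPENDENCIES))
```

Here `path_to_root` makes the "directly or indirectly linked to the root" property explicit: every word reaches the root by following its chain of governors.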

Dependency grammar use cases

Dependency grammar has multiple use cases, for instance:

  • Named entity recognition
  • Question answering systems
  • Coreference resolution, where the task is to map pronouns to their respective noun phrases
  • Text summarization problems
  • Features for text classification problems
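As one possible illustration of the last use case, dependency triplets can be turned into simple count features for a classifier. The feature naming scheme and the function `dependency_features` are hypothetical choices made for this sketch, and the triplets are hand-written rather than produced by a parser.

```python
from collections import Counter


def dependency_features(triplets):
    """Count relation types and relation/governor pairs as sparse features."""
    features = Counter()
    for relation, governor, dependent in triplets:
        features["rel=" + relation] += 1
        features["rel=" + relation + "|gov=" + governor.lower()] += 1
    return features


# Hand-written (relation, governor, dependent) triplets for illustration.
TRIPLETS = [
    ("nsubj", "bark", "Dogs"),
    ("obl", "bark", "park"),
    ("case", "park", "in"),
    ("det", "park", "the"),
]

features = dependency_features(TRIPLETS)
for name in sorted(features):
    print(name, features[name])
```

The resulting counts could be fed to any standard classifier alongside bag-of-words features.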

Final notes

To sum up, in this article we looked at part-of-speech tags and two types of grammar, namely constituency grammar and dependency grammar. We also looked at some important examples and use cases for each of them.

If you have any questions, let me know in the comments section!