
Preparing for a Post-LIBOR Future with ContraxSuite, Part 4: LexNLP, The Brain Inside

LIBOR goes dark sometime toward the end of 2021, and even now, financial institutions are preparing for it. As discussed in Part 1, Part 2, and Part 3, the potential consequences are substantial and wide-ranging, but there are also many options during this transitional period.

ContraxSuite can handle much of the heavy lifting during this transition. Pre-trained to read contracts and “speak financial”, ContraxSuite can be your personal Sherlock Holmes as you investigate your contracts for LIBOR-related material. Our contract analytics software can read and categorize contracts, sniff out potentially affected clauses, and even identify and insert proposed amendments on its own. It does this with a sophisticated brain: LexNLP.

LexNLP: The Brain

ContraxSuite is a powerful software interface. It identifies, organizes, de-duplicates, analyzes, extracts, and exports data from legal documents. Think of ContraxSuite like a human: it has many capabilities that can be combined to complete many different tasks.

Every high-functioning body has a high-functioning brain. LexNLP is that brain: a natural language toolkit designed for legal text. More accurately, you can think of LexNLP as a dictionary with a pulse. It’s designed to work with real, unstructured legal text, including contracts, plans, policies, procedures, and other material, and to search not just for specific words, but for explicit and implicit associations between and among words.

Stems and Lemmas

LexNLP differs from other natural language toolkits in its analysis of words and their forms. Two of the most important types of analysis concern stems and lemmas. A stem is that segment of a word that contains the word’s root, and the basis for variations on that root. An example of a stem relevant to LIBOR would be “advanc,” which would capture inflected forms like “advance,” “advances,” “advancer,” and “advancing”.

A lemma is similar to a stem, but reflects a more complete grammatical understanding of the word: it is the word’s dictionary form. A simple example of a lemma would be the word “go”. Lemmatization maps regular forms like “going” and “goes” to “go,” and also maps irregular forms like “went” that stemming alone would miss.

So in the context of LIBOR, LexNLP captures that both “advance” and “advancing” are semantically related concepts, and would not miss analogous terms when analyzing a document.
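The difference between the two can be sketched in a few lines of plain Python. This is a toy illustration only, not LexNLP’s implementation: the names toy_stem and toy_lemma, the suffix list, and the lookup table are all hypothetical, and real toolkits rely on full stemming algorithms and dictionary-backed lemmatizers.

```python
# Toy illustration only: the suffix list and lookup table are hypothetical
# and far smaller than anything a real NLP toolkit would use.
SUFFIXES = ("ing", "es", "er", "e", "s")  # checked in this order, longest first

# Irregular forms cannot be recovered by stripping suffixes,
# so a lemmatizer needs dictionary knowledge like this.
IRREGULAR_LEMMAS = {"went": "go", "gone": "go", "goes": "go", "going": "go"}


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least four characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def toy_lemma(word: str) -> str:
    """Look the word up in the irregular-form table; fall back to the word itself."""
    word = word.lower()
    return IRREGULAR_LEMMAS.get(word, word)


if __name__ == "__main__":
    for w in ["advance", "advances", "advancer", "advancing"]:
        print(w, "->", toy_stem(w))
    print("went ->", toy_stem("went"), "(stem) vs", toy_lemma("went"), "(lemma)")
```

All four forms of “advance” collapse to the same stem, while “went” only reaches “go” through the lemma lookup.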

Stems and lemmas are just two ways LexNLP wields natural language processing in analyzing legal language, but far from the only two.

Extracting Structured Information

LexNLP isn’t just capable of simple natural language tasks like identifying concepts in unstructured text. LexNLP goes above and beyond most libraries by extracting structured information. Examining structured information is something our human brains do constantly every day: we read text or listen to speech and process not only each word on its own, but also how the words relate to one another and to the overall communicative goal of the context. Computers, however, require thorough instructions in order to accomplish the same structuring tasks.

LexNLP interprets information in a structured way. It does this via dozens of easy-to-use functions. Let’s take a look at four of the most commonly used:

  • get_money
  • get_percents
  • get_dates
  • get_definitions

All of these functions have specific targets when searching through documents. The get_money function, for example, searches the document for monetary amounts, whether specified numerically (e.g., “$200.00”) or written out longhand (e.g., “zweihundert Euro”). Such references are not merely highlighted in the text; each is normalized into a standard format, with the currency expressed as an ISO 4217 code. Here’s an example:

In: from lexnlp.extract.en.money import get_money
In: list(get_money("That little dog in the window is one-hundred and fifteen dollars.", return_sources=True))
Out: [(115, 'USD', 'one-hundred and fifteen dollars')]

Without LexNLP to read this input, the computer doesn’t know what “dog” means, or what “dollar” means. It just finds the number “115”, the word “dog”, and the word “dollars”, without finding the relationship between them. The output data would be unstructured.

With the structured data of LexNLP, the computer can determine that “dog” is an object, that “in the window” is a positional aspect for the dog, and that the dog costs an amount of “dollars” (probably US dollars, though not necessarily). Once get_money sees “dollars”, it reads the context of the sentence and determines that the amount of dollars is “115”.
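To make the idea of normalization concrete, here is a minimal sketch in plain Python of turning numeric money references into (amount, ISO 4217 code, source text) tuples. It is not how get_money is implemented: the name toy_get_money, the regular expression, and the two small currency tables are assumptions made for illustration, and the sketch does not handle amounts written out in words.

```python
import re
from decimal import Decimal

# Hypothetical lookup tables; a real extractor covers far more currencies.
SYMBOL_TO_ISO = {"$": "USD", "€": "EUR", "£": "GBP"}
WORD_TO_ISO = {"dollar": "USD", "dollars": "USD", "euro": "EUR", "euros": "EUR"}

# Either a currency symbol followed by digits, or digits followed by a currency word.
MONEY_RE = re.compile(
    r"(?P<sym>[$€£])\s?(?P<a1>\d[\d,]*(?:\.\d+)?)"
    r"|(?P<a2>\d[\d,]*(?:\.\d+)?)\s+(?P<word>dollars?|euros?)\b",
    re.IGNORECASE,
)


def toy_get_money(text):
    """Return (amount, ISO 4217 code, matched source text) for each numeric amount."""
    results = []
    for m in MONEY_RE.finditer(text):
        if m.group("sym"):
            amount, code = m.group("a1"), SYMBOL_TO_ISO[m.group("sym")]
        else:
            amount, code = m.group("a2"), WORD_TO_ISO[m.group("word").lower()]
        # Drop thousands separators before converting to an exact decimal.
        results.append((Decimal(amount.replace(",", "")), code, m.group(0)))
    return results


if __name__ == "__main__":
    print(toy_get_money("The fee is $200.00, plus 1,500 euros."))
```

Returning the matched source text alongside the normalized value mirrors the return_sources=True option in the example above, and makes it easy to trace each value back to the contract language.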

A function like get_percents, meanwhile, can analyze a table like this:

[Figure: base rate table from a floating rate credit agreement]

It can then return all the percentages in that table, whether written as percentages, decimals, or basis points.
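As a rough sketch of what that normalization involves, the plain-Python snippet below converts percent and basis-point expressions into a common fractional value. It is an illustration under stated assumptions, not LexNLP’s get_percents: the name toy_get_percents and the pattern are hypothetical, and only a handful of unit spellings are recognized.

```python
import re
from decimal import Decimal

# A number followed by a percent sign, the word "percent", or a basis-point unit.
PERCENT_RE = re.compile(
    r"(?P<num>\d+(?:\.\d+)?)\s*(?P<unit>%|(?:percent|basis points?|bps)\b)",
    re.IGNORECASE,
)


def toy_get_percents(text):
    """Return (matched source text, value as a fraction) for each rate found."""
    found = []
    for m in PERCENT_RE.finditer(text):
        unit = m.group("unit").lower()
        # One percent is 1/100; one basis point is 1/10,000.
        divisor = Decimal(100) if unit in ("%", "percent") else Decimal(10000)
        found.append((m.group(0), Decimal(m.group("num")) / divisor))
    return found


if __name__ == "__main__":
    print(toy_get_percents("Base Rate plus 1.25% or LIBOR plus 225 basis points"))
```

Expressing every rate as a fraction means a margin quoted as 1.25% and one quoted as 125 basis points compare as equal.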

And the get_dates function is written so that dates in hundreds of possible formats (e.g., “DD Month YYYY” or “Month DD, YYYY”) will be found:

[Figure: lease excerpt showing dates and duration. LexNLP’s “get_dates” function can find any date format.]
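The two formats named above can be recognized with a short plain-Python sketch. This is only a toy covering those two layouts with full English month names; toy_get_dates and its pattern are hypothetical and say nothing about how get_dates itself is built.

```python
import re
from datetime import date

MONTHS = [
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
]
MONTH_NUM = {name: i for i, name in enumerate(MONTHS, start=1)}
MONTH_ALT = "|".join(MONTHS)

# "DD Month YYYY" or "Month DD, YYYY"
DATE_RE = re.compile(
    rf"\b(?:(?P<d1>\d{{1,2}})\s+(?P<m1>{MONTH_ALT})\s+(?P<y1>\d{{4}})"
    rf"|(?P<m2>{MONTH_ALT})\s+(?P<d2>\d{{1,2}}),\s*(?P<y2>\d{{4}}))\b",
    re.IGNORECASE,
)


def toy_get_dates(text):
    """Return a datetime.date for each match, skipping impossible dates."""
    found = []
    for m in DATE_RE.finditer(text):
        day = m.group("d1") or m.group("d2")
        month = m.group("m1") or m.group("m2")
        year = m.group("y1") or m.group("y2")
        try:
            found.append(date(int(year), MONTH_NUM[month.lower()], int(day)))
        except ValueError:
            # e.g. "31 February 2020" looks like a date but is not one
            continue
    return found


if __name__ == "__main__":
    print(toy_get_dates("This Lease commences on 1 March 2020 and ends on February 28, 2025."))
```

Normalizing both layouts to the same date type is what lets downstream code compute durations, such as the term of a lease.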

Example: Finding Definitions

Finally, the following shows how LexNLP can work on unstructured text to find definitions. LexNLP uses its get_definitions function combined with get_percents to extract the applicable margin for an adjusted base rate:

[Figure: agreement text containing the “Adjusted Base Rate” definition]

LexNLP finds the adjusted rate definition and extracts the margin adjustment. The following code snippet reviews the document these words were found in, and then displays the whole sentence that contains “Adjusted Base Rate,” along with the margin itself:

[Figure: LexNLP code showing “Adjusted Base Rate” search and output]

If you were using ContraxSuite, this phrase would be found and highlighted by the annotator.
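For readers who want a feel for the combined technique, the plain-Python sketch below finds the sentence that defines a term and pulls the percentages out of it. It is not the LexNLP code from the example: the sample sentence is invented, and toy_find_definition, its sentence splitter, and its patterns are hypothetical simplifications.

```python
import re

# Invented sample text, not taken from the agreement discussed above.
SAMPLE = (
    '"Adjusted Base Rate" means the Base Rate plus a margin of 1.50% per annum. '
    "Interest is payable quarterly."
)

# A quoted term followed by "means" or "shall mean" marks a definition.
DEFINITION_RE = re.compile(r'["“](?P<term>[^"”]+)["”]\s+(?:means|shall mean)\b')
PERCENT_RE = re.compile(r"\d+(?:\.\d+)?\s*%")
# Naive sentence splitter: break after "." or ";" when a capital or quote follows.
SENTENCE_SPLIT_RE = re.compile(r'(?<=[.;])\s+(?=[A-Z"“])')


def toy_find_definition(text, term):
    """Return (defining sentence, percentages in it), or None if the term is not defined."""
    for sentence in SENTENCE_SPLIT_RE.split(text):
        m = DEFINITION_RE.search(sentence)
        if m and m.group("term").lower() == term.lower():
            return sentence, PERCENT_RE.findall(sentence)
    return None


if __name__ == "__main__":
    sentence, margins = toy_find_definition(SAMPLE, "Adjusted Base Rate")
    print(sentence)
    print(margins)
```

Restricting the percentage search to the defining sentence is what ties the margin to the defined term, rather than to some other rate elsewhere in the document.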

Conclusion

LexNLP is a powerful natural language toolkit built for legal language. It functions like the brain inside ContraxSuite, our contract analytics platform. It uses various techniques to find important words and phrases, such as stemming, lemmatization, and string matching, through functions like get_definitions.

In the months to come, it will become increasingly important for lenders and borrowers to analyze the language of their loan agreements for their relationship with LIBOR rates. LexNLP can find those relationships, one word at a time.

Put ContraxSuite On The Case

In response to the growing need for LIBOR-related contract analytics, we are building a LIBOR-focused version of ContraxSuite. Trained on tens of thousands of financial contracts, ContraxSuite can find and label important LIBOR-related clauses, including fallback provisions. ContraxSuite provides a user-friendly interface for users of all backgrounds and experience levels, but your organization can opt to use LexNLP separately from ContraxSuite to customize how you extract and structure information in your documents. Contact us to find out how best to implement ContraxSuite, or develop a solution with LexNLP by itself, in your organization.

This series continues in Part 5. Earlier installments: Part 1, Part 2, and Part 3.

 

About LexPredict

LexPredict is an enterprise legal technology and consulting firm, part of the Elevate family of businesses. Our consulting teams specialize in legal analytics, legal data science and training, risk management, and legal data strategy consulting. We work with corporate legal departments and law firms to empower better organizational decision-making by improving processes, technology, and the ways people interact with both. We develop software and data tools, including ContraxSuite, LexSemble, CounselTracker, and LexReserve, that assist organizations with contract analytics and workflows, early case assessment and decision trees, outside counsel spend management, and case valuation. Discover more at lexpredict.com.
