
ContraxSuite 1.1.5 Release Update

Client demand for document and contract analytics tools like ContraxSuite continues to rise. We are improving ContraxSuite every day, including by building functionality for integration with document management systems.

Now that LexPredict has been acquired by Elevate, ContraxSuite is more in demand than ever. Read on for a detailed description of the latest updates to ContraxSuite.

Release Notes

This month’s release, Version 1.1.5, is the sixteenth open source release of ContraxSuite. It became generally available on November 7th, 2018. Release 1.1.5 focuses on several updates around document field concepts:

  • Improvements in the field detection system
  • Changes in document field concepts
  • Application of filters between projects and document types, and inside projects
  • Improvements in models to widen storage for uncategorized model objects data
  • Multiple minor bug fixes and improvements for admin site

Detailed Changelog

New in Release 1.1.5:

  • Massive refactoring of the field detection system around a “field detection strategy” concept in the code. Field detection strategies support two main processes: training and field detection. The training process is optional and makes sense for machine learning-based field detection only. Each type of field detection is now represented as a field detection strategy:
    • Disable field detection: Used when nothing needs to be extracted; this field is for data entry only
    • Regexp field detection: Finds matching sentences/paragraphs and extraction hints with configured regular expressions, then extracts values from the selected text with an extraction function implemented for the field type
    • Text-based machine learning: Finds matching sentences/paragraphs and extraction hints with machine learning models trained on user entries; extracts values with extraction functions
    • Formula-based value calculation: Calculates the field value with a manually entered formula that takes the values of other fields of the document as its input
    • Field-based machine learning for choice fields: Selects the field value with a machine learning model that takes the values of other fields of the document as its input
    • Regexp and text-based ML: Starts with regexp field detection, then switches to text-based machine learning when there are enough user entries to train the model
    • Formula and field-based ML: Starts with manually-entered formulas, then switches to field-based machine learning when there is enough user data to train the model
    • Python-coded fields: Detects/calculates field values with a Python class distributed with the project code
  • Changes in document field concepts:
    • A document field, depending on its type, now can be put into one of two groups: value-aware (string, Boolean, float, date, company, person, etc.) and not value-aware (“related info”). The document editor API now returns the “value-aware” flag for each such field
    • A field can be configured as read-only (true/false) in Django admin. Read-only fields are detected by the system but cannot be overridden by users via API
    • A field can be configured in Django admin as requiring or not requiring text annotations. Values without an assigned text range can be stored for the fields marked as not requiring text annotations via API
  • Calculated/detectable fields now can be built to depend on other detectable fields. These fields will be detected in the order of importance
  • Values within calculated fields (even formula-based) can now be overridden by a user and these changes will be incorporated into training the machine learning models
  • Improved admin task, “Documents: Train and Test”, can be used for training and testing the field detection strategies. This task now supports training on the data that users changed and/or confirmed:
    • Both train and test steps can be skipped
    • Non-machine learning field detection strategies can be tested with this task
    • For machine learning strategies, the “train” step additionally tests the scikit-learn classifier on 20% of the training data, and prints the scikit-learn report to the task logs
    • For all types of field detection strategies, the “test” step executes full field detection on each test document the same way it is executed during normal system work. Total accuracy is calculated and printed to task logs
  • Added admin task for loading JSON documents with predefined document fields
  • Added functionality to load documents with predefined fields
  • Fixed bug with geography field representation
  • Fixed bug with incorrect order of sentences in the “detect field values” task
  • Added document filters. A user is now able to create, store, and apply filters by projects, by document types, and by implemented API
  • Added ability to save document filters inside a project, and restore previous filters in new sessions
  • Fixed a bug where document uploading was interrupted when the application was not able to send email notifications
  • Added ability to include client SSL certificates into the application
  • Added API for user-related documents, i.e. documents assigned to a current user
  • Added ability to filter by document status and document assignee name
  • Fixed Document Field Categories admin form
  • Fixed the Document.text alias so that it equals the Document.full_text field
  • Better handling of the “partial” algorithm in the loadnewdata management command
  • Closed the signup page. New users must be added via the admin site
  • Made “project field” optional for Upload Session objects
  • Added AdvancedManager class to emulate dot-like behavior for values extracted from a queryset; use .dot_values() instead of .values() on querysets
  • Added “metadata” field into DocumentField and DocumentType models to store additional uncategorized data for model objects
  • Improved JSON form fields representation on admin site
  • Task List page now sorts tasks by start date
  • Added ability to detect Field Values for different unit types (e.g. paragraphs, sentences)
  • Unified annotator API response, migrated annotator API into “document” application (changed API URLs)
  • Several minor changes and bug fixes in document editor and grid API
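
To make the “field detection strategy” idea from the changelog more concrete, here is a minimal Python sketch of how a strategy with an optional training step and a detection step might be modeled. It is not ContraxSuite's actual code: every class name, the `min_examples` threshold, and the word-overlap “model” (a toy stand-in for a real text-based machine learning model) are assumptions made for illustration only.

```python
import re
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple

# One user-confirmed entry: (text unit, field value the user accepted).
Example = Tuple[str, str]


class FieldDetectionStrategy(ABC):
    """Hypothetical base class: optional training plus mandatory detection."""

    def train(self, examples: List[Example]) -> None:
        """No-op by default; only learning-based strategies override this."""

    @abstractmethod
    def detect(self, text_units: List[str]) -> Optional[str]:
        """Return a field value, or None when nothing is detected."""


class DisabledStrategy(FieldDetectionStrategy):
    """Field is for data entry only, so nothing is ever extracted."""

    def detect(self, text_units: List[str]) -> Optional[str]:
        return None


class RegexpStrategy(FieldDetectionStrategy):
    """Returns the first capture group of the first matching text unit."""

    def __init__(self, pattern: str) -> None:
        self.pattern = re.compile(pattern, re.IGNORECASE)

    def detect(self, text_units: List[str]) -> Optional[str]:
        for unit in text_units:
            match = self.pattern.search(unit)
            if match:
                return match.group(1)
        return None


class WordOverlapStrategy(FieldDetectionStrategy):
    """Toy stand-in for a text-based ML model (not a real classifier)."""

    def __init__(self) -> None:
        self.vocabulary: set = set()

    @staticmethod
    def _words(text: str) -> List[str]:
        return re.findall(r"[a-z]+", text.lower())

    def train(self, examples: List[Example]) -> None:
        for text, _value in examples:
            self.vocabulary.update(self._words(text))

    def detect(self, text_units: List[str]) -> Optional[str]:
        # Pick the unit sharing the most words with the training texts.
        best_unit, best_score = None, 0
        for unit in text_units:
            score = sum(word in self.vocabulary for word in self._words(unit))
            if score > best_score:
                best_unit, best_score = unit, score
        if best_unit is None:
            return None
        # Extraction function for this illustrative field type: first integer.
        number = re.search(r"\d+", best_unit)
        return number.group(0) if number else None


class RegexpThenMLStrategy(FieldDetectionStrategy):
    """Uses regexps until enough user entries exist, then the trained model."""

    def __init__(self, regexp, model, min_examples: int = 3) -> None:
        self.regexp, self.model = regexp, model
        self.min_examples = min_examples
        self.trained = False

    def train(self, examples: List[Example]) -> None:
        if len(examples) >= self.min_examples:
            self.model.train(examples)
            self.trained = True

    def detect(self, text_units: List[str]) -> Optional[str]:
        active = self.model if self.trained else self.regexp
        return active.detect(text_units)


strategy = RegexpThenMLStrategy(
    RegexpStrategy(r"term of this lease is (\d+) months"), WordOverlapStrategy()
)
units = ["Lease term: 48 months."]
before = strategy.detect(units)  # regexp does not match this wording
strategy.train([
    ("The lease term shall be 12 months.", "12"),
    ("Lease term: 36 months.", "36"),
    ("The term of the lease is 6 months.", "6"),
])
after = strategy.detect(units)  # the trained stand-in model now handles it
```

In this sketch the hybrid strategy only delegates: the regexp and the model each stay a complete strategy on their own, which mirrors how the release notes list the combined strategies alongside the standalone ones.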
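
The changelog also notes that calculated fields can depend on other detectable fields. One plausible reading is that a field is evaluated only after the fields it depends on. The sketch below illustrates that reading with Python's standard `graphlib` module (Python 3.9 or later). The field names, the constant values standing in for text detection, and the formulas are all hypothetical, and ContraxSuite's actual ordering logic may differ.

```python
from graphlib import TopologicalSorter

# Hypothetical fields: name -> (names it depends on, formula over known values).
# The constants stand in for values that would be detected from document text.
fields = {
    "monthly_rent": ((), lambda v: 1500.0),
    "term_months": ((), lambda v: 24),
    "total_rent": (
        ("monthly_rent", "term_months"),
        lambda v: v["monthly_rent"] * v["term_months"],
    ),
    "is_high_value": (("total_rent",), lambda v: v["total_rent"] > 30000),
}


def detect_in_dependency_order(fields):
    """Evaluate each field only after all fields it depends on."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(deps) for name, (deps, _formula) in fields.items()}
    order = list(TopologicalSorter(graph).static_order())
    values = {}
    for name in order:
        _deps, formula = fields[name]
        values[name] = formula(values)
    return order, values


order, values = detect_in_dependency_order(fields)
```

A circular dependency between fields would make `static_order()` raise `graphlib.CycleError`, which is a convenient place to reject an invalid field configuration.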

The full release notes and changelog for Version 1.1.5, and for all prior versions of ContraxSuite, are available at the ContraxSuite GitHub page.

To get started with ContraxSuite by LexPredict, visit our website, or drop us a line at contact@lexpredict.com.
