Data Language AI Products

Data Language AI includes Text AI Services, our industry-beating text analytics product, and the Explainable AI Platform, which provides transparency for complex predictive analytics.

AI Document Classification

Data Language has been tackling the problem of document classification for many years and has used its wealth of experience in NLP, data science, and text analytics to create a world-leading product.

Our Text AI Services are easy-to-integrate SaaS tools that use AI and machine learning to make content categorisation and tagging as easy and efficient as possible. Our Text AI Services dramatically reduce the cost and effort associated with tagging and organising written content, speeding up behind-the-scenes processes and ensuring accuracy.

With our Document Classification Text AI Service, the difficult engineering and scaling are done for you, dramatically reducing your total cost of ownership compared with developing AI-based classification in-house.

Document Classification Text AI Service main features

  • Fully managed Software as a Service, trivial to integrate into your systems
  • Language agnostic
  • Classifies content using your own vocabularies and dictionaries
  • Self-optimizes and learns on the fly, independently of your workflow
  • Dramatically reduces the time, cost and complexity of tagging content
  • Gives you more time to do what you do best

Benchmarking

We have benchmarked our Document Classification Text AI Service against the Reuters RCV1 corpus and the arXiv Academic Paper Dataset (AAPD), and achieved impressive results. Read more about the benchmarks here.

Our Document Classification Text AI Service suits almost any text classification problem, in any language. Some of its primary use cases are:

Publishing

Our Document Classification Text AI Service was originally built to tackle the challenges that publishers and media organisations face with automatic and human-assisted content tagging: reacting quickly to new terms (tags) as they break in the news, automatic on-the-fly training, consistency, disambiguation, and editorial tagging style. All of this is handled out of the box by our service.

Legal

Our Document Classification Text AI Service is well suited to classifying legal texts and documents, or parts of documents such as contract clauses. Once trained, it can be used to automate the tagging of incoming legal documents, as well as archives of existing documents. We have demonstrated the Document Classification Text AI Service on a Hansard corpus of some 200,000 parliamentary interventions for the 2005/6 period, classified with more than 7,200 categories from the parliamentary library classification vocabulary, and achieved extremely accurate results.

Healthcare and Life Sciences

Through our work with Cochrane, we have extensive experience with bioinformatics vocabularies such as SNOMED-CT, MeSH, and MedDRA. Our Document Classification Text AI Service is ideally suited to the automated classification and tagging of content against these vocabularies, whether for classifying clinical trials, patient records, or general medical documents.

Free Trial

If you are a publisher, or work with high volumes of content or large content archives, and are building or thinking of building your own automated tagging platform, then our Document Classification Text AI Service is an ideal solution. You can be up and running in hours, and dramatically reduce the total cost of ownership compared with building, scaling, and hosting your own AI/NLP-based classification solution.

Please get in touch for a free trial.

Data Language have great experience in information architecture for broadcasters, which is quite rare. Their enthusiasm has rubbed off on my team, which is a real bonus.

Head of Recommendations and Metadata, Sky TV
Oliver Bartlett
100k

Our Text AI Classification Service scales to >100k parallel predictive micro-models, and returns an ensemble prediction result in under 1 second.
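As a rough illustration of the idea of many small per-tag models whose outputs are combined into one prediction, here is a minimal Python sketch. It is not our production implementation: the keyword-counting "models", the `predict_tags` function, and the 0.5 threshold are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def make_keyword_model(keywords):
    """Toy stand-in for a trained per-tag micro-model (hypothetical).

    Returns a scorer giving the fraction of the keywords found in the text.
    """
    def score(text):
        words = set(text.lower().split())
        return sum(k in words for k in keywords) / len(keywords)
    return score


# One small model per tag in the vocabulary (illustrative tags only).
MODELS = {
    "health": make_keyword_model(["hospital", "nurses"]),
    "transport": make_keyword_model(["rail", "trains"]),
}


def predict_tags(text, models, threshold=0.5):
    """Score the text with every per-tag model in parallel, then combine
    the scores by keeping each tag whose score reaches the threshold."""
    with ThreadPoolExecutor() as pool:
        futures = {tag: pool.submit(model, text) for tag, model in models.items()}
        scores = {tag: future.result() for tag, future in futures.items()}
    return sorted(tag for tag, s in scores.items() if s >= threshold)


print(predict_tags("Funding for hospital nurses was debated", MODELS))
```

Because each tag has its own independent scorer, adding a new tag in this sketch only means adding one more entry to `MODELS`; nothing else needs retraining.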

0.83

We benchmarked our Tagmatic service against Hansard data: a corpus of ~200,000 parliamentary interventions for the 2005/6 period, classified with ~7,200 categories. It achieved an F1 score of 0.83.
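For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below shows one common way to compute it for multi-label tagging, micro-averaging over all documents. The averaging method and the tiny sample data are assumptions for illustration only; they are not the benchmark data or necessarily the averaging used for the figure above.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over parallel lists of gold and predicted tag sets."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # tags correctly predicted
        fp += len(p - g)   # tags predicted but not in the gold set
        fn += len(g - p)   # gold tags that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only: two documents with editor-assigned and predicted tags.
gold = [{"health", "nhs"}, {"transport"}]
predicted = [{"health"}, {"transport", "rail"}]
print(round(micro_f1(gold, predicted), 3))
```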

Frequently Asked Questions

What are Data Language Text AI Services?

Data Language Text AI Services are commercial SaaS tools for sophisticated, language-agnostic text classification and recommendation using AI and machine learning. They are fully managed cloud services that use your own dictionaries, taxonomies, vocabularies or knowledge graph entities. They are built on pure machine-learning algorithms and thus avoid problems such as disambiguation. They are self-optimizing, and evolve as your content evolves.

What languages do Data Language Text AI Services support?

Data Language Text AI Services are language and character set agnostic. They work out of the box on all Latin character set languages, as well as Cyrillic, Greek, Arabic, and any other alphabet-based language. They have not yet been deployed on Chinese text or Japanese kanji, but if you have a project for classifying text in these scripts, we would be happy to work with you.

How do I integrate Data Language Text AI Services with my publishing system?

Data Language Text AI Services are easy to integrate. They have powerful APIs for training and prediction. Documents are sent to the training endpoint for on-the-fly training, and to the predict endpoint for sub-second prediction. We also have a visual playground that can be used for experimenting with and trialling the services. All the infrastructure is managed by us, so you can be up and running in just hours.
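To give a feel for the train-then-predict flow, here is a minimal Python sketch that builds the two HTTP requests. Everything specific in it is an assumption for illustration: the base URL, the `/train` and `/predict` paths, the JSON field names, and the bearer-token header are hypothetical placeholders, not the documented API. Please refer to the API documentation supplied with your trial for the real details.

```python
import json
from urllib.request import Request

# Hypothetical placeholder; the real base URL comes with your account.
BASE_URL = "https://api.example.com/v1"


def _json_post(path, api_key, payload):
    """Build (but do not send) a JSON POST request."""
    return Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def build_train_request(api_key, text, tags):
    """Training: send a document together with the tags editors assigned."""
    return _json_post("/train", api_key, {"text": text, "tags": tags})


def build_predict_request(api_key, text):
    """Prediction: send an untagged document to receive suggested tags."""
    return _json_post("/predict", api_key, {"text": text})


train_req = build_train_request("MY_KEY", "Rail fares to rise in January", ["transport"])
predict_req = build_predict_request("MY_KEY", "Commuters face new rail fares")
print(train_req.get_method(), train_req.full_url)
print(predict_req.get_method(), predict_req.full_url)
```

In a real integration, each request would be sent with `urllib.request.urlopen(request)` (or an HTTP client of your choice) and the JSON response parsed for the returned tags.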

What sectors do Data Language Text AI Services work for?

Data Language Text AI Services can be applied to almost any text classification problem, and are ideally suited to domains such as News, Sport, and Lifestyle/Entertainment Publishing. Our Text AI Services also work in Law and Legal Tech for classifying legal texts, contracts and clauses, and in Healthcare and Life Sciences for classifying documents against bioinformatics vocabularies such as SNOMED-CT, MeSH, and MedDRA (as a computer-aided coding system).