Translate.Wonk

Translating corporate language with AI

Integration into CMS and system landscape

High availability of services

Data Protection & Data Security

Trained for the company

Glossaries & stop words as the foundation.

Team session between CMS editors from different countries using wonk.ai

Maintaining glossaries, lengthy approval processes, and media breaks between email, Word, CMS, and external systems are finally a thing of the past.

We use your company’s existing knowledge and current translations from websites, documents, translation memory systems, and other sources as training data for enterprise-specific translation models. This lets us deliver machine translation in real time and at the highest quality.

Reliable, secure, efficient — and continuously learning.

What can Translate do?

Translate combines the economics of machine translation with the quality of human translation.

Website Projects and Content – .XLIFF

Translate reads .xml, .xliff and other formats that a CMS can export and translates the content true to format.

Glossaries & Stop-Words

For ad-hoc changes, Translate offers glossary and stop-word maintenance. Glossaries hold word equivalents in the respective language, similar to a dictionary. Stop words keep proper names and product names from being translated.
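One common way to enforce glossaries and stop words around a translation engine is to mask the terms with placeholders before translation and restore them afterwards. The sketch below illustrates that idea only; it is not Translate's actual implementation. The function `translate_with_terms`, the placeholder format, the product name "AcmePack", and the stand-in engine `fake_translate` are all hypothetical.

```python
import re

def translate_with_terms(text, translate, glossary=None, stop_words=None):
    """Translate text while enforcing glossary terms and protecting stop words."""
    # Stop words map to themselves; glossary terms map to their target equivalent.
    replacements = {word: word for word in (stop_words or [])}
    replacements.update(glossary or {})

    masked = text
    restore = {}
    # Mask longer terms first so a short term cannot split a longer one.
    for index, term in enumerate(sorted(replacements, key=len, reverse=True)):
        placeholder = f"__TERM{index}__"
        pattern = r"\b" + re.escape(term) + r"\b"
        masked, count = re.subn(pattern, placeholder, masked)
        if count:
            restore[placeholder] = replacements[term]

    translated = translate(masked)
    # Put the protected or glossary-defined terms back in.
    for placeholder, target in restore.items():
        translated = translated.replace(placeholder, target)
    return translated

def fake_translate(text):
    # Hypothetical stand-in for a real engine: word-for-word German-to-English lookup.
    lookup = {"liefert": "delivers", "Saatgut": "seeds"}
    return " ".join(lookup.get(token, token) for token in text.split())

result = translate_with_terms(
    "AcmePack liefert Saatgut",
    fake_translate,
    glossary={"Saatgut": "seed material"},  # preferred corporate term
    stop_words=["AcmePack"],                # product name, never translated
)
print(result)  # AcmePack delivers seed material
```

A real engine may alter or drop placeholders, so a production setup would need to verify that every placeholder survives translation.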

Office documents – Word, Excel, PowerPoint

Classic documents that arise in daily office work can be translated with Translate while keeping their formatting and design.

Initial training in your own company language

At the beginning of the collaboration, we train Translate on your existing content from websites, translation memory systems (TMS), documents of any kind and existing glossaries.

Downloads & print templates – .PDF files

Use Translate for already completed documents in .pdf format and translate them into the desired languages while retaining formatting and layout.

Training Data Storage & Continuous Learning

Via the API and the web frontend, you can adapt and improve the translations. Through proofreading, your corporate translation improves with every training run.

Catalogs, brochures and printed matter – .IDML

This content is usually created in InDesign and exported in .idml format. Translate translates the content so that it remains editable and is preserved in InDesign format.

100% GDPR compliant

Integrated through API in Microsoft Azure. Data in EU & Germany – or on-premise if desired

But – there are existing solutions.

Companies have translation processes and solutions, but that doesn’t mean they’re good.

Time-to-market – translations take weeks
Quality – translation results are not good enough
Efficiency – the translation process is very manual with many media breaks
Cost – translations are too expensive at €0.15 per word

We can do better. 

The quality of previous translations with the efficiency of machine translation.

+ 50% faster Time-to-Market
+ 80% lower costs
+ 70% better quality

How is this possible? 

By training machine translation in your corporate language. 

Collection – capturing language data from websites, glossaries, language databases, documents, and TMS exports
Extraction – extracting language pairs from the collected data
Processing – validating, cleaning, and transforming the language data for training
AI training – training the language models on the translation knowledge
Evaluation – model evaluation by the customer in a blind model test
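The processing step (validating and cleaning extracted pairs) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the wonk.ai pipeline: the function name `clean_pairs`, the filter rules, and the thresholds are hypothetical choices of the kind typically used to prepare sentence pairs for training.

```python
import re

def clean_pairs(raw_pairs, min_chars=3, max_length_ratio=3.0):
    """Validate, clean, and deduplicate (source, target) sentence pairs."""
    seen = set()
    cleaned = []
    for source, target in raw_pairs:
        # Normalize whitespace left over from HTML or document extraction.
        source = re.sub(r"\s+", " ", source).strip()
        target = re.sub(r"\s+", " ", target).strip()
        # Drop empty or very short segments.
        if len(source) < min_chars or len(target) < min_chars:
            continue
        # Drop untranslated segments where the target equals the source.
        if source == target:
            continue
        # Drop pairs with an implausible length ratio (likely misaligned).
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_length_ratio:
            continue
        # Drop duplicates, ignoring case.
        key = (source.lower(), target.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((source, target))
    return cleaned

raw = [
    ("Guten  Tag", "Good day"),
    ("Guten Tag", "Good day"),       # duplicate after whitespace normalization
    ("OK", "OK"),                    # too short
    ("Impressum", "Impressum"),      # untranslated
    ("Kontakt", "Contact us today to learn more about our full product range"),  # misaligned
]
print(clean_pairs(raw))  # [('Guten Tag', 'Good day')]
```

Filters like these trade volume for quality, so the thresholds would need tuning per language pair and data source.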

Training of corporate language models.

Translate learns your corporate language.

What the training can do:
Learning the company’s special terms, tonality and proper names – for the highest translation quality
Continuous learning and improvement in operations
Processing of all data sources for training – whether websites, TMS, glossaries, documents or other sources

Quality

70% better translation quality

Significantly better than generic models

We tested:

How powerful is a custom-trained model compared to a generic model from DeepL?

Small spoiler: the translations are 70% better.


Test case (DE-EN):

The performance of the model trained by wonk.ai was tested against DeepL using the customer’s data.

In the customer’s domain and customer-specific language, the model trained by wonk.ai clearly outperformed DeepL.

2,500 records from the customer domain were evaluated.

365,691 sentences obtained for training

Data sources: customer websites & TMS export

BERT score used to calculate semantic similarity to the customer’s reference translations

Reported result categories (per sentence, by BERT score):

Better variant DeepL (sentences and %) – higher BERT score
Better variant wonk.ai (sentences and %) – higher BERT score
Same rating (sentences and %) – same BERT score
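A per-sentence comparison of this kind can be tallied as sketched below. The sketch assumes each system's translations have already been scored against the reference translations (for example with an open-source BERTScore implementation) and only counts which system scored higher. The function `tally_wins` and the toy scores are hypothetical and are not the test results.

```python
def tally_wins(scores_a, scores_b, tolerance=0.0):
    """Count sentences where system A scores higher, B scores higher, or both tie."""
    if len(scores_a) != len(scores_b):
        raise ValueError("Both systems must be scored on the same sentences")
    wins_a = wins_b = ties = 0
    for a, b in zip(scores_a, scores_b):
        if abs(a - b) <= tolerance:
            ties += 1
        elif a > b:
            wins_a += 1
        else:
            wins_b += 1
    total = len(scores_a)

    def share(count):
        # Percentage of all sentences, rounded to one decimal place.
        return round(100 * count / total, 1) if total else 0.0

    return {
        "a_better": (wins_a, share(wins_a)),
        "b_better": (wins_b, share(wins_b)),
        "same": (ties, share(ties)),
    }

# Invented toy scores for four sentences (A = generic model, B = custom model).
generic_scores = [0.91, 0.88, 0.90, 0.85]
custom_scores = [0.93, 0.88, 0.95, 0.80]
summary = tally_wins(generic_scores, custom_scores)
print(summary)
# {'a_better': (1, 25.0), 'b_better': (2, 50.0), 'same': (1, 25.0)}
```

A win count shows how often one system is ahead but not by how much, which is one reason to pair it with human evaluation.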

Decide on the quality.

Independent in your own evaluation environment.

After the model training, your stakeholders and translation managers get their own access to your separate evaluation environment. There they can evaluate the quality of the translations without knowing which source produced them.

In this way, you as the project manager receive the highest level of independent evaluation and, as a consequence, a very high level of acceptance of the trained models.

And if the evaluation is not positive?
 
This can happen – usually with weaker base models and little training data.
Then everyone involved knows right from the start that the models need further improvement.

With our continuous process of training data collection and enrichment, you can collect enough data over time to successfully train your own model.

Who uses Translate?

Translate supports medium-sized businesses and corporations in their internationalisation.

KWS Saat SE & Co. KGaA

Stock-exchange listed plant breeding and biotechnology company.

The world’s fourth-largest seed producer by revenue from agricultural crops.

Revenue: €1.54 billion (2021–2022).

Employees: over 5000

70 countries

MULTIVAC GROUP

MULTIVAC is a solution provider for the packaging and processing of foodstuffs, medical and pharmaceutical products as well as consumer and industrial goods of all kinds.

Revenue: EUR 1.37 billion (2022)

Employees: 7000

84 locations, 160 countries

Phase Plan Training & Operations

From data to translation

01

Data exchange

wonk.ai receives the list of target languages and access to the data sources available for training.

02

Data Checkup

wonk.ai checks the quality of the language data and the number of language pairs that can be obtained.

03

Specifying the testers

The company determines which stakeholders and experts in the company will review the language models and evaluate the results – compared to previous translations or alternative solutions.

04

Training of language models

wonk.ai extracts the language data, validates and cleans the training set, trains the language models, and evaluates them mathematically.

05

Evaluation of the results

The customer’s testers evaluate the language models within their own assessment environment and provide feedback on individual results.

06

Commissioning

Once the language models have been initially approved, they can be put into operation directly and can be used via the customer’s own web environment. The trained models can also be integrated into third-party systems via the API and can thus be used in the entire system landscape.

The entire process takes between two and four weeks.

Demo video – .idml translation

How can you translate existing InDesign (.idml) documents with Translate?

Demo video – .pdf translation

How can you translate existing PDF documents with Translate?

FAQ – Questions & Answers

We are happy to answer any questions you may have in a personal introductory meeting.

If you’re in a hurry, here are some questions we’ve been asked.

Yes. It is also possible to operate our services on-premise in your own infrastructure.

By default, we operate our services in a Microsoft Azure cloud architecture, fully encrypted and accessible only to your company.

Some IT departments prefer to run our models in their own private cloud, such as Microsoft Azure, AWS or Google Cloud. This is possible but requires considerable project effort.

It is also possible to operate completely on-premise in the company’s own data center, which requires a one-time project effort, an analysis of the use cases and the corresponding hardware with GPUs for hosting.

So far, we have covered 50 different languages in our projects. We are able to deliver well over 100 languages in high quality. Send us a list of the desired language pairs and we will give you a concrete assessment for your project.

Through our API architecture, we have ensured that your models can be connected quickly and reliably to all relevant systems. Connections implemented so far range from content management systems, e-shops and product information systems to office software such as Word and Excel.

Of course, with our separate web frontend, you can always use the functionalities and your models across systems.

To train the language models, we extract sentence pairs in the source and target languages.

We obtain these sentence pairs from existing content on the company’s websites, from publications, databases, translation memory systems, glossaries and much more.

We are able to process a wide range of formats and data and make them usable for training.

Most companies have enough language data from the past for training.

We can collect this language data and make it usable.

Through our processes, we are usually able to obtain 10,000 to 300,000 sentence pairs per language.

In the training projects, we check at the beginning which languages have enough training data and communicate the status early, so that there is transparency from the start.

The quality of translations and their improvement can be calculated and expressed mathematically. Of course, we deliver that.

In our experience, acceptance by specialist departments, country teams and translators matters more.

That’s why we’ve created an evaluation environment where your company’s employees can test and evaluate the trained models, compared to the previous manual translations and to other untrained generic translation services. The qualitative evaluation by your colleagues is then the basis for deciding which trained models are ready for translation.

Translation memory systems are designed to help translators and language managers deliver consistent translation quality. TMS are also intended to reduce translation costs, because strings that have already been translated are not resubmitted for translation. Instead, when there is a sufficient character-level match, the previously translated string is returned.

This helps reduce translation costs and keep translation quality constant. The approach makes sense when many different, frequently changing human translators work with the same translation knowledge.
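The character-level matching that a translation memory performs can be sketched with Python's standard library. This is a simplified illustration under assumptions: the function `tm_lookup`, the 0.85 threshold, and the sample memory are hypothetical, and commercial TMS products use their own fuzzy-match algorithms.

```python
from difflib import SequenceMatcher

def tm_lookup(segment, memory, threshold=0.85):
    """Return (translation, similarity) for the closest stored segment, or None."""
    best_source, best_ratio = None, 0.0
    for source in memory:
        # ratio() is a character-level similarity between 0.0 and 1.0.
        ratio = SequenceMatcher(None, segment, source).ratio()
        if ratio > best_ratio:
            best_source, best_ratio = source, ratio
    if best_source is not None and best_ratio >= threshold:
        return memory[best_source], best_ratio
    return None  # no sufficient match: the segment needs a new translation

# Hypothetical memory with one previously translated string.
memory = {"Bitte kontaktieren Sie uns.": "Please contact us."}

hit = tm_lookup("Bitte kontaktieren Sie uns!", memory)   # differs only in punctuation
miss = tm_lookup("Wir liefern weltweit.", memory)        # unrelated sentence

print(hit[0])   # Please contact us.
print(miss)     # None
```

The lookup only reuses what was stored: anything below the threshold still has to be translated from scratch.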

wonk.ai language models combine the previous translation knowledge (like a TMS) with the feedback of the proofreaders. This means that a translation memory system is not needed for many applications.

Translation pricing is based on the number of trained language models and the total number of words translated. On average, wonk.translate is 80% cheaper than previous translation solutions.

Machine translation is always useful when many translations in many languages are needed in a short period of time.

This is the case, for example, when global content campaigns are to be rolled out or when changing CMS and relaunching websites. Whenever a lot of new content is created or edited and is to be translated in a short time, machine translation is particularly useful.

For context:
A typical rollout for a website relaunch—assuming 400 content pages per language and 10 target languages—amounts to about 2,000,000 words (2M). That’s a substantial volume where getting support makes sense.
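The arithmetic behind this figure can be worked through as a short calculation. The 500 words per page is implied by the numbers above rather than stated, and the cost lines simply combine the €0.15 per word and "80% lower costs" figures quoted elsewhere on this page. It is an illustration, not a price quote.

```python
pages_per_language = 400
target_languages = 10
total_words = 2_000_000

total_pages = pages_per_language * target_languages   # 4,000 pages overall
words_per_page = total_words // total_pages           # implied average: 500 words

# Integer cents avoid floating-point rounding in the cost figures.
price_per_word_cents = 15   # EUR 0.15 per word, as quoted for conventional translation
savings_percent = 80        # "80% lower costs" claim

conventional_cost_eur = total_words * price_per_word_cents // 100
reduced_cost_eur = conventional_cost_eur * (100 - savings_percent) // 100

print(total_pages)            # 4000
print(words_per_page)         # 500
print(conventional_cost_eur)  # 300000
print(reduced_cost_eur)       # 60000
```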

Absolutely. With wonk.ai you can translate PDF files quickly and conveniently. You can translate .pdf and many other file formats either via the API or directly in the web frontend using drag-and-drop.

Automatic translation also works for classic Office files (PowerPoint, Excel, Word) as well as for specific formats such as InDesign files for prepress.

Any questions?

Time for a personal conversation.

We would like to support you, your company and your editorial team with AI. Questions and topics often come up that are best clarified in conversation. I’m happy to help you.