Online MT Tools and confidentiality
Thread poster: Anmol
As a computer engineer, I'm convinced MT has a role to play in translation. I'm equally convinced that machines can never replace human translators, since translation involves language, easily one of mankind's most complex cognitive skills.
I've been trying out some MT tools over the past couple of years. One of the leading tools on the market, which comes with a trial version, turned out not to meet my expectations just yet. The biggest plus was that it could be installed and run locally off my laptop.
I've tried Google Translate on-line, being careful to translate only sentences or sentence fragments at a time, with all confidential information replaced by fabricated place-holders such as "ABC". I'm quite impressed with Google Translate, and it definitely speeds up the translation process. However, they don't have a downloadable version, and it's clear why they don't, since systems based on the statistical approach feed off a growing on-line corpus by design. Neither does Systran, another leading tool.
Using on-line software such as Google Translate for MT becomes inefficient, since you have to work one chunk at a time (be it a line or a paragraph); otherwise you risk compromising the confidentiality of the text.
Has anyone found an approach using on-line MT software which is efficient and does not lead to a breach of confidentiality? Or a good off-line MT tool? Paying for the software would not be an issue.

Samuel Murray | Netherlands | Member (2006) | English to Afrikaans + ... | Use GTT instead of GT | Apr 23, 2013
Anil Gidwani wrote:
Has anyone found an approach using on-line MT software which is efficient and does not lead to a breach of confidentiality?
Well, since you mention that you anonymise text before you paste it into Google Translate, I would recommend that you try Google Translate Toolkit. It allows you to upload whole files instead of just a few lines, and if you believe Google then your content will not be shared with anyone.

Rolf Keller | Germany | English to German

Joakim Braun | Sweden | German to Swedish + ...
Confidentiality is not achieved by removing company names. Given a long enough text an attentive reader who knows the field may well figure out who it's about.

Samuel Murray | Netherlands | Member (2006) | English to Afrikaans + ...
Rolf Keller wrote:
Samuel Murray wrote:
if you believe Google [when using Google Translate Toolkit], then your content will not be shared with anyone.
https://developers.google.com/terms/?hl=en-EN
Search for "Submission of content".
Yes, but the Google Translate Toolkit is not an API, and that link of yours relates to APIs only.

Anmol | THREAD STARTER | Which is why feeding an entire text is questionable | Apr 25, 2013
Joakim Braun wrote:
Confidentiality is not achieved by removing company names. Given a long enough text an attentive reader who knows the field may well figure out who it's about.
I agree. It's not so easy to anonymize anyway: certain texts are replete with acronyms, names of departments, names of products etc.
A selective line-based usage of an online engine is clearly the safest approach at this time. Unless an off-line version is available, which is doubtful.
Does Google Translate offer an off-line version? Are there any off-line products that are good enough to be considered commercially usable? I used a trial version of PromT a year or so ago, and decided it didn't meet my expectations at the time. Systran does not have a trial version. Does anyone have an opinion on Systran?

Rolf Keller | Germany | English to German | API or no API | Apr 25, 2013
Samuel Murray wrote:
Yes, but the Google Translate Toolkit is not an API, and that link of yours relates to APIs only.
Ok, but the toolkit includes an API:
https://developers.google.com/translator-toolkit/
BTW, I assume that the web editor of the toolkit uses that API. Unfortunately the terms and conditions for this editor seem to be a secret, at least for non-registered users like me.

Samuel Murray | Netherlands | Member (2006) | English to Afrikaans + ... | How about anonymising *and* randomising? | Apr 25, 2013
Anil Gidwani wrote:
It's not so easy to anonymize anyway: certain texts are replete with acronyms, names of departments, names of products etc.
...
A selective line-based usage of an online engine is clearly the safest approach at this time.
Well, a malicious MT provider would still be able to recreate much of the text by simply keeping track of who submits what. I can see two ways of overcoming that: (a) the simplest but least effective way is to randomise the sentences that you submit to the MT server; (b) if you can write a program that many people use, then such a program can send each segment to the MT server via a random other user's connection, so that the MT service can't create a list of segments that belong to a user.
You can also obfuscate segments by mixing them with segments from other sources (e.g. if you have a program that can add random sentences from the internet that use wording similar to your source text).
My opinion is that absolute confidentiality can't be maintained and that the translator should take reasonable steps to ensure confidentiality. Anonymising the text is one such method. Randomising the segments that are submitted is another such method. And so is obfuscation.
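To make the randomising idea (a) concrete, here is a minimal Python sketch. It only shuffles the segments before submission and puts the translations back in source order afterwards. The function names are made up for illustration, and the actual call to an MT service is deliberately left out: plug in whatever engine you use. As said above, this is the least effective method, since the provider still receives every segment.

```python
import random


def shuffle_segments(segments, seed=None):
    """Return (shuffled, order), where shuffled[i] == segments[order[i]]."""
    rng = random.Random(seed)
    order = list(range(len(segments)))
    rng.shuffle(order)
    shuffled = [segments[i] for i in order]
    return shuffled, order


def restore_order(translated, order):
    """Put translated segments back into the original source order."""
    restored = [None] * len(order)
    for position, original_index in enumerate(order):
        restored[original_index] = translated[position]
    return restored


if __name__ == "__main__":
    source = ["First sentence.", "Second sentence.", "Third sentence."]
    shuffled, order = shuffle_segments(source)
    # Hypothetical stand-in for the MT step: here we just upper-case.
    translated = [segment.upper() for segment in shuffled]
    print(restore_order(translated, order))
```

Keeping the `order` list on your own machine is what lets you reassemble the text; the MT server only ever sees the shuffled sequence.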
For my language combination, anonymising would be easy, because my source language is English, and English uses capital letters mostly for things that would normally need to be removed during an anonymisation process anyway. So I can simply replace all words that have capital initials with placeholders. You have a bit of a problem with German...
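The capital-letter approach for English could look roughly like the Python sketch below. It is an illustration only: the `PH1`, `PH2` placeholder format and the function names are my own assumptions, and the pattern also catches sentence-initial words and the pronoun "I", so it over-replaces. It does nothing about lower-case details that might still identify the client.

```python
import re

# Any word that starts with an ASCII capital letter.
CAPITALISED = re.compile(r"\b[A-Z][A-Za-z0-9]*\b")
# Placeholders produced by anonymise(), e.g. PH1, PH2.
PLACEHOLDER = re.compile(r"\bPH\d+\b")


def anonymise(text):
    """Replace capitalised words with numbered placeholders.

    Returns the anonymised text and a dict mapping placeholder -> original.
    """
    placeholders = {}  # original word -> placeholder

    def replace(match):
        word = match.group(0)
        if word not in placeholders:
            placeholders[word] = f"PH{len(placeholders) + 1}"
        return placeholders[word]

    anonymised = CAPITALISED.sub(replace, text)
    originals = {ph: word for word, ph in placeholders.items()}
    return anonymised, originals


def restore(text, originals):
    """Put the original words back in place of the placeholders."""
    return PLACEHOLDER.sub(
        lambda m: originals.get(m.group(0), m.group(0)), text
    )


if __name__ == "__main__":
    safe, originals = anonymise("the contract between Acme and Globex")
    print(safe)
    print(restore(safe, originals))
```

The mapping stays on your own machine, so the placeholders can be swapped back after the MT output comes back, provided the engine leaves them untouched.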