ABBYY Aligner
Thread poster: Kirstine Rennie
Hello,
My translation firm is looking to align a very large amount of work. I had already posted a question regarding this in the Across forum, assuming that it would have to be done in a CAT tool.
Another member responded saying that he thought it would be easier/quicker in something like ABBYY Aligner. My question is: has anyone ever used ABBYY Aligner and, if so, what are the advantages of using it instead of aligning translations in a CAT tool? Is it significantly faster?
Any help would be much appreciated!
Kirstine
Michael Beijer | United Kingdom | Local time: 10:27 | Member | Dutch to English + ...
I also typed out a long post that the ProZ software destroyed instead of posting. Serves me right for typing in the browser instead of a text editor.
Anyway, how much material are we talking about (thousands of segment pairs, tens of thousands, maybe hundreds of thousands or millions?) and how good a result do you want (is 95% correct pairing good enough, or do you want 100.00% correct)? Are you okay with discarding subpar documents or sections of texts, or do you want every last sentence extracted?
Your choices depend on these factors. In any case, you need an aligner with a good autoaligner algorithm. Most CAT tools' aligners fail at this hurdle. Then what sort of a review/edit you do after autoalignment depends on your needs.
[Edited at 2015-01-15 15:00 GMT]
intelligent and fast | Jan 15, 2015
For my language pair, it seemed quite intelligent and fast.
I had the feeling that it grasped the meaning of each sentence when aligning the pairs. However, there were some alignment mistakes.
To be frank, I haven't seen a better aligner so far, though I can't say that I have used many of them.
You can now try the ABBYY Aligner 2.0 trial, although it only works for 15 days and has some restrictions.
The version I used was 1.0.
Samuel Murray | Netherlands | Local time: 11:27 | Member (2006) | English to Afrikaans + ...
kirstinerennie wrote:
My translation firm is looking to align a very large amount of work. I had already posted a question regarding this in the Across forum, assuming that it would have to be done in a CAT tool.
No, it typically can't be done "in" a CAT tool, although some CAT tools are accompanied by alignment programs. For very large amounts of work, the aligners that I have seen that come with CAT tools may not be suitable, though.
LF Aligner is a freeware option that tends to get good reviews, though I haven't really used it myself. It is mainly a non-GUI aligner; the latest version does have a GUI, but it's not the most user-friendly one that I have seen.
If you want high quality TMs that you can trust 100%, then you'd have to check the alignment manually, and that is when it becomes necessary to have an alignment GUI that is easy to use.
Another member responded saying that he thought it would be easier/quicker in something like ABBYY Aligner.
The product brief looks impressive, but you won't be able to tell whether it is "good" with your very large amount of work, because the trial version is limited to 1000 segments (or 50). For EUR 100 it is not cheap. The installer for the trial version is 300 MB (ouch!).
==Added:
Okay, I tested it. It is a "smart" aligner, which is a good thing, but it misses a crucial function: the ability to insert blank cells. It can move cells up and down only if there is a blank cell above or below that cell, but if there is no blank cell, and there is a misalignment, then you can't fix it using the keyboard shortcuts, but must fix it by manually copy/pasting text from one cell into another. That is *bad*.
There is a batch function (not tested), but I just dragged and dropped two files into it (EN and AF). It doesn't support AF, but recognised the AF file as NL, which is good enough. It only merges cells if you select them, and unfortunately the shortcut for merging is in an odd position, but it's not the end of the world. There is no simple shortcut for moving between whole cells, but the down and up arrows move between cells easily. It does not seem possible to delete a cell without deleting the entire row. Ctrl+Z works! PgDn and PgUp move through the file (also a good thing).
The screenshot in the product brief showed that the program will mark possible misalignments in colour, but it didn't do it for me.
[Edited at 2015-01-15 15:45 GMT]
Hi everyone,
Thanks so much for all your helpful replies, very much appreciated.
We are currently looking at all the options as it's a very big alignment project (around 2 million words!).
Thanks again,
Kirstine
kirstinerennie wrote:
Hi everyone,
Thanks so much for all your helpful replies, very much appreciated.
We are currently looking at all the options as it's a very big alignment project (around 2 million words!).
Thanks again,
Kirstine
That's a pretty big alignment project. If 2M words is in one language, that'll probably work out to about 200K segment pairs. That's already in the size range where I'd normally do an autoalignment with only a partial manual review (more a series of spot checks looking for potential quality improvement tricks than an actual review, re-running the autoalignment if I find something system-level and fixable). Still, a full manual review is not outside the realm of possibility. It's just a huge job. I reckon I have probably done manual review on 100K+ segments so far, for my personal use, for a hobby project (aligning public domain literary works) and for paying clients (translators who needed TMs made from translated documents)... But then I'm probably quite unusual in terms of my tolerance for certain types of monotonous work and being able to review/fix alignments quickly.
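For anyone who wants to redo that back-of-the-envelope estimate with their own numbers, here is a minimal sketch. The figure of about 10 words per segment is simply the ratio implied by "2M words, about 200K segment pairs"; it is an assumption, and real averages vary with language and text type. The function name is made up for the example.

```python
def estimate_segment_pairs(total_words: int, words_per_segment: float = 10.0) -> int:
    """Rough number of segment pairs for a corpus of `total_words` source words.

    `words_per_segment` is an assumed average, not a measured value.
    """
    return round(total_words / words_per_segment)


if __name__ == "__main__":
    # 2 million source words at an assumed 10 words per segment
    print(estimate_segment_pairs(2_000_000))
```

Measuring the real average on a small sample of your own documents and plugging it in will give a better figure than the default.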
As to tools, I personally would use LF Aligner, but then I wrote it, so I'm obviously partial. AlignFactory and ABBYY seem to get good reviews, although if it works as Samuel describes, I would say ABBYY is out.
[Edited at 2015-01-16 20:26 GMT]
2nl (X) | Netherlands | Local time: 11:27
Transit Alignment tool | Jan 17, 2015
The Transit Alignment tool offers interesting ways to improve the alignment result, e.g. by using your dictionaries.
Quick Start: https://transitnxt.wordpress.com/2013/11/27/aligning-files-in-transit-nxt/
Full manual: http://tinyurl.com/qf56j5q
Use internal word list
Transit NXT uses an internal word list to assess the probability of the source and target segments being correctly matched.
The alignments are saved in the file align.adc under config\global in your Transit NXT installation folder.
If Transit NXT finds that the source-language segment contains an entry from the internal word list, it searches for the translation of the term in the target-language segment.
Use project dictionaries
Transit NXT uses the current TermStar dictionary to assess the probability of the source and target segments being correctly matched.
If Transit NXT finds that the source-language segment contains a term that has been added to the current dictionary, it searches for the translation of the term in the target-language segment.
Resource files mode (with comparison of markup segments)
Transit NXT compares markup segments during alignment, instead of text segments.
Use this option when aligning files with string IDs, perhaps for localisation projects.
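To illustrate why string IDs make alignment so much easier, here is a rough sketch. It is not how Transit NXT does it internally; the `key=value` resource format and the function names are assumptions made up for the example. The idea is simply that when both files carry the same IDs, the pairing can be read off the IDs and the text itself never has to be compared.

```python
def parse_resources(text: str) -> dict[str, str]:
    """Parse simple 'key=value' lines, skipping blanks and '#' comments."""
    entries: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def align_by_id(source_text: str, target_text: str) -> list[tuple[str, str, str]]:
    """Pair up strings that share an ID, in source-file order.

    IDs missing from either file are left out rather than guessed at.
    """
    source = parse_resources(source_text)
    target = parse_resources(target_text)
    return [(key, source[key], target[key]) for key in source if key in target]


if __name__ == "__main__":
    en = "menu.open=Open\nmenu.save=Save\nmenu.quit=Quit"
    nl = "menu.save=Opslaan\nmenu.open=Openen"
    for string_id, src, tgt in align_by_id(en, nl):
        print(string_id, "|", src, "|", tgt)
```

Note that the order of the strings in the target file does not matter here, and untranslated entries (like `menu.quit` in the example) simply drop out.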
standard feature | Jan 17, 2015 |
2nl wrote:
Use internal word list
Transit NXT uses an internal word list to assess the probability of the source and target segments being correctly matched.
The alignments are saved in the file align.adc under config\global in your Transit NXT installation folder.
If Transit NXT finds that the source-language segment contains an entry from the internal word list, it searches for the translation of the term in the target-language segment.
Use project dictionaries
Transit NXT uses the current TermStar dictionary to assess the probability of the source and target segments being correctly matched.
If Transit NXT finds that the source-language segment contains a term that has been added to the current dictionary, it searches for the translation of the term in the target-language segment.
That's been a standard feature of many autoalignment algorithms for many, many years. It's kind of an obvious method to use, so it's certainly not a selling point for any one algorithm. In many cases it's a drawback.
Alignment history lesson coming up, skip if uninterested: many different efforts were made to get away from this dictionary-based method in order to be able to align texts in language pairs where no good dictionary is immediately available (for the text pair in question, in the right format, to the person running the alignment). Perhaps the most widely used one is the Gale-Church algorithm developed in 1993. It is based on segment length: longer segments tend to correspond to longer segments, and shorter ones to shorter ones. If you go through the whole text trying to equalize the segment lengths, things start to fall into place. Some algorithms try to find identical strings in the two texts to use as anchors (e.g. proper names), some run the texts through an MT engine or deploy other tricks. Quite a few use a combination of methods. Hunalign, which testing has shown to be one of the best algorithms, uses a combination of the dictionary method and the Gale & Church algorithm. (It can actually run Gale & Church to get a rough alignment, automatically extract a dictionary from the aligned texts and then do a second alignment run with the freshly made dictionary.) LF Aligner uses hunalign as its alignment engine and comes with dictionaries for a wide range of language pairs. You can also add your own dictionary.
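To make the length-based idea concrete, here is a heavily simplified sketch. It is not the real Gale & Church algorithm (which uses a probabilistic cost model) and it is not hunalign's or LF Aligner's code. It only shows the general technique: dynamic programming over the two segment lists, preferring groupings whose character lengths are similar. The penalty values and the function name are arbitrary choices made for this example.

```python
def align_by_length(src: list[str], tgt: list[str]) -> list[tuple[list[str], list[str]]]:
    """Pair up segments by character length using dynamic programming."""
    # Allowed groupings: (source segments used, target segments used, fixed penalty).
    # The penalties are illustrative guesses that favour 1-1 pairs.
    beads = [(1, 1, 0.0), (2, 1, 1.0), (1, 2, 1.0), (1, 0, 3.0), (0, 1, 3.0)]

    def mismatch(a: list[str], b: list[str]) -> float:
        # Relative difference in total character length, between 0 and 1.
        la = sum(len(s) for s in a)
        lb = sum(len(s) for s in b)
        return abs(la - lb) / max(la, lb, 1)

    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[(0, 0)] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    # cost[i][j] = cheapest way to align the first i source and j target segments.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in beads:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + penalty + mismatch(src[i:ni], tgt[j:nj])
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)

    # Walk back from the end to recover the chosen groupings.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        pairs.append((src[i - di:i], tgt[j - dj:j]))
        i, j = i - di, j - dj
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    english = ["Hello.", "This is a fairly long sentence about alignment.", "Bye."]
    dutch = ["Hallo.", "Dit is een vrij lange zin", "over uitlijning.", "Doei."]
    for source_group, target_group in align_by_length(english, dutch):
        print(source_group, "<->", target_group)
```

In the example, the long English sentence has been split in two on the Dutch side; pairing it with both halves gives the closest length match, so the sketch merges them. A dictionary-based score could be added to the cost function in the same place as the length mismatch, which is roughly the sense in which the two methods get combined.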
2nl wrote:
Resource files mode (with comparison of markup segments)
Transit NXT compares markup segments during alignment, instead of text segments.
Use this option when aligning files with string IDs, perhaps for localisation projects.
That's a neat trick, which is also employed by multiple other aligners. LF Aligner doesn't do this (I tried to integrate an open source alignment engine that does this but couldn't get the alignment engine to work and abandoned the idea). In the case of XML files or similar, it could be very useful if it's implemented well. Its usefulness is limited to specific file types, though. E.g. it might 'work' with HTML files in that it might correctly pair up paragraphs... but most autoaligners will do that anyway. The really important bit is correctly pairing up sentences, and HTML markup probably won't help you do that at all.
ABBYY Aligner | Jan 23, 2018
Samuel Murray wrote:
Okay, I tested it. It is a "smart" aligner, which is a good thing, but it misses a crucial function: the ability to insert blank cells. It can move cells up and down only if there is a blank cell above or below that cell, but if there is no blank cell, and there is a misalignment, then you can't fix it using the keyboard shortcuts, but must fix it by manually copy/pasting text from one cell into another. That is *bad*.
There is a batch function (not tested) but I just dragged and dropped two files into it (EN and AF). It doesn't support AF, but recognised the AF file as NL, which is good enough. It only merges cells if you select them, and unfortunately the shortcut for merging is in an odd position, but it's not the end of the world. There is no simple shortcut for moving between whole cells, but the down and up arrows move between cells easily. It does not seem possible to delete a cell without deleting the entire row. Ctrl+Z works! PgDn and PgUp move through the file (also a good thing).
The screenshot in the product brief showed that the program will mark possible misalignments in colour, but it didn't do it for me.
[Edited at 2015-01-15 15:45 GMT]
I like ABBYY Aligner, which is now at version 2. It is simple and straightforward for TMX maintenance.
To get an extra source-target line, just select any segment and split it. You can also split the last line, copy and paste in any text (source and target), and have ABBYY align the text from there (it will automatically generate segments).
What I like in ABBYY Aligner 2 is that it allows me to view the whole text, have full control of alignment, manually correct both source and target, then keep it as an ABBYY project (custom ABBYY format) or else export either to bilingual RTF or to TMX.
The registered version allows bigger files, but even it has some file size limits (check the help file for the figures; sorry, I don't have them to hand). You can always split your work into "work1", "work2", etc. if it gets too big.
Stepan Konev | Russian Federation | Local time: 13:27 | English to Russian
Use 'Split segment' command | Jan 24, 2018
victorlage wrote:
it misses a crucial function: the ability to insert blank cells.
You can run the Split Segment command (Ctrl+Enter) at any point in any segment. If you put the cursor at the end of a sentence, this will just insert an empty pair of cells below.
There is no simple shortcut for moving between whole cells
Alt+arrows?
P.S. @victorlage,
Sorry... I did not notice that it was a quote from another user.
[Edited at 2018-01-24 06:09 GMT]
Susan Welsh | United States | Local time: 05:27 | Russian to English + ...
ABBYY Aligner | Jan 24, 2018
I have it, but found that it requires a lot of manual adjustment. On the rare occasions when I need to align something, I use LF Aligner. But I never do huge jobs such as yours, so my experience is not that relevant.