Managing a .csv file exported from a website (quite urgent) Thread poster: Sarah Ianieri
| Sarah Ianieri Italy Local time: 09:42 English to Italian + ...
Hi everyone, I have to translate a .CSV file exported from a website (I asked for other formats and the owner told me that this is not possible for them). Below is an example of the file:
31,377,it,NASTRO MONOADESIVO RIFLETTENTE PER INTERNO,"**ALU BAND** è il nastro **monoadesivo acrilico** sviluppato specificamente per la **sigillatura ermetica di barriere vapore alluminizzate**, come le membrane della serie **BARRIER ALU**.
When I save it in Excel, the formatting seems to get corrupted: the text goes from 3 lines to more than 10 lines. Excel could also be responsible for corrupting some of the characters due to encoding issues, and this could be the cause of my "special chars" problem.
So, before opening the CSV in Excel and messing it up, I opened it in a decent text editor and added a BOM. Then I imported the CSV into Excel, as opposed to opening it in Excel, and selected the comma as the delimiter, so I was able to see what it should look like. That let me see how best to handle it: with the Excel file type in Studio, just adding a couple of embedded content rules for the ** symbols.
But when I sent the translated file back to the website owner, he told me that the file is not well formatted. When I imported the .csv file into .xlsx, the text was split over more lines (in the source file there were just 3 lines), and they need the text back in 3 lines. I tried to do the inverse operation, but maybe something didn't work.
Can someone help me again and explain how to convert the file back to the format of the source? | | | Samuel Murray Netherlands Local time: 09:42 Member (2006) English to Afrikaans + ...
Sarah Company wrote:
When I save it in Excel, [it changes] from 3 lines into more than 10 lines.
1. How many columns are there supposed to be?
2. I'm not sure what you mean by "Import", but try this: optionally save the CSV file as UTF-16 LE, not UTF-8. Then open a blank Excel file and go to Data > (Get & Transform Data >) Get Data > From File > From Text/CSV. Then click Load.
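For anyone who would rather script the encoding step than rely on a text editor, here is a minimal Python sketch of what "adding a BOM" or "saving as UTF-16 LE" amounts to at the byte level. The sample text and the two helper names are made up for illustration; nothing here comes from the actual client file.

```python
import codecs


def with_utf8_bom(text: str) -> bytes:
    """Encode text as UTF-8 with a leading byte order mark (BOM)."""
    # The 'utf-8-sig' codec prepends the 3-byte BOM (EF BB BF) when encoding.
    return text.encode("utf-8-sig")


def as_utf16_le_with_bom(text: str) -> bytes:
    """Encode text as UTF-16 little-endian with an explicit BOM."""
    # 'utf-16-le' does not add a BOM on its own, so it is prepended manually.
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


sample = "31,377,it,è il nastro\r\n"  # hypothetical one-line sample
utf8_bytes = with_utf8_bom(sample)
utf16_bytes = as_utf16_le_with_bom(sample)

# Decoding with a BOM-aware codec returns the original text unchanged.
roundtrip_utf8 = utf8_bytes.decode("utf-8-sig")
roundtrip_utf16 = utf16_bytes.decode("utf-16")
```

Re-encoding only changes how the characters are stored; it does not touch the commas, quotes, or line breaks, so it cannot by itself explain the 3 lines becoming more than 10.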
[Edited at 2020-11-30 14:49 GMT] | | | Try using Google Sheets from your browser | Nov 30, 2020 |
Excel sometimes introduces errors.
So you could try using Google Sheets. It interprets Unicode characters more reliably.
Hope this helps | | | Luca Tutino Italy Member (2002) English to Italian + ... Common problem | Nov 30, 2020 |
Hi Sarah,
This is an old MS Excel problem, at least for the Italian version. It is possible that your text editor saved the file with the wrong character table. My usual workaround is the following:
1. Rename a copy of the file as xxx.csv.txt
2. Open Excel and open the txt file: select File > Open, select txt in the File type field at the bottom of the Open file dialog window, navigate to the txt file and select it
3. In the new dialog Select "Delimited file", comma, and confirm.
4. Save as Excel
I hope it works.
[Edited at 2020-11-30 16:15 GMT] | | | Sarah Ianieri Italy Local time: 09:42 English to Italian + ... TOPIC STARTER Already tried, but not the solution | Nov 30, 2020 |
Luca Tutino wrote:
1. Rename the file as xxx.csv.txt
2. Open Excel and open the txt file, select File > Open..., select txt in the File type field at the bottom of the Open file dialog window, navigate to the txt file and select it
3. In the new dialog Select "Delimited file", comma, and confirm.
4. Save as Excel
ciao Luca,
I've already tried this process, but it doesn't work: the sentences end up in more than 10 lines | | | Would it work to translate it in a text editor? | Nov 30, 2020 |
Hi Sarah,
Would it be possible to translate the .CSV file directly in a text editor like Notepad or Atom?
I am not sure what the file is used for on the website, but when I made small software programs with .CSV files, the .CSV files were used to store information, and the program would use each row as an array and each column as an element in the array. The formatting had to be perfect because the delimiters were used to select and write information.
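To illustrate the "row as an array, column as an element" idea, here is a small Python sketch using the standard `csv` module. The three-column layout and the sample rows are invented for the example and are only loosely modelled on the snippet Sarah posted.

```python
import csv
import io

# Hypothetical three-column data: id, language code, title.
good = 'id,lang,title\n31,it,"NASTRO, MONOADESIVO"\n'
bad = 'id,lang,title\n31,it,NASTRO, MONOADESIVO\n'  # comma in title not quoted


def rows(text):
    """Parse CSV text into a list of rows, each row a list of strings."""
    return list(csv.reader(io.StringIO(text)))


good_rows = rows(good)
bad_rows = rows(bad)

# With the title quoted, element 2 of the data row is the whole title.
title = good_rows[1][2]

# Without quotes, the comma splits the title and the row gains a field.
field_counts = [len(r) for r in bad_rows]
```

In the second case the header row has 3 elements and the data row has 4, which is the kind of mismatch that makes a program reading by column position pick up the wrong text.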
Sarah Company wrote:
So, before opening the CSV in Excel and messing it up, I opened it in a decent text editor and added a BOM. Then I imported the CSV into Excel, as opposed to opening it in Excel, and selected the comma as the delimiter, so I was able to see what it should look like. That let me see how best to handle it: with the Excel file type in Studio, just adding a couple of embedded content rules for the ** symbols.
I wonder if selecting the comma as the delimiter in Excel changed the formatting. Even if commas are used as delimiters in that file, there may be additional delimiters (which could be any symbol) and custom rules for when commas are not delimiters. For example, if there is a comma within a title, it would just be part of the title. | | | Sarah Ianieri Italy Local time: 09:42 English to Italian + ... TOPIC STARTER Unfortunately not | Nov 30, 2020 |
Peter Kovacik wrote:
Hi Sarah,
Would it be possible to translate the .CSV file directly in a text editor like Notepad or Atom?
Hi Peter,
Unfortunately not, I need to use SDL Trados. | | | Samuel Murray Netherlands Local time: 09:42 Member (2006) English to Afrikaans + ...
Sarah Company wrote:
I need to use SDL Trados.
1. Well, Trados' CSV filter is rather primitive (it does not support a large number of CSV dialects). For example, Trados' CSV filter can't handle line breaks in the middle of a field. Trados assumes that a line break is always the end of a record, and therefore if there is a quote character somewhere followed by a line break somewhere else, Trados regards it as a broken CSV file (which it isn't).
2. Sarah, if you have already translated some of this in Trados, and you are then able to get a correctly formatted source Excel file, can't you just use the TM from the previous job? Or is the main problem that you don't know how to convert a properly formatted Excel file back to the client's semicolon-delimited CSV dialect? | | | Luca Tutino Italy Member (2002) English to Italian + ... Open Office? | Nov 30, 2020 |
Another possible option might be OpenOffice, an open-source alternative to Microsoft Office. I found it quicker to install and lighter than expected, and sometimes helpful in this type of situation. | | | Samuel Murray Netherlands Local time: 09:42 Member (2006) English to Afrikaans + ...
Peter Kovacik wrote:
When I made small software programs with .CSV files, the .CSV files were used to store information, and the program would use each row as an array and each column as an element in the array. The formatting had to be perfect because the delimiters were used to select and write information.
That is true, but there is no single standard for CSV files. Different programs are able to understand different "dialects" of CSV... or not. I'm no expert, but I had to deal with CSV files a lot about two decades ago, and my experience is that these things can affect the CSV dialect:
1. Whether *all* fields are quoted:
All fields are quoted:
"this is field #1","this is field #2, and that's okay","this is field #3"
Only fields that need to be quoted are quoted:
this is field #1,"this is field #2, and that's okay",this is field #3
2. How to deal with quotes within fields
she said yes,"she said ""yes"" and no",she said no (doubled up inside, quotes outside)
she said yes,"she said \"yes\" and no",she said no (escaped inside, quotes outside)
she said yes,she said "yes" and no,she said no (as-is inside, no quotes outside)
she said yes,'she said "yes" and no',she said no (as-is inside, apostrophes outside)
she said yes,she said ""yes"" and no,she said no (doubled up inside, no quotes outside)
she said yes,she said \"yes\" and no,she said no (escaped inside, no quotes outside)
...etc.
3. Whether line breaks within fields are supported (and whether such fields need to be quoted):
this is field #1 of the first record,"this is
field #2 of the first record",this is field #3 of the first record
this is field #1 of the first record,this is
field #2 of the first record,this is field #3 of the first record
4. Whether the last field is followed by a delimiter or not.
5. Whether all records must have the same number of fields.
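Points 2 and 3 can be seen in action with a short Python sketch. It uses the standard `csv` module with its default settings (doubled-up quotes inside quoted fields, line breaks allowed inside quoted fields); the sample data is invented, and whether the client's file follows this particular dialect is an assumption.

```python
import csv
import io

# Hypothetical data: a header record plus two data records. The last field
# of each data record is quoted and contains line breaks; the first one also
# contains doubled-up quotes.
raw = (
    'id,lang,text\r\n'
    '31,it,"first line\r\nsecond line\r\nthird ""quoted"" line"\r\n'
    '32,it,"one\r\ntwo"\r\n'
)

# What a tool sees if it treats every line break as the end of a record.
physical_lines = raw.splitlines()

# What a parser sees if it allows line breaks inside quoted fields.
records = list(csv.reader(io.StringIO(raw, newline="")))

# The multi-line field comes back as one string with its line breaks intact
# and with the doubled-up quotes reduced to single quotes.
multi_line_field = records[1][2]
```

Here `physical_lines` has 6 entries while `records` has only 3, which is one plausible way for "3 lines" to turn into many more when a program splits on every line break.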
I recall trying to import a CSV file (in which all fields were quoted) into a program that expected quotes only around fields that contained commas. When I told the developer about this, his attitude was "your file is obviously not a real CSV file", which I understood to mean "in the village where I grew up, this is not a CSV file", but I kept quiet and did not tell him that I thought that.
[Edited at 2020-11-30 17:13 GMT] | | | CSV dialects | Nov 30, 2020 |
Samuel Murray wrote:
That is true, but there is no single standard for CSV files. Different programs are able to understand different "dialects" of CSV... or not. I'm no expert, but I had to deal with CSV files a lot about two decades ago, and my experience is that these things can affect the CSV dialect:
This is the first time I have heard of CSV dialects, but the three dialects I found are ‘excel’, ‘excel-tab’, and ‘unix’. The small program that I made was not used with other programs, so it did not have to correspond to any dialect nor did the information display properly in programs like Excel.
I am not sure which CSV dialect is used for Sarah’s file, but in case it is helpful, the settings listed for the ‘excel’ dialect in Python 3's csv module are:
delimiter: ,
doublequote: True
escapechar: None
lineterminator: '\r\n'
quotechar: "
quoting: 0 (QUOTE_MINIMAL)
skipinitialspace: False
strict: False | | |
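As a rough sketch of how those settings could be used to get a file back into its original shape, the Python snippet below reads records with the 'excel' dialect, changes only the text column, and writes the records out again with the same dialect. The sample data, the column layout, and the `translate` stand-in are all hypothetical, and it is an assumption that the client's file actually matches the 'excel' dialect.

```python
import csv
import io

# Hypothetical source: 3 records; some fields are quoted because they
# contain a line break or a comma.
source = (
    'id,lang,text\r\n'
    '31,it,"**ALU BAND**\r\nriga due"\r\n'
    '32,it,"testo, con virgola"\r\n'
)


def translate(cell: str) -> str:
    """Stand-in for the real translation step (hypothetical)."""
    return cell.replace("riga due", "line two")


# Read every record with the 'excel' dialect.
rows = list(csv.reader(io.StringIO(source, newline=""), dialect="excel"))

# Change only the text column and leave the record structure alone.
out_rows = [row[:2] + [translate(row[2])] for row in rows]

# Write back with the same dialect; fields that need quotes get them again.
buffer = io.StringIO(newline="")
csv.writer(buffer, dialect="excel").writerows(out_rows)
result = buffer.getvalue()

# Re-reading the output gives the same number of records as the source.
record_count = len(list(csv.reader(io.StringIO(result, newline=""))))
```

Note that the writer quotes only the fields that need it (`quoting: 0`), so if the source quoted every field, the output would differ from it in that respect even though the data is the same.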
I am not an expert at all, but how about saving it as a txt file and then translating the txt file in Trados? Then convert it back to csv at the end.
[Edited at 2020-12-01 00:31 GMT]