Tkh's Translation Helper program (downloadable)

RepeatPan · February 11, 2015, 10:25pm

How about neither guessing nor choosing nor actually doing any of the work yourself: use Encoding.GetEncoding(string) and you have access to quite a bunch of encodings.
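
For reference, the lookup by name lives on System.Text.Encoding. A minimal sketch of how it could be wrapped, where the helper name FindEncoding and the list of names tried are only illustrative assumptions, and which names resolve depends on the runtime and platform:

```csharp
using System;
using System.Text;

static class EncodingLookupDemo
{
    // Hypothetical helper: resolves an encoding by name, or returns null
    // when the runtime does not know that name.
    public static Encoding FindEncoding(string name)
    {
        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            return null; // unknown or unsupported encoding name
        }
    }

    static void Main()
    {
        foreach (string name in new[] { "utf-8", "iso-8859-1", "no-such-encoding" })
        {
            Encoding encoding = FindEncoding(name);
            Console.WriteLine(encoding != null
                ? name + " -> " + encoding.WebName
                : name + " -> not available");
        }
    }
}
```

Returning null instead of letting the exception escape makes it easy to reject a user-typed encoding name before any file is touched.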

tkh · February 12, 2015, 7:27am

You mean that the user should input an appropriate encoding? Via some kind of settings file, input textbox (not likely) or so? So far, I have managed without a settings file, but it’s not a big deal to add one if really needed.

But again, as you said yourself, it still has to be in a format that Stonehearth expects. So I shouldn’t allow the user to use any encoding either? On the other hand, perhaps it’s not up to me to deny…

RepeatPan · February 12, 2015, 8:18am

Personally: yes, it is up to you to set the encoding. If a user inputs something that Stonehearth doesn’t support, the file will be useless.

Collaboration between users is also hurt badly if one tries to load a file in a different encoding than it was saved in.

tkh · February 12, 2015, 8:45am

Here’s a test release where you can set input and output encoding in a Settings-tab.

@law, please try this:

[Link removed, always use the latest version in the top post.]

Important info for testers:
The default is… well, the system default, and so far you can only change it to UTF-8. This is not the way to do it in the long run, but it’s a way to start with the UTF-8 test. The English file seems to be read correctly regardless of input encoding.

When you open a previous translation file, make sure to use the same setting as you had when you saved it. Up until now, that would be “default”. Note that you can change the input encoding between opening the original and opening the translation. Before exporting, try changing to UTF-8 and see how that looks in the file. Then reopen it with UTF-8 encoding and verify that it is read correctly.

At the moment, default is set to standard value for both in and out. After this test, perhaps UTF-8 will be the default value.

When I tested this with my Swedish file, I had to open it as “default”, and could export it correctly to UTF-8. When re-opening the new translation file, I had to use UTF-8 to get it to display correctly. This would be an expected (correct) behavior.

Thanks in advance for testing this, it’s hard for me to test for different languages with different character sets

tkh · February 12, 2015, 8:46am

I fully agree, but since I don’t know what encoding formats are supported by the game I can’t really make that decision (yet)

Wiese2007 · February 12, 2015, 9:02am

So I have tested it, and it works (or rather, I can’t see any change in the code xD). I started it and set input to default and output to UTF-8, loaded the file and checked it, used Save As, then closed, reopened and loaded the newly saved file. No display errors, but the input was still set to default.

Edit: the in-game test failed ^^ UTF-8 gives the wrong characters ^^

law · February 12, 2015, 9:59am

Wow, thank you very much! It works really well, my play test was a success!

tkh · February 12, 2015, 10:50am

Thanks for testing!

I’m not at home, so I can’t try myself, but it looks like UTF-8 isn’t working that well…

I guess I have to ask @sdee what encoding formats are supported. If there’s a simple answer to that question…

Wiese2007 · February 12, 2015, 10:52am

Yup, I think that would be the easiest way. Or we spend lots of time trying every codec ^^ Ehmmm, better not.

sdee · February 13, 2015, 11:32pm

There is a simple answer but you’re not going to like it: the Stonehearth UI is SUPPOSED to be encoded in UTF-8. If it’s not working for you, that’s a bug. But where is the bug, and how do we fix it?

Everything we implemented to support i18n we did by following the instructions on our i18n js software here: http://i18next.com/. However, we haven’t tested it nearly as rigorously as you are, so we very well may be doing something wrong.

Any guidance/help you can give us to help us make this work for you is very welcome :).

Wiese2007 · February 14, 2015, 12:24am

Just to be sure, I have tested it again: same issue ^^

tkh · February 14, 2015, 10:06am

Good to know, then we can skip our planned test-of-all-encodings-in-the-world

Oh, I can’t guarantee that I’m doing it right either, but I’ll double check. Basically I just selected Encoding.UTF8 or something like that as encoder, don’t know if there’s any other way to do it.

But I’ll look into it, sometimes even a blind chicken can find a seed

RepeatPan · February 14, 2015, 11:52am

For all that I can tell, the JSON is loaded properly and the HTML is written (kinda?) properly too, as told by the Chrome inspector:

<div class="title">Settings (codepage: äöü) (utf8: Ã¤Ã¶Ã¼) </div>

So the characters are not lost, which is good. But it’s not rendered as UTF-8. I’ve tried switching the meta tag to <meta http-equiv="Content-Type" content="text/html; charset=utf-8">, however, that didn’t seem to improve anything.

My conclusion therefore is ¯\_(ツ)_/¯. The UTF8 is there, but the page is rendered as something else for some reason.
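
For what it’s worth, the Ã¤Ã¶Ã¼ pattern is what you get when UTF-8 bytes are decoded with a single-byte Western code page. A minimal C# sketch of that effect follows. ISO-8859-1 is only an assumed stand-in for whatever decoding is actually being applied, the class and method names are made up for illustration, and nothing here shows where in the game the mis-decoding happens:

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Encodes text as UTF-8, then decodes those bytes as ISO-8859-1.
    // ISO-8859-1 is an assumption standing in for the wrong code page.
    public static string DecodeUtf8BytesAsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }

    static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // "\u00E4\u00F6\u00FC" is äöü, written with escapes so the source
        // file's own encoding cannot interfere with the demonstration.
        Console.WriteLine(DecodeUtf8BytesAsLatin1("\u00E4\u00F6\u00FC")); // Ã¤Ã¶Ã¼
    }
}
```

Each umlaut becomes two bytes in UTF-8 (0xC3 followed by a second byte), and a single-byte decoder turns each of those bytes into its own character.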

Wiese2007 · February 14, 2015, 12:01pm

Just for info: at the moment I have also successfully changed recipes, entities, job information and sequences… If you want to know where, I can send you a list (the German translation is complete) ^^

Perhaps you can change your program to ask for the location of stonehearth.smod and then load the designated lines from the files I will name in that list? It’s a lot of work, but with this you can change all the important information.

tkh · February 16, 2015, 7:42am

Please provide a list, thanks for helping!

I know what you mean, but at the moment I have no idea how to work within the zipped file… But I don’t think that this is how we’re gonna do it later on. I have a feeling that we can dump translation files into a folder and then ingame change language. Or something even easier (like downloading language packs uploaded to Radiant servers). So I don’t want to spend too much time on nice-to-have features that will disappear for sure

tkh · February 16, 2015, 8:05am

Is there a way to verify what encoding a file has? I mean, to double check so my files are exported with correct encoding. As I understand it, the encoding format should be told or known before reading a file… and if one doesn’t know, it’s a matter of trial and error with different formats and see if the result looks fine…

I checked my code again, and I can’t see that I’m doing anything wrong.

For the input selection:

case 0:
    inputEncoding = System.Text.Encoding.Default;
    break;
case 1:
    inputEncoding = Encoding.UTF8;
    break;

and for the output selection:

case 0:
    outputEncoding = System.Text.Encoding.Default;
    break;
case 1:
    outputEncoding = Encoding.UTF8;
    break;

This is how I read:

using (StreamReader sr = new StreamReader(FileName, inputEncoding))

And how I write:

RepeatPan · February 16, 2015, 8:46am

Not really. You can look at special characters (I’ve mentioned a few before: ä ö ü Ä Ö Ü), which would look different. In a default encoding (likely latin1 or some similar code page), if you open that file in notepad, each of them should show up as a single character.

If the file was encoded with UTF8, you should see multiple characters that look odd and out of place (above example: Ã¤ Ã¶ Ã¼ Ã Ã Ã - Discourse/my browser can’t even display them properly, but there are always two characters). If you open the file in notepad (i.e. an editor that has no understanding of encoding whatsoever), you should see that too.

Generally, I would just load the file as UTF8 and not care about what it is. There’s a chance that an exception is thrown if the file has an invalid encoding, but you’ll likely never encounter that with the “default” System.Text.Encoding.Default.

=> Just use Encoding.UTF8 for reading and offer a choice for exporting - although I’d export as UTF8 only too and maaybe bug @sdee about fixing it.
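
One caveat on the exception: the stock Encoding.UTF8 instance does not throw on invalid bytes, it silently substitutes U+FFFD replacement characters. To actually get an exception you need a UTF8Encoding created with throwOnInvalidBytes set to true. A minimal sketch of a strict check with that setting, where the helper name TryDecodeUtf8 and the sample bytes are illustrative assumptions:

```csharp
using System;
using System.Text;

static class StrictUtf8Demo
{
    // Hypothetical helper: returns true and the decoded text when the bytes
    // are valid UTF-8, false when they contain an invalid sequence.
    public static bool TryDecodeUtf8(byte[] bytes, out string text)
    {
        // throwOnInvalidBytes: true makes invalid sequences raise
        // DecoderFallbackException instead of being replaced with U+FFFD.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            text = strictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = null;
            return false;
        }
    }

    static void Main()
    {
        string decoded;
        byte[] validUtf8 = { 0xC3, 0xA4 }; // ä encoded as UTF-8
        byte[] singleByte = { 0xE4 };      // ä in a Latin-1 style code page
        Console.WriteLine(TryDecodeUtf8(validUtf8, out decoded));  // True
        Console.WriteLine(TryDecodeUtf8(singleByte, out decoded)); // False
    }
}
```

A file that contains only ASCII passes this check whatever it was saved as, so a successful decode shows that the bytes are valid UTF-8, not that the file was written as UTF-8. For real files, the bytes could come from File.ReadAllBytes.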

tkh · February 16, 2015, 12:19pm

Ok, cool.

I used my program to export two files, one after the other: one with “default” and one with UTF-8. In the standard Notepad on this computer’s 64-bit Windows 7 Enterprise SP1, both files look just fine.

Tested the same thing in Notepad++, and it automatically identifies the different encodings. But since we can manually switch which encoding to use, I get the different looks. And all seems fine from my side. Btw, “default” (Identified by my program as: “Western European (Windows)”) is being read as ANSI, and the UTF-8 (Identified by my program as “Unicode (UTF-8)”) is indeed read as UTF-8 by Notepad++.

Default read (manually switched) as UTF-8:

å - xE5
ä - xE4
ö - xF6

But if I try to copy the new text here, it looks Chinese

UTF-8 read (manually switched) as ANSI:

å - Ã¥
ä - Ã¤
ö - Ã¶

(I could copy those characters, yay)
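
Both views can be reproduced by looking at the raw bytes. A small sketch that prints them for the three Swedish letters, with ISO-8859-1 assumed as a stand-in for the “Western European (Windows)” default (the two agree for these letters) and made-up class and method names:

```csharp
using System;
using System.Text;

static class ByteViewDemo
{
    // Returns the bytes of text in the given encoding as hex, e.g. "C3-A5".
    public static string HexBytes(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    static void Main()
    {
        // Assumed stand-in for the system default code page.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        foreach (string letter in new[] { "\u00E5", "\u00E4", "\u00F6" }) // å ä ö
        {
            Console.WriteLine("U+{0:X4}  single-byte: {1}  UTF-8: {2}",
                (int)letter[0],
                HexBytes(letter, latin1),
                HexBytes(letter, Encoding.UTF8));
        }
    }
}
```

The single-byte file stores å as the lone byte E5, which is not a valid UTF-8 sequence on its own. That would explain why Notepad++ shows a raw xE5 when forced to read it as UTF-8. The UTF-8 file stores C3 A5, which an ANSI reader displays as the two characters Ã¥.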

For now, during testing, I’ll keep both Default and UTF-8 as choices. Eventually, when all this is fixed, I’ll either use UTF-8 as standard setting for both in and out… Or I throw Default away completely - as this should not be used or needed.

Wiese2007 · February 16, 2015, 7:50pm

Boooooaaaaahhhhhh, what have I done! I have been writing the list for 2 hours and it is at most 25% finished ^^

Wiese2007 · February 16, 2015, 9:18pm

Sooo, my fingers are smoking. I will finish it tomorrow ^^
