Tkh's Translation Helper program (downloadable)

How about neither guessing nor choosing nor actually doing any of the work yourself? Use Encoding.GetEncoding(string) and you get access to quite a bunch of encodings.
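A minimal sketch of that lookup, for illustration only. `TryResolve` and the class name are hypothetical helpers, not part of the program discussed here, and the set of names that resolve depends on the platform (on newer .NET runtimes, legacy Windows code pages may need an extra provider registered).

```csharp
using System;
using System.Text;

public static class EncodingLookup
{
    // Hypothetical helper: resolves a user-supplied name to an Encoding,
    // or returns null when the name is not recognized.
    public static Encoding TryResolve(string name)
    {
        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Thrown for names that are invalid or not supported on this platform.
            return null;
        }
    }

    public static void Main()
    {
        foreach (string name in new[] { "utf-8", "iso-8859-1", "no-such-encoding" })
        {
            Encoding enc = TryResolve(name);
            Console.WriteLine(enc == null
                ? name + ": not supported"
                : name + ": " + enc.WebName);
        }
    }
}
```

A settings file or text box could feed its value through something like this and fall back to a known-good encoding when the result is null.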

You mean that the user should input an appropriate encoding? Via some kind of settings file, input textbox (not likely) or so? So far, I have managed without a settings file, but it’s not a big deal to add one if really needed.

But again, as you said yourself, it still has to be in a format that Stonehearth expects. So I shouldn’t allow the user to use any encoding either? On the other hand, perhaps it’s not up to me to deny…

Personally: yes, yes it is up to you to set the encoding. If a user inputs something that Stonehearth doesn’t support, the file will be useless.

Collaboration between users is also hurt badly if one tries to load a file in a different encoding than the one it was saved in.

Here’s a test release where you can set input and output encoding in a Settings-tab.

@law, please try this:

[Link removed, always use the latest version in the top post.]

Important info for testers:
Default is… well, the system default, and so far you can only change it to UTF-8. This is not the way to do it in the long run, but it’s a way to get started with the UTF-8 test. The English file seems to be read correctly regardless of input encoding. When you open a previous translation file, make sure to use the same setting as when you saved it; up until now, that would be “default”. Note that you can change the input encoding between opening the original and opening the translation. Before exporting, try changing to UTF-8 and see how that looks in the file. Then reopen it with UTF-8 encoding and verify that it is read correctly.

At the moment, the default is set to the standard value for both in and out. After this test, perhaps UTF-8 will be the default value.

When I tested this with my Swedish file, I had to open it as “default”, and could export it correctly to UTF-8. When re-opening the new translation file, I had to use UTF-8 to get it to display correctly. This is the expected (correct) behavior.
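The open-as-default, export-as-UTF-8, reopen-as-UTF-8 check can be sketched roughly like this. It is not the program's actual code: `Reencode` and the class name are made up for illustration, and ISO-8859-1 is only an assumed stand-in for whatever the system default code page is on a given machine.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReencodeDemo
{
    // Reads a text file with one encoding and rewrites it with another.
    public static void Reencode(string inputPath, Encoding inputEncoding,
                                string outputPath, Encoding outputEncoding)
    {
        string text;
        using (StreamReader sr = new StreamReader(inputPath, inputEncoding))
        {
            text = sr.ReadToEnd();
        }
        using (StreamWriter sw = new StreamWriter(outputPath, false, outputEncoding))
        {
            sw.Write(text);
        }
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 stands in for the "default" legacy code page.
        Encoding legacy = Encoding.GetEncoding("iso-8859-1");
        string oldFile = Path.GetTempFileName();
        string newFile = Path.GetTempFileName();

        File.WriteAllText(oldFile, "åäö ÅÄÖ", legacy);
        Reencode(oldFile, legacy, newFile, Encoding.UTF8);

        // Re-opening the exported file as UTF-8 should give the original text back.
        string reread = File.ReadAllText(newFile, Encoding.UTF8);
        Console.WriteLine(reread == "åäö ÅÄÖ");

        File.Delete(oldFile);
        File.Delete(newFile);
    }
}
```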

Thanks in advance for testing this, it’s hard for me to test for different languages with different character sets :blush:

I fully agree, but since I don’t know what encoding formats are supported by the game I can’t really make that decision (yet) :blush:

So I have tested it, and it works (or better to say, I can't see any change in the code xD). First I started it and set in to default and out to UTF-8, loaded the input and checked it, then save as -> close, reopen and load the newly saved file (no errors with display etc., but in is still set to default).

Edit: the in-game test failed ^^ UTF-8 shows the wrong characters ^^

Wow, thank you very much! It was really great, my play test was a success!

Thanks for testing!

I’m not at home, so I can’t try myself, but it looks like UTF-8 isn’t working that well…

I guess I have to ask @sdee what encoding formats are supported. If there’s a simple answer to that question… :blush:

Yup, I think that would be the easiest way. Otherwise we'd need lots of time to try every codec ^^ Uhmmmmmmm, better not :sweat_smile:

There is a simple answer but you’re not going to like it: the Stonehearth UI is SUPPOSED to be encoded in UTF-8. If it’s not working for you, that’s a bug. But where is the bug, and how do we fix it?

Everything we implemented to support i18n we did by following the instructions on our i18n js software here: http://i18next.com/. However, we haven’t tested it nearly as rigorously as you are, so we very well may be doing something wrong.

Any guidance/help you can give us to help us make this work for you is very welcome :).

As a precaution I tested it again, and it's the same issue ^^

Good to know, then we can skip our planned test-of-all-encodings-in-the-world :wink:

Oh, I can’t guarantee that I’m doing it right either, but I’ll double check. Basically I just selected Encoding.UTF8 or something like that as encoder, don’t know if there’s any other way to do it.

But I’ll look into it, sometimes even a blind chicken can find a seed :blush:
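One thing that might be worth ruling out: there is another way to get a UTF-8 encoder in .NET. `Encoding.UTF8` carries a byte order mark (BOM) preamble that `StreamWriter` and `File.WriteAllText` emit at the start of a new file, while `new UTF8Encoding(false)` does not. Whether the BOM matters to Stonehearth is not established anywhere in this thread, so treat this purely as a sketch of the difference.

```csharp
using System;
using System.Text;

public static class BomDemo
{
    public static void Main()
    {
        // Encoding.UTF8 advertises a three-byte preamble (the UTF-8 BOM).
        byte[] withBom = Encoding.UTF8.GetPreamble();

        // encoderShouldEmitUTF8Identifier: false gives an empty preamble,
        // so writers using this encoding do not emit a BOM.
        byte[] withoutBom = new UTF8Encoding(false).GetPreamble();

        Console.WriteLine(BitConverter.ToString(withBom)); // EF-BB-BF
        Console.WriteLine(withoutBom.Length);              // 0
    }
}
```

Passing `new UTF8Encoding(false)` wherever `Encoding.UTF8` is used for output would be a cheap experiment if the game turns out to dislike the extra three bytes.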

As far as I can tell, the JSON is loaded properly and the HTML is written (kinda?) properly too, as told by the Chrome inspector:

<div class="title">Settings (codepage: äöü) (utf8: äöü) </div>

So the characters are not lost, which is good. But it’s not rendered as UTF-8. I’ve tried switching the meta tag to <meta http-equiv="Content-Type" content="text/html; charset=utf-8">, however, that didn’t seem to improve anything.

My conclusion therefore is ¯\_(ツ)_/¯. The UTF8 is there, but the page is rendered as something else for some reason.

Just for info: by now I have also successfully changed recipes, entities, job information and sequences … if you want to know where, I can send you a list (the German translation is complete) ^^

Perhaps you could change your program to ask for the location of stonehearth.smod and then load the designated lines from the files named above? It's a lot of work, but that way you could change all the important information.

Please provide a list, thanks for helping! :blush:

I know what you mean, but at the moment I have no idea how to work within the zipped file… But I don’t think that this is how we’re gonna do it later on. I have a feeling that we can dump translation files into a folder and then ingame change language. Or something even easier (like downloading language packs uploaded to Radiant servers). So I don’t want to spend too much time on nice-to-have features that will disappear for sure :wink:

Is there a way to verify what encoding a file has? I mean, to double check so my files are exported with correct encoding. As I understand it, the encoding format should be told or known before reading a file… and if one doesn’t know, it’s a matter of trial and error with different formats and see if the result looks fine…

I checked my code again, and I can’t see that I’m doing anything wrong.

For the input selection:

case 0:
    inputEncoding = System.Text.Encoding.Default;
    break;
case 1:
    inputEncoding = Encoding.UTF8;
    break;

and for the output selection:

case 0:
    outputEncoding = System.Text.Encoding.Default;
    break;
case 1:
    outputEncoding = Encoding.UTF8;
    break;

This is how I read:

using (StreamReader sr = new StreamReader(FileName, inputEncoding))

And how I write:

Not really. You can look at special characters (I’ve mentioned a few before: ä ö ü Ä Ö Ü) which would look different. In a default encoding (likely latin1/some similar code page), if you open that file in notepad, you should see one character per character.

If the file was encoded with UTF8, you should see multiple characters that look odd and out of place (above example: ä ö ü Ä Ö Ü - Discourse/my browser can’t even display them properly, but there are always two characters). If you open the file in notepad (i.e. an editor that has no understanding of encoding whatsoever), you should see that too.

Generally, I would just load the file as UTF8 and not care about what it is. There’s a chance that an exception is thrown if the file has an invalid encoding, but you’ll likely never encounter that with the “default” System.Text.Encoding.Default.
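As far as I know, `Encoding.UTF8` on its own does not throw on bad input; it substitutes U+FFFD for bytes it cannot decode. To actually get an exception, and so a rough "is this valid UTF-8?" check, a strict decoder is needed. The sketch below assumes that approach; `IsValidUtf8` is a hypothetical helper name, and ISO-8859-1 again stands in for a typical legacy code page.

```csharp
using System;
using System.Text;

public static class Utf8Check
{
    // Returns true when the bytes form valid UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // throwOnInvalidBytes: true makes the decoder throw instead of
        // silently substituting U+FFFD.
        var strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        byte[] latin1 = Encoding.GetEncoding("iso-8859-1").GetBytes("äöü");
        byte[] utf8 = Encoding.UTF8.GetBytes("äöü");

        Console.WriteLine(IsValidUtf8(latin1)); // False
        Console.WriteLine(IsValidUtf8(utf8));   // True
    }
}
```

For a real file, the bytes would come from `File.ReadAllBytes(path)`. Note the limits: a pure ASCII file is valid in both encodings, so the check only says something once special characters are present.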

=> Just use Encoding.UTF8 for reading and offer a choice for exporting - although I’d export as UTF8 only too and maaybe bug @sdee about fixing it.

Ok, cool.

I used my program to export two files after each other. One with “default” and one with UTF-8. In the standard Notepad in this computer’s 64-bit Enterprise SP1 Windows 7, both files look just fine.

Tested the same thing in Notepad++, and it automatically identifies the different encodings. But since we can manually switch which encoding to use, I get the different looks. And all seems fine from my side. Btw, “default” (Identified by my program as: “Western European (Windows)”) is being read as ANSI, and the UTF-8 (Identified by my program as “Unicode (UTF-8)”) is indeed read as UTF-8 by Notepad++.

Default read (manually switched) as UTF-8:

å - xE5
ä - xE4
ö - xF6

But if I try to copy the new text here, it looks Chinese :blush:

UTF-8 read (manually switched) as ANSI:

å - Ã¥
ä - Ã¤
ö - Ã¶

(I could copy those characters, yay)
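Both lists can be reproduced in code. This is only a sketch under one assumption: ISO-8859-1 is used in place of “Western European (Windows)”, which to my knowledge maps these particular bytes to the same characters. The helper names are made up for illustration.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes text as UTF-8, then misreads those bytes with another encoding.
    public static string Utf8ReadAs(string text, Encoding wrong)
    {
        return wrong.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Encodes text with the given encoding and shows the raw bytes as hex.
    public static string BytesAsHex(string text, Encoding enc)
    {
        return BitConverter.ToString(enc.GetBytes(text));
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 stands in for the Windows "ANSI" code page.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        foreach (string s in new[] { "å", "ä", "ö" })
        {
            // Columns: character, its single legacy byte, its UTF-8 bytes read as legacy.
            Console.WriteLine(s + " - " + BytesAsHex(s, latin1) + " - " + Utf8ReadAs(s, latin1));
        }
    }
}
```

The single legacy byte (E5, E4, F6) is not a valid UTF-8 sequence on its own, which fits Notepad++ showing it as a raw hex value, while each UTF-8 character becomes two legacy characters starting with Ã.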

For now, during testing, I’ll keep both Default and UTF-8 as choices. Eventually, when all this is fixed, I’ll either use UTF-8 as standard setting for both in and out… Or I throw Default away completely - as this should not be used or needed.

Boooooaaaaahhhhhh, what have I done? I've been writing the list for 2 hours now and it's at most 25% finished ^^

Sooo, my fingers are smoking. I'll finish it tomorrow ^^