Hong Kong character set?

I’ve had this query (edited for privacy and clarity) passed back to me by the outfit that bought the rights to my programs (I’m under contract to provide tech support), but I have no idea of the answer. Ideas? I’m presuming they’re using some form of English; in any other language the output would fall to pieces (e.g. appending s for plural):

We’ve come across an issue concerning a user in Hong Kong. He’s just renewed his licence and has had to reinstall on a new machine running Windows 10. When he uses the program and saves a project file (format: ASCII text script), he then can’t open it. I have tested this and it appears to be an issue with character encoding in the file. It would seem that the code is not accepting his Hong Kong locale, so it looks like we are going to have to make some changes to the existing code.

The project file (a format dating back to the 1980s) is effectively a key/value script like this:

abc=123
def=This is some string data
ghi=N

Thanks

Tony

Are you able to get a copy of the file that has been saved? Should give an indication of whether the issue is with saving or with loading the file - could be both of course. If the saved file is messed up, can he load a known good file?
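
Once a copy of the saved file turns up, a rough check along these lines might help tell a save problem from a load problem. It is only a sketch: the file name project.txt is a placeholder, and the namespaced unit names assume Delphi XE2 or later. It reports whether the file starts with a UTF-8 BOM, how many bytes fall outside plain ASCII, and how many question marks it contains (a lot of unexpected ones can hint that characters were lost on save).

```pascal
program InspectProjectFile;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

var
  Bytes: TBytes;
  I, HighBytes, QuestionMarks: Integer;
begin
  // 'project.txt' is a placeholder path, not a name from the original program.
  Bytes := TFile.ReadAllBytes('project.txt');

  // A UTF-8 byte order mark is the three bytes EF BB BF.
  if (Length(Bytes) >= 3) and (Bytes[0] = $EF) and
     (Bytes[1] = $BB) and (Bytes[2] = $BF) then
    Writeln('UTF-8 BOM present')
  else
    Writeln('No UTF-8 BOM');

  HighBytes := 0;
  QuestionMarks := 0;
  for I := 0 to High(Bytes) do
  begin
    if Bytes[I] > $7F then
      Inc(HighBytes);        // byte outside the 7-bit ASCII range
    if Bytes[I] = Ord('?') then
      Inc(QuestionMarks);    // may mark characters lost during conversion
  end;

  Writeln('Bytes above $7F: ', HighBytes);
  Writeln('Question marks:  ', QuestionMarks);
end.
```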

What does the code that reads/writes the file look like?

Tony, you don’t say how you read/write your strings to the project file. ReadLn/WriteLn, TIniFile, TStringList.LoadFromFile/SaveToFile, etc?

I had an issue reading/writing European language characters to text files at one stage. The solution for me using TStringList was to force the encoding to UTF-8 and always write the BOM in text files.
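
As a minimal sketch of what that looks like, assuming a Unicode version of Delphi (namespaced units as in XE2 or later) and a placeholder file name project.txt: passing TEncoding.UTF8 explicitly to both SaveToFile and LoadFromFile avoids relying on the machine's default code page. As far as I know, saving with TEncoding.UTF8 also writes the BOM by default, but that is worth checking on the Delphi version in use.

```pascal
program SaveUtf8;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Same key/value layout as the sample project file.
    Lines.Add('abc=123');
    Lines.Add('def=This is some string data');
    Lines.Add('ghi=N');

    // Explicit encoding on save, so the result does not depend on the locale.
    // 'project.txt' is a placeholder path.
    Lines.SaveToFile('project.txt', TEncoding.UTF8);

    // Explicit encoding on load as well.
    Lines.Clear;
    Lines.LoadFromFile('project.txt', TEncoding.UTF8);

    // Look up a value by key to confirm the file was read back.
    Writeln(Lines.Values['def']);
  finally
    Lines.Free;
  end;
end.
```

Files already saved by older builds will not be UTF-8, so the loader may still need a fallback for those.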

Google something like “delphi stringlist utf8 encoding bom” to read more.

Cheers Richard

Thanks Mark/Richard. I’ve asked for a sample file. Files are assembled using a TStringList so the encoding tip may well be the clue.

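I think you are right: the encoding is almost certainly the problem. In Unicode versions of Delphi (2009 and later), TStringList holds UTF-16 strings, and saving without an explicit encoding converts them to the system ANSI code page rather than plain ASCII. Any character that code page cannot represent is replaced with ? in the translation, and bytes above 127 can mean different characters on a machine with a different code page.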
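
A quick way to see whether the default conversion loses anything on a given machine is to round-trip a string through the encoding and compare. This is only a sketch: the sample value is made up, and the first result depends entirely on the code page of the machine it runs on (it may well survive on a Traditional Chinese system and fail on a Western one).

```pascal
program EncodingLoss;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  Original, ViaAnsi, ViaUtf8: string;
begin
  // Hypothetical value: two Chinese characters (U+9999, U+6E2F) written as
  // escapes so the source file itself stays plain ASCII.
  Original := 'def=' + #$9999#$6E2F;

  // Round trip through the system default (ANSI) code page.
  ViaAnsi := TEncoding.Default.GetString(TEncoding.Default.GetBytes(Original));

  // Round trip through UTF-8, which can represent every Unicode character.
  ViaUtf8 := TEncoding.UTF8.GetString(TEncoding.UTF8.GetBytes(Original));

  Writeln('ANSI round trip intact:  ', ViaAnsi = Original);
  Writeln('UTF-8 round trip intact: ', ViaUtf8 = Original);
end.
```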

You may be able to overcome it with the simple change to UTF-8 in the read/write routine.

UTF-16 would double your file size for mostly ASCII content, but that is probably not a problem.

Another solution may be to use TIniFile and let Delphi handle the issue (double the file size, I expect).

If you want to go down the track of managing the files yourself, I do have some useful routines to bypass the conversion, but:

  • They do not work within the TStringList read/write
  • I often confuse myself and spend some time debugging to get the stuff right

The version of Delphi used here is an important factor. There are bugs in TStringList encoding handling: earlier versions simply ignored the encoding, and I recall there were issues in later versions where it would lose the encoding.

I can’t log in to Jira to look up the bugs (the infernal captcha wars continue); otherwise I would provide issue links.