Problem reading fields from dbf created with SetAnsi == FALSE
https://www.xsharp.eu/forum/topic?p=29347#p29347
Create a dbf with SetAnsi(FALSE) in VO, with this code:
FUNCTION Start() AS INT
LOCAL cDbf AS STRING
SetAnsi(FALSE)
cDbf := "C:\dbf\testansi"
DbCreate(cDbf,{{"TXT","C",20,0}})
SetAnsi(FALSE)
DbUseArea(,,cDbf)
DbAppend();FieldPut(1,"öÖäÄ")
? FieldGet(1)
? AllTrim(FieldGet(1)) == "öÖäÄ" // true
DbCloseArea()
SetAnsi(TRUE)
DbUseArea(TRUE,,cDbf)
? FieldGet(1)
? AllTrim(FieldGet(1)) == "öÖäÄ" // false
DbCloseArea()
WAIT
RETURN 0
Now try opening it in X# with similar code, again using SetAnsi(FALSE). The umlauts are not read correctly:
FUNCTION Start() AS VOID
LOCAL cDbf AS STRING
cDbf := "C:\dbf\testansi"
SetAnsi(FALSE)
DbUseArea(TRUE,,cDbf)
? FieldGet(1)
? AllTrim(FieldGet(1)) == "öÖäÄ" // FALSE, wrong
DbCloseArea()
SetAnsi(TRUE)
DbUseArea(TRUE,,cDbf)
? FieldGet(1)
? AllTrim(FieldGet(1)) == "öÖäÄ" // FALSE, OK
DbCloseArea()
Confirmed fixed
Strangely, while the problem is fixed for the code supplied (which uses the default DBFNTX), it persists when using the DBFCDX driver (an index file does not even need to exist!). To reproduce, just add a
RddSetDefault("DBFCDX")
in the beginning of the X# code.
The DBF Header for DBFNTX files is encoded with 0x03 for OEM files and 0x07 for Ansi files. The bit with 0x04 indicates that the file is encoded in Ansi. DBFCDX files do not use this bit.
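To make the flag test concrete, here is a minimal Python sketch (not the X# RDD code) of how the first header byte can be checked. The function and constant names are hypothetical; only the values 0x03, 0x07 and 0x04 are taken from the description above.

```python
ANSI_FLAG = 0x04  # bit that DBFNTX sets in the first header byte for Ansi files


def is_ansi_header(first_byte: int) -> bool:
    """Return True when the Ansi bit (0x04) is set in the first DBF header byte."""
    return bool(first_byte & ANSI_FLAG)


print(is_ansi_header(0x03))  # header byte of an OEM file
print(is_ansi_header(0x07))  # header byte of an Ansi file
```

A file written by DBFCDX would always report False here, because that driver never sets the bit.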
In the RDD we detect that the file is opened with DBFNTX and that the Ansi bit is not set. In that case, the strings read from the DBF are translated from OEM to Ansi before they are converted to Unicode like this (in the DbfColumn:_GetString method)
PROTECTED METHOD _GetString(buffer AS BYTE[]) AS STRING
// The default implementation returns the part of the buffer as a string
local result as string
if SELF:RDD is DBFNTX var oDbfNtx .and. ! oDbfNtx:Header:IsAnsi
var tmp := Byte[]{SELF:Length}
Array.Copy(buffer, SELF:Offset, tmp, 0, SELF:Length)
Ansi2OemA(tmp)
result := SELF:RDD:_Encoding:GetString(tmp, 0, SELF:Length)
else
result := SELF:RDD:_Encoding:GetString(buffer, SELF:Offset, SELF:Length)
endif
return result
The whole OEM/Ansi thing for DBFs is only valid for DBFNTX. In VO the DBFCDX driver never marks the DBF with an Ansi bit. If you create the file with "DBFCDX", then regardless of the SetAnsi setting the first byte is always 0x03. I can ignore the IsAnsi setting in the DBF header and also convert the buffer in the code above when SetAnsi is FALSE and DBFCDX is used. That would probably work.
I did an extra check with VO. When the file is created with SetAnsi(TRUE), byte 30 is 0x03, and with SetAnsi(FALSE) it is 0x01. This is the CodePage byte. We use that byte to determine the CodePage to use. 0x01 on my machine translates to CodePage 437 and 0x03 translates to CodePage 1252. I checked, and we do not need the Ansi2Oem translation at all with these codepages when we simply ignore the SetAnsi() setting in the RDD.
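The codepage-driven decoding described here can be sketched in Python as follows. This is an illustration only, not the RDD implementation. It assumes the codepage byte is the 30th byte of the header (zero-based offset 29, matching the "byte 30" wording used in this thread) and maps only the two values mentioned above; the names `CODEPAGE_BY_ID` and `decode_field` are hypothetical.

```python
# Only the two IDs mentioned in the thread are mapped; other values are left out on purpose.
CODEPAGE_BY_ID = {0x01: "cp437", 0x03: "cp1252"}


def decode_field(header: bytes, raw: bytes) -> str:
    """Decode raw field bytes with the codepage named by header byte 30 (offset 29)."""
    codec = CODEPAGE_BY_ID[header[29]]
    return raw.decode(codec)


# 32-byte header stubs in which only the codepage byte is filled in (assumption for the demo).
oem_header = bytes(29) + b"\x01" + bytes(2)
ansi_header = bytes(29) + b"\x03" + bytes(2)

print(decode_field(oem_header, bytes.fromhex("9499848E")))
print(decode_field(ansi_header, bytes.fromhex("F6D6E4C4")))
```

Both calls yield the same Unicode string, which is why no separate Ansi2Oem step is needed once the right codepage is chosen.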
Chris, In your example code in VO with DBFNTX:
- When you create the file with SetAnsi(FALSE) then byte 1 = 0x03 and byte 30 = 0x01
- When you create the file with SetAnsi(TRUE) then byte 1 = 0x07 and byte 30 = 0x03
- If you then open the file and start adding records, and you keep SetAnsi() the same as when creating the file, then "öÖäÄ" is always written as 0xF6D6E4C4, regardless of the setting of SetAnsi()
- The same happens with DBFCDX, with the exception that byte 1 is always 0x03 with DBFCDX, so the Ansi flag (the value 0x04) for byte 1 is not set.
However, if you create the file with one setting and then open it later with a different setting, the data is written differently, but only when the table is created with SetAnsi(FALSE) and written with SetAnsi(TRUE).
The following table lists the bytes written for the string "öÖäÄ" for each combination of created / written setting, with a Western European Windows codepage (Oem = SetAnsi(FALSE), Ansi = SetAnsi(TRUE)):
| RDD | Created Ansi / written Ansi | Created Ansi / written Oem | Created Oem / written Oem | Created Oem / written Ansi |
|---|---|---|---|---|
| DBFNTX | 0xF6D6E4C4 | 0xF6D6E4C4 | 0xF6D6E4C4 | 0x9499848E |
| DBFCDX | 0xF6D6E4C4 | 0xF6D6E4C4 | 0xF6D6E4C4 | 0x9499848E |
So it seems that in VO DBFNTX sets the Ansi bit in the header, but ignores that for the conversion. It does use that bit for the locking scheme though.
I personally do not understand why anyone would create the file with SetAnsi(FALSE) and then let the app use the file with SetAnsi(TRUE).
So apparently VO converts Ansi to OEM when writing, and OEM to Ansi when reading, only when the Ansi bit in the header is not set AND SetAnsi(TRUE) is active. The fact that DBFCDX never sets the Ansi bit is somehow ignored.
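The observed write behaviour can be modelled with a short Python sketch. It is a model of the observations in the table, not VO's actual code. It assumes cp1252 as the Ansi codepage and cp437 as the OEM codepage, and it keys on how the table was created rather than on the header bit, since DBFCDX never sets that bit. The function name is hypothetical.

```python
def vo_stored_bytes(text: str, created_ansi: bool, writing_ansi: bool) -> bytes:
    """Model the bytes VO appears to store for a character field."""
    if not created_ansi and writing_ansi:
        # Created with SetAnsi(FALSE), written with SetAnsi(TRUE): Ansi -> OEM conversion.
        return text.encode("cp437")
    # Every other combination: the Ansi bytes are stored unchanged.
    return text.encode("cp1252")


for created_ansi in (True, False):
    for writing_ansi in (True, False):
        stored = vo_stored_bytes("öÖäÄ", created_ansi, writing_ansi)
        print(created_ansi, writing_ansi, "0x" + stored.hex().upper())
```

Only the created Oem / written Ansi combination yields 0x9499848E; the other three yield 0xF6D6E4C4, matching the table for both RDDs.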
Robert, the sample code tries to use the file with SetAnsi(FALSE), which is the same setting that was used when creating the dbf/index with VO. But this fails while it shouldn't.
I just added some extra code with SetAnsi(TRUE) to show that with this setting the data is actually read correctly in this case.
Robert, unfortunately it still doesn't work, in fact now it's not working even with DBFNTX or no index at all, while it was working before.
Please see this dbf/cdx. These are the files from the original report, generated in VO with SetAnsi(FALSE). Test them with this code in X# (which also uses SetAnsi(FALSE)), both with DBFNTX and DBFCDX. Currently in both cases the umlauts are not read correctly:
FUNCTION Start() AS VOID
LOCAL cDbf AS STRING
RddSetDefault("DBFCDX") // RddSetDefault("DBFNTX")
cDbf := "C:\xSharp\Users\Franz\RBOEM\kgr"
SetAnsi(FALSE)
DbUseArea(,,cDbf)
? FieldGet(2)
LOCAL f AS System.Windows.Forms.Form
f := System.Windows.Forms.Form{}
f:Text := FieldGet(2)
f:ShowDialog()
DbCloseArea()
And here's a full sample generating a similar file in VO and using it in X#. Currently neither DBFCDX nor DBFNTX works correctly:
VO code:
FUNCTION Start() AS INT
LOCAL cDbf AS STRING
SetAnsi(FALSE)
// RddSetDefault("DBFNTX")
RddSetDefault("DBFCDX")
cDbf := "C:\dbf\testansi"
FErase(cDbf + ".cdx")
FErase(cDbf + ".ntx")
DBCREATE(cDbf,{{"FLD","C",20,0}})
DBUSEAREA(,,cDbf)
DBCREATEINDEX(cDbf, "fld")
DBAPPEND()
FIELDPUT(1,"öÖäÄ")
? FIELDGET(1)
? AllTrim(FIELDGET(1)) == "öÖäÄ" // true
DBCLOSEAREA()
wait
RETURN 0
X# code:
FUNCTION Start() AS VOID
LOCAL cDbf AS STRING
// RddSetDefault("DBFNTX")
RddSetDefault("DBFCDX")
cDbf := "C:\dbf\testansi"
SetAnsi(FALSE)
DbUseArea(,,cDbf)
? DbSetIndex(cDbf)
? FieldGet(1)
LOCAL f AS System.Windows.Forms.Form
f := System.Windows.Forms.Form{}
f:Text := FieldGet(1)
f:ShowDialog()
DbCloseArea()
Can you try again?
Great, looks good now! But will also send it to Franz for confirmation.
Franz confirmed it is working now. But I will wait for the additional change we discussed, which enables the new behavior only when SetAnsi() == FALSE, and will then resend for confirmation.
I made an additional change yesterday to ensure that the same codepage is also used when updating memo fields and the index keys in the indexes. The difference from the previous algorithm is that the check for the 'Ansi' flag in the DBF, in combination with the SetAnsi setting, is now done only when opening the file. Once the file is opened it is either in OEM or ANSI mode. Changing the SetAnsi setting later will have no effect until the file is closed and opened again.
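The "decide once at open time" idea can be sketched in Python like this. It illustrates the design only and is not the X# RDD code: the class names are hypothetical, and the conversion rule is the one observed for VO earlier in this thread, which may differ from the final X# rule.

```python
class Runtime:
    """Stand-in for the global state behind SetAnsi() (hypothetical)."""
    ansi = True


class WorkArea:
    def __init__(self, header_has_ansi_bit: bool):
        # Evaluated once when the file is opened; later SetAnsi() changes are not seen.
        self.convert_oem = (not header_has_ansi_bit) and Runtime.ansi

    def encode(self, text: str) -> bytes:
        """Encode field, memo or index key text with the mode fixed at open time."""
        return text.encode("cp437" if self.convert_oem else "cp1252")


Runtime.ansi = True
area = WorkArea(header_has_ansi_bit=False)  # opened in OEM conversion mode
Runtime.ansi = False                        # changed after opening: no effect on 'area'
print(area.encode("ö").hex())

reopened = WorkArea(header_has_ansi_bit=False)  # a fresh open picks up the new setting
print(reopened.encode("ö").hex())
```

Because fields, memos and index keys all go through the same latched mode, they cannot end up in different codepages within one open work area.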
It looks good to me, but will send Franz the new X# installer when it's available for beta testing, for Franz to test with his actual app.