I have noticed something unexpected in the character encoding of certain strings retrieved via pstsdk. I had assumed that everything is returned as Unicode, but that does not seem to be true in all cases. For example, I have a .pst file containing a folder named "Smazaná pošta" (Czech for "Deleted Items"). In my MAPI application, if I retrieve PR_DISPLAY_NAME_W, I get the following bytes:
53 00 6d 00 61 00 7a 00 61 00 6e 00 e1 00 20 00 S.m.a.z.a.n... .
70 00 6f 00 61 01 74 00 61 00 00 p.o.a.t.a..
Note that the lower-case 'a' with acute accent is encoded as "E1 00", and the lower-case
's' with caron is encoded as "61 01", which is correct little-endian UTF-16 (U+00E1 and U+0161).
Using pstsdk, I have retrieved the same property via folder.get_name() and also by
reading property 0x3001 directly. In both cases, the bytes returned are:
53 00 6d 00 61 00 7a 00 61 00 6e 00 e1 ff 20 00 S.m.a.z.a.n... .
70 00 6f 00 9a ff 74 00 61 00 00 p.o...t.a..
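Incidentally, the pattern looks exactly like what you would get if each single-byte ANSI character were widened to a 16-bit code unit through a signed char, which sign-extends it. This is pure speculation on my part, not anything I have verified in the pstsdk source, but a minimal sketch of the effect:

```cpp
#include <cstdint>

// Speculative illustration (not pstsdk code): widening a byte through a
// *signed* char sign-extends it, so 0xE1 becomes 0xFFE1 and 0x9A becomes
// 0xFF9A -- precisely the "low byte + FF" pattern in the dump above.
// Plain ASCII bytes (< 0x80) are unaffected, which would explain why all
// the other characters come through correctly.
inline std::uint16_t widen_via_signed_char(unsigned char b) {
    signed char sc = static_cast<signed char>(b);  // 0xE1 -> -31
    return static_cast<std::uint16_t>(sc);         // -31  -> 0xFFE1
}
```

If this guess is right, it would also explain why only the non-ASCII characters are affected.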
For the two characters in question, the second (high) byte is FF, and the first (low) byte
is the single-byte ANSI encoding of the character (0xE1 is 'á' in Latin-1, and 0x9A is 'š'
in Windows-1250). So I suppose I do have a means of correcting the problem, i.e. looking
for a high byte of FF and then doing a one-byte code-page conversion, but I would like to
understand why pstsdk is not supplying the same bytes that MAPI does. Is MAPI performing
some internal conversion on behalf of applications? Is my suggested workaround safe in all cases?
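To make the question concrete, here is roughly the workaround I have in mind. The two-entry table is just a stand-in for the Windows-1250 mappings needed by my example; a real implementation would presumably convert through the PST's actual ANSI code page (e.g. with MultiByteToWideChar on Windows):

```cpp
#include <map>
#include <string>

// Sketch of the proposed workaround: when the high byte of a 16-bit code
// unit is 0xFF, treat the low byte as an ANSI character and map it through
// the code page. The table below covers only the two Windows-1250 bytes
// from my example; it is a placeholder, not a full code-page conversion.
inline std::wstring fix_ff_code_units(const std::wstring& in) {
    static const std::map<unsigned char, wchar_t> cp1250_subset = {
        { 0xE1, 0x00E1 },  // 'a' with acute
        { 0x9A, 0x0161 },  // 's' with caron
    };
    std::wstring out;
    out.reserve(in.size());
    for (wchar_t ch : in) {
        if ((static_cast<unsigned long>(ch) & 0xFF00ul) == 0xFF00ul) {
            const auto it =
                cp1250_subset.find(static_cast<unsigned char>(ch & 0xFF));
            out.push_back(it != cp1250_subset.end() ? it->second : ch);
        } else {
            out.push_back(ch);
        }
    }
    return out;
}
```

One thing that worries me about this approach: legitimate code points in the U+FF00-U+FFEF range (the halfwidth/fullwidth forms used with some Asian text) would match the high-byte test and be mangled, which is part of why I am asking whether it is safe in all cases.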
I have tested other .pst files containing Asian character sets and have not had this problem with them.