Good morning! Here are two draft patches. These still need to be tested on multiple platforms, but I thought it would be good to ask for feedback now instead of later. :-) Once you've had a chance to suggest corrections, I'll send you a single big patch
Fix alignment of bth_nonleaf_entry under 32-bit Mac GCC
This is a fairly simple build patch for 32-bit MacOS X. We'll test this on 32-bit and 64-bit Linux before I officially submit it.
Convert ANSI string to Unicode wstring by assuming code page 1252
This is a much more dubious and debatable patch. :-) Our application calls 'props->read_prop<wstring>' quite often, and this works fine for Unicode PSTs. It also works for ANSI PSTs that contain ASCII text. But at least on my version
of GCC, it fails for all characters between 0x80 and 0xFF, inclusive. Specifically, a character like 0xA7 gets sign-extended to 0xFFA7, instead of being converted to 0x00A7.
Since ANSI PSTs apparently don't contain reliable code page information, there's no way to make 'props->read_prop<wstring>' do the right thing automatically. But the current code
appears to try to support ANSI PSTs using code page 1252, which are the most common ANSI PSTs. So my patch adds an explicit conversion function and some unit tests. It also tries to do the right thing with the tricky characters between 0x80 and
0x9F, inclusive, using the conversion tables Microsoft submitted to the Unicode consortium. Using this patch, it's possible to access code page 1252 ANSI PSTs using 'props->read_prop<wstring>'. Other code pages will still require calling 'props->read_prop<string>'
and converting strings using an external library.
Obviously, this patch is still neither complete nor correct, but it might arguably be a step in the right direction. Please feel free to tell me that I'm nuts for even suggesting it. :-)
As always, thank you for such an excellent library, and for all your feedback!