Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: 2.5.0
Affects Version/s: None
Component/s: None
Labels:
- Windows
Environment:
Windows command line with text that isn't completely US-ASCII

Operating System:
Windows
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

Text characters above 0x7F entered on the command line for mongod.exe, mongos.exe, mongo.exe and the other programs in the suite are not always handled correctly on Windows. Although we build the Windows versions with UNICODE and _UNICODE defined, the entry point we declare is main(), which gives us text in the 8-bit code page of the invoking command window. To get wide-character UTF-16 strings we would need to change the entry point to wmain(), which in turn would require using a wide version of boost::program_options to parse the 16-bit characters.

The misbehavior seen depends on the code page of the invoking command window. In US English versions of Windows, you get the DOS-compatible code page 437 unless you have changed your configuration. In Western European versions of Windows you may get code page 1252, which matches ISO Latin 1, and therefore Unicode, for characters from 0xA0 to 0xFF (the two differ only in the 0x80 to 0x9F range).

Beyond these issues, there may be places where data isn't handled correctly: I found a few in the Windows Service code and am fixing them. We were getting sign-extension of characters between 0x80 and 0xFF, which turned 0xE1 ("LATIN SMALL LETTER A WITH ACUTE", 'á') into U+FFE1 (displays as "FULLWIDTH POUND SIGN", '￡').
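The sign-extension problem described above can be reproduced in a few lines. This is a minimal sketch, not the actual Windows Service code: the function names `widenSignExtended` and `widenZeroExtended` are hypothetical, and the cast through `signed char` stands in for a compiler where plain `char` is signed (the MSVC default).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper showing the bug: widening a signed 8-bit value
// copies the sign bit into the upper byte, so 0xE1 becomes 0xFFE1.
std::uint16_t widenSignExtended(char c) {
    return static_cast<std::uint16_t>(static_cast<signed char>(c));
}

// Hypothetical helper showing the usual fix: go through unsigned char
// first, so the upper byte is filled with zeros instead.
std::uint16_t widenZeroExtended(char c) {
    return static_cast<std::uint16_t>(static_cast<unsigned char>(c));
}

// Small self-check using the byte from the report.
void demoWidening() {
    assert(widenSignExtended('\xE1') == 0xFFE1);  // FULLWIDTH POUND SIGN
    assert(widenZeroExtended('\xE1') == 0x00E1);  // LATIN SMALL LETTER A WITH ACUTE
    assert(widenSignExtended('A') == 0x0041);     // ASCII is unaffected either way
}
```

Note that zero-extension only yields the right Unicode code point when the source byte is already in a Latin-1-compatible encoding; a byte from code page 437 would still need a proper code page conversion.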

This may not be an issue for some users (US-only, or European/UK users using code page 1252) but the issue is likely to pop up repeatedly until we make the code fully Unicode-capable.
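One way the wmain() approach from the description might look is sketched below. This is an assumption-laden illustration, not the change that was actually made for this ticket: `utf16ToUtf8` and `argsToUtf8` are hypothetical names, and the conversion is hand-written so the sketch stays portable. It converts the UTF-16 arguments to UTF-8 so that the rest of the program can keep using 8-bit strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: convert UTF-16 to UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

#ifdef _WIN32
// Hypothetical helper intended to be called from
//   int wmain(int argc, wchar_t* argv[])
// On Windows, wchar_t is 16 bits wide and argv holds UTF-16 text.
std::vector<std::string> argsToUtf8(int argc, wchar_t* argv[]) {
    std::vector<std::string> args;
    for (int i = 0; i < argc; ++i) {
        std::wstring w(argv[i]);
        args.push_back(utf16ToUtf8(std::u16string(w.begin(), w.end())));
    }
    return args;
}
#endif

// Small self-check: 'á' (U+00E1) encodes to the two bytes C3 A1.
void demoConversion() {
    assert(utf16ToUtf8(u"\u00E1") == "\xC3\xA1");
}
```

With this shape the result no longer depends on the code page of the invoking command window. The alternative mentioned in the description, parsing the wide strings directly with a wide version of boost::program_options, would avoid the conversion step but touches more of the option-handling code.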

is depended on by

SERVER-5333 Issues with non-ASCII characters in filenames and paths in Windows

Closed

related to

SERVER-7496 Mongo.exe client crashes when username of home directory contains a unicode character

Closed

Assignee: Unassigned
Reporter: Tad Marshall
Participants: auto, Tad Marshall
Votes: 1
Watchers: 3

Created: Feb 26 2012 12:09:01 AM UTC
Updated: Jul 11 2016 06:33:33 PM UTC
Resolved: Mar 16 2013 07:47:00 PM UTC
Confidence Status Last Update: 15/Mar/13 1:07 PM
