
digitalmars.D.bugs - [Issue 4483] New: Make foreach over string or wstring where element type not specified a warning

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4483

           Summary: Make foreach over string or wstring where element type
                    not specified a warning
           Product: D
           Version: D2
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: jmdavisProg gmail.com



07:43:26 PDT ---
Strings are dealt with fantastically in D overall, but the very nature of UTF-8
and UTF-16 makes it easy to screw things up when trying to deal with individual
characters. In foreach loops, D can convert the chars or wchars to dchar and
avoid the problem.

foreach(dchar c; str)

However, if you fail to give an element type

foreach(c; str)

then the element type will be the element type of the array. Normally this is
good, but for strings, wstrings, char[], and wchar[] this is generally _bad_.
In most cases, what people will likely want is to loop over each code point
rather than each code unit - that is, what they really should be using is

foreach(dchar c; str)

But if they fail to put the element type, that's not what they'll get. So, I
propose that it become a warning whenever someone tries to loop over a char[],
wchar[], string, or wstring without giving the element type. That way, the
programmer will realize that they should have put dchar, since that's almost
certainly what they wanted. If, however, they really did want char or wchar,
then they're free to put that as the element type and iterate normally. But
because iterating over the chars or wchars is almost certainly _not_ what the
programmer wants, it seems to me that it would be worthwhile to give a warning
in cases where they fail to put the element type. It doesn't restrict the
programmer any, but it will help them avoid bugs with regard to the UTF-8 and
UTF-16 encodings.
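
To make the difference concrete, here is a minimal sketch of the two loop
forms on a UTF-8 string. The function names countInferred and countDecoded are
made up for illustration; only the foreach behavior itself comes from the
language.

```d
// Hypothetical helper: element type left out, so it is inferred from the
// array and the loop visits UTF-8 code units.
size_t countInferred(string s)
{
    size_t n = 0;
    foreach (c; s)
    {
        // c is a single code unit here, one byte wide
        static assert(c.sizeof == 1);
        ++n;
    }
    return n;
}

// Hypothetical helper: element type given as dchar, so the loop decodes
// the string and visits code points.
size_t countDecoded(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

unittest
{
    // U+00E9 (e with acute accent) takes two code units in UTF-8.
    string s = "h\u00E9llo";
    assert(countInferred(s) == 6); // code units
    assert(countDecoded(s) == 5);  // code points
}
```

The unittest block runs when the file is built with dmd's -unittest switch
(plus -main if there is no main function). Any code in the first loop that
treats c as a whole character would misbehave on the accented letter, which is
the kind of bug the proposed warning is meant to catch.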

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 18 2010
parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4483


Max <awishformore gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |awishformore gmail.com



I agree.

I'm absolutely happy with the way D handles foreach loops on strings and
char[].

However, disambiguation of foreach loops for char and wchar is an issue that
should be addressed. I think the best solution would be to make it the coding
standard to specify the char/dchar/wchar.

Issuing a warning whenever you iterate over a string or wstring without
explicitly declaring char/dchar/wchar would indeed be the way to do so. The
user is still free to ignore the warning, but accidental misuse would be
avoided and a consistently clear coding style would be encouraged.
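
As a rough illustration of that coding style, the sketch below always spells
out the element type when looping over a wstring. The helper names countUnits
and countPoints are hypothetical.

```d
// Hypothetical helper: explicit wchar, so the loop visits UTF-16 code units.
size_t countUnits(wstring s)
{
    size_t n = 0;
    foreach (wchar c; s)
        ++n;
    return n;
}

// Hypothetical helper: explicit dchar, so the loop visits decoded code points.
size_t countPoints(wstring s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

unittest
{
    // U+1D11E (musical G clef) needs a surrogate pair in UTF-16.
    wstring s = "a\U0001D11Eb"w;
    assert(countUnits(s) == 4);  // 'a', two surrogates, 'b'
    assert(countPoints(s) == 3); // 'a', the clef, 'b'
}
```

With the type written out in both loops, a reader can tell at a glance which
unit each one iterates over, and neither loop would trigger the proposed
warning.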

Jul 20 2010