digitalmars.D.bugs - [Issue 22187] New: std.utf.byUTF lags behind underlying streams by one codepoint

https://issues.dlang.org/show_bug.cgi?id=22187

          Issue ID: 22187
           Summary: std.utf.byUTF lags behind underlying streams by one
                    codepoint
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: phobos
          Assignee: nobody puremagic.com
          Reporter: hr hrhr.dev

I've got an InputRange!char which relies on user input from a C library
(ncurses) and I wanted to iterate over it by dchar, so I used std.utf.byDchar
(an alias for std.utf.byUTF!dchar) to wrap it, and I tried to terminate parsing
when I received the newline character from the dchar stream. However, as a
user, I had to type "my text\n[any other character]" before parsing stopped,
even though the underlying stream was able to stop at "my text\n". This seems
to be because, right after byUTF decodes a codepoint (I think for char ->
dchar, this is done in decodeFront), it calls popFront on the underlying
range. With my range, that call blocks until the user inputs another
character.

A simple way to fix this would be to store the decoded front codepoint in a
buffer and leave the underlying range un-popped; then, when the next codepoint
is requested, run the deferred popFront before reading and decoding it.
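
A minimal sketch of that idea follows. It is not the Phobos implementation:
FakeInput, LazyDchars, lazyDchars and unitsFetchedForLine are hypothetical
names made up for illustration, the decoder assumes well-formed UTF-8 and does
no validation, and FakeInput only counts how many code units it has had to
supply, standing in for a source whose popFront would block.

```d
/// Hypothetical stand-in for a blocking char source. `fetched` counts how
/// many code units the "device" has had to supply so far.
struct FakeInput
{
    string data;
    size_t pos;
    size_t fetched;

    this(string text)
    {
        data = text;
        fetched = text.length ? 1 : 0; // the first unit is read up front
    }

    @property bool empty() const { return pos >= data.length; }
    @property char front() const { return data[pos]; }

    void popFront()
    {
        ++pos;                          // a real source would block here
        if (pos < data.length) ++fetched;
    }
}

/// Decodes UTF-8 to dchar, but leaves the last code unit of each code point
/// un-popped until more input is actually requested.
struct LazyDchars(R)
{
    R source;
    bool pendingPop; // last unit of the previous code point is still at source.front
    bool haveCur;    // `cur` holds an already decoded code point
    dchar cur;

    private void settle()
    {
        if (pendingPop) { source.popFront(); pendingPop = false; }
    }

    private void decode()
    {
        settle();
        immutable ubyte lead = cast(ubyte) source.front;
        // Continuation units implied by the lead byte (input assumed valid).
        immutable uint extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        uint cp = lead;
        if (extra > 0) cp &= 0x3F >> extra; // keep only the payload bits
        foreach (_; 0 .. extra)
        {
            source.popFront();              // step onto a continuation unit
            cp = (cp << 6) | (cast(ubyte) source.front & 0x3F);
        }
        cur = cast(dchar) cp;
        haveCur = true;
        pendingPop = true;                  // defer the final popFront
    }

    @property bool empty()
    {
        if (haveCur) return false;
        settle();
        return source.empty;
    }

    @property dchar front()
    {
        if (!haveCur) decode();
        return cur;
    }

    void popFront()
    {
        if (!haveCur) decode();
        haveCur = false;
    }
}

auto lazyDchars(R)(R source) { return LazyDchars!R(source); }

/// Reads up to the first newline and returns how many code units the
/// source had to supply to get there.
size_t unitsFetchedForLine(string input)
{
    auto chars = lazyDchars(FakeInput(input));
    for (; !chars.empty; chars.popFront())
        if (chars.front == '\n') break;
    return chars.source.fetched;
}

void main()
{
    // "my text\n" is 8 code units; the trailing 'X' is never requested.
    assert(unitsFetchedForLine("my text\nX") == 8);
}
```

In this sketch the wrapper stops asking the source for input as soon as the
newline has been decoded. The trade-off is that empty has to perform the
deferred popFront, so a caller that checks empty after the newline would
still wait for more input.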

I may be able to help write a patch, but I'm not very familiar with all the
language features std.utf uses.

--
Aug 06 2021