digitalmars.D - Second Round CURL Wrapper Review

dsimcha (14/14) Dec 02 2011 I volunteered ages ago to manage the review for the second round of

mta`chrono (3/6) Dec 02 2011 What about:

Jonas Drewsen (4/10) Dec 03 2011 I would really hate if phobos would be depending on deimos. As I see it
Kapps (5/11) Dec 03 2011 Deimos is meant to be something you grab parts of as necessary; as such,...

Vladimir Panteleev (15/23) Dec 02 2011 Yes, please! Finding/building a library compatible with OPTLINK is a

Jonas Drewsen (18/28) Dec 03 2011 When the all buffers have been filled yes. I'll state that explicitely

Vladimir Panteleev (12/17) Dec 03 2011 I can't think of a realistic use case for the current asynchronous API. ...

Jonas Drewsen (18/31) Dec 03 2011 The standard example is downloading some content and saving it at the

Vladimir Panteleev (9/41) Dec 03 2011 Well, this makes sense from a theoretical / high-level perspective, but ...

Jonas Drewsen (6/48) Dec 04 2011 Read/write buffers does indeed help a lot and there have been quite some...

Vladimir Panteleev (9/41) Dec 04 2011 If you're referring to the discussion which was in the context of copyin...

Jonas Drewsen (12/52) Dec 04 2011 The same applies here because all it comes down to in the end is the

Vladimir Panteleev (9/19) Dec 05 2011 I still don't see how this could ever be useful in practice without the ...

Jonas Drewsen (18/36) Dec 05 2011 It is - but as stated it all depends on buffer sizes and IO speed. It is...

Vladimir Panteleev (6/20) Dec 05 2011 Thanks. Perhaps include this example in the documentation?
Somedude (3/8) Dec 12 2011 Just an idea: would it be possible/useful to use the signals/slots

jdrewsen (25/34) Dec 12 2011 This would be most useful if there were some kind of main loop.

David Nadlinger (5/13) Dec 12 2011 I don't know if you already have a solution in the works, but maybe the

dsimcha (14/19) Dec 12 2011 Doesn't std.parallelism's task parallelism API work for this?
jdrewsen (6/26) Dec 13 2011 I don't have any plans for this in the curl stuff. But it is

Jonas Drewsen (4/18) Dec 03 2011 Thank you for stepping up again even though I know you have plenty of
Jonas Drewsen (5/11) Dec 03 2011 If there are no licensing issues I think it should. At least until we
Jacob Carlborg (4/13) Dec 04 2011 I vote for std.net.curl.

Marco Leise (4/18) Dec 04 2011 Hey you std.net.curl people must be cheating. You were 50% of the etc.cu...

Vladimir Panteleev (6/27) Dec 04 2011 That's internet polls for you. I think we should keep voting as NG

Marco Leise (8/36) Dec 05 2011 Polling on the NG has two problems:

Vladimir Panteleev (8/13) Dec 05 2011 One fake vote is unlikely to swerve the result in either direction, and ...

dsimcha (28/28) Dec 11 2011 Here's my review. Remember, review ends on December 16.

dsimcha (4/6) Dec 11 2011 Oh, one more thing: Factor the protocol detection out into a function.

Jonathan M Davis (6/14) Dec 11 2011 And startsWith will take multiple arguments, so the code definitely shou...

jdrewsen (4/20) Dec 12 2011 I hadn't noticed that multiple arguments is accepted. Thanks.

Andrei Alexandrescu (3/24) Dec 12 2011 You can also pass a predicate for case insensitive comparison.

jdrewsen (3/32) Dec 13 2011 Perfect

Somedude (4/8) Dec 12 2011 I think it's good to remind that this is a wrapper and not a native D
jdrewsen (14/43) Dec 12 2011 will fix

Jakob Ovrum (4/8) Dec 12 2011 I agree with this, it's pointless to limit ourselves to an

Jacob Carlborg (6/18) Dec 12 2011 That's a good question. When I think about it, I think it should be
Jonathan M Davis (5/9) Dec 14 2011 It's too specific to curl for curl to be an implementation detail. I exp...

Somedude (2/10) Dec 12 2011 Yes. This is a must.

bearophile (4/7) Dec 12 2011 I agree it's better to have this battery included.

Jonathan M Davis (84/95) Dec 14 2011 I'd argue that it should be a separate download but that it should defin...

jdrewsen (56/172) Dec 17 2011 AutoConnect sounds like a command to connection automatically

Justin C Calvarese (10/27) Dec 17 2011 I absolutely agree that the "D Style" page should be revised and extende...
Justin C Calvarese (10/26) Dec 17 2011 (I'm sorry if this comes across as a repost, but I guess I don't know ho...
Justin C Calvarese (10/26) Dec 17 2011 (I'm sorry if this comes across as a repost, but I guess I don't know ho...
Jakob Ovrum (4/8) Dec 17 2011 How about something completely different then, like URLInfer?

jdrewsen (4/12) Dec 17 2011 Will change case.

Jakob Ovrum (2/17) Dec 17 2011 Andrei is right, a noun is better. I'm with InferredProtocol.

Jonathan M Davis (4/26) Dec 17 2011 You inferred protocol is better. Still annoyingly long, but it's a clear...

Jonathan M Davis (75/181) Dec 17 2011 I actually have such a pull request, but it's been languishing because t...
Jonathan M Davis (5/36) Dec 17 2011 Yeah. I need to get back to that. Portions of it were completely agreed ...

jdrewsen (5/62) Dec 18 2011 Nice to see that something is in progress. Please also add a note

Jonathan M Davis (64/64) Dec 15 2011 Line# 235 is identical to line# 239. Shouldn't line# 235 be creating an ...

jdrewsen (37/113) Dec 17 2011 It has been tested as you can see by the unittest just below the

Jonathan M Davis (25/81) Dec 17 2011 Well, when there's code like that that is clearly wrong, it makes it loo...

jdrewsen (5/99) Dec 18 2011 A static ternary operator wouldn't work in this case since

Jonathan M Davis (7/9) Dec 18 2011 Ah. That would be true. Still, it's the sort of thing which begs for a t...

Jonathan M Davis (113/113) Dec 16 2011 Please make sure that you remove trailing whitespace from the file. A lo...

jdrewsen (36/172) Dec 30 2011 ok

Jakob Ovrum (13/13) Dec 17 2011 The docs for onReceiveHeader,

jdrewsen (4/17) Dec 30 2011 I agree. Actually I thought that 'in' was just an alias of

Andrei Alexandrescu (67/78) Dec 17 2011 std.net.curl

jdrewsen (40/135) Dec 30 2011 ok

Somedude (5/25) Dec 19 2011 A bit late, but following Andrei's important remark that some operations

Somedude (10/39) Dec 19 2011 It looks like curl_multi_perform's behavior is more interesting than

jdrewsen (8/65) Dec 30 2011 I am aware of the multi_perform API. The multi API actually uses

dsimcha <dsimcha yahoo.com> writes:

I volunteered ages ago to manage the review for the second round of 
Jonas Drewsen's CURL wrapper.  After the first round it was decided 
that, after a large number of minor issues were fixed, a second round 
would be necessary.

Significant open issues:

1.  Should libcurl be bundled with DMD on Windows?

2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a while back 
but it was buried deep in a thread and a lot of people may have missed 
it:  http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

For those of you on Windows, a libcurl binary built by DMC is available 
at http://gool.googlecode.com/files/libcurl_7.21.7.zip.

Review starts now and ends on December 16, followed by one week of voting.

Dec 02 2011

mta`chrono <chrono mta-international.net> writes:

 2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a while back
 but it was buried deep in a thread and a lot of people may have missed
 it:  http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

What about:

Low Level Part --> deimos
High Level Part --> std.curl

Dec 02 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 03/12/11 06.12, mta`chrono wrote:
 2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a while back
 but it was buried deep in a thread and a lot of people may have missed
 it:  http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 What about:

 Low Level Part -->  deimos
 High Level Part -->  std.curl

I would really hate if phobos would be depending on deimos. As I see it 
either both levels go in or none at all.

/Jonas

Dec 03 2011

Kapps <Kapps NotValidEmail.com> writes:

On 02/12/2011 11:12 PM, mta`chrono wrote:
 2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a while back
 but it was buried deep in a thread and a lot of people may have missed
 it:  http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 What about:

 Low Level Part -->  deimos
 High Level Part -->  std.curl

Deimos is meant to be something you grab parts of as necessary; as such, 
it would not (IMO) make too much sense to force something in Phobos to 
be dependent on it, as now the user has to go and get something else 
just to use Phobos.

Dec 03 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sat, 03 Dec 2011 06:26:10 +0200, dsimcha <dsimcha yahoo.com> wrote:

 I volunteered ages ago to manage the review for the second round of  
 Jonas Drewsen's CURL wrapper.  After the first round it was decided  
 that, after a large number of minor issues were fixed, a second round  
 would be necessary.

 Significant open issues:

 1.  Should libcurl be bundled with DMD on Windows?

Yes, please! Finding/building a library compatible with OPTLINK is a  
needless hassle.

 Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
 Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

I think the docs for the asynchronous functions need a bit more  
elaboration. For example, I had these questions after reading the  
documentation:

Does the working thread block or buffer when data is available but the  
main thread isn't reading it?
The examples for byLine and byLineAsync are almost identical. Is there any  
difference in functionality?

Is it possible to check if data is available without blocking? If not,  
what's the point of byLineAsync (why not use byLine)?

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Dec 02 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 03/12/11 08.56, Vladimir Panteleev wrote:
 On Sat, 03 Dec 2011 06:26:10 +0200, dsimcha <dsimcha yahoo.com> wrote:

<snip>
 I think the docs for the asynchronous functions need a bit more
 elaboration. For example, I had these questions after reading the
 documentation:

 Does the working thread block or buffer when data is available but the
 main thread isn't reading it?

When the all buffers have been filled yes. I'll state that explicitely 
in the docs.

 The examples for byLine and byLineAsync are almost identical. Is there
 any difference in functionality?

As mentioned the async version performs the request in another thread 
leaving the main thread available for you to do something else. I'll 
clarify in the docs that it is when you actually call empty/front on the 
returned range you will get data from the other thread and may be blocking.

 Is it possible to check if data is available without blocking? If not,
 what's the point of byLineAsync (why not use byLine)?

No.

The .empty property needs to wait in order to have the proper range 
behavior. But is does seem like a good idea to add a method for this. 
Just need a nice name for it since I guess this will be needed by future 
async ranges as well. Maybe range.wait(duration) returning true if data 
is present after waiting?

Going down this road makes me think that selecting on async ranges would 
be a very nice feature. But that is definitely not something that will 
get included in the version getting reviewed now.

/Jonas

Dec 03 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sat, 03 Dec 2011 13:53:16 +0200, Jonas Drewsen <jdrewsen nospam.com>  
wrote:

 As mentioned the async version performs the request in another thread  
 leaving the main thread available for you to do something else. I'll  
 clarify in the docs that it is when you actually call empty/front on the  
 returned range you will get data from the other thread and may be  
 blocking.

I can't think of a realistic use case for the current asynchronous API.  
Basically, all you can do is start a request in the background, but you're  
neither notified of the request nor can you poll it to check its status?  
So the only thing you CAN do is ask for a request that you will need in  
the future, and when that future moment comes, block if necessary to get  
the result?

Adding a method to query the request's status would certainly be useful.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Dec 03 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

Den 03-12-2011 13:10, Vladimir Panteleev skrev:
 On Sat, 03 Dec 2011 13:53:16 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:

 As mentioned the async version performs the request in another thread
 leaving the main thread available for you to do something else. I'll
 clarify in the docs that it is when you actually call empty/front on
 the returned range you will get data from the other thread and may be
 blocking.

 I can't think of a realistic use case for the current asynchronous API.
 Basically, all you can do is start a request in the background, but
 you're neither notified of the request nor can you poll it to check its
 status? So the only thing you CAN do is ask for a request that you will
 need in the future, and when that future moment comes, block if
 necessary to get the result?

The standard example is downloading some content and saving it at the 
same time.

While your main thread saves a chunk to disk (or uploads to another 
server) the async thread is busy buffering incoming chunks of data. This 
means that you effectively only wait for the slowest of the two IO 
operations. If you did it synchronously would worst case have to wait 
for all everything to be downloaded and the wait for everything to be 
saved or uploaded.

foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin"))
{
     // While writing to file in this thrad
     // new chunks are downloaded
     // in the background thread
     file.write(chunk);
}

I hope this makes sense.

/Jonas

Dec 03 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sat, 03 Dec 2011 21:17:25 +0200, Jonas Drewsen <jdrewsen nospam.com>  
wrote:

 Den 03-12-2011 13:10, Vladimir Panteleev skrev:
 On Sat, 03 Dec 2011 13:53:16 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:

 As mentioned the async version performs the request in another thread
 leaving the main thread available for you to do something else. I'll
 clarify in the docs that it is when you actually call empty/front on
 the returned range you will get data from the other thread and may be
 blocking.

 I can't think of a realistic use case for the current asynchronous API.
 Basically, all you can do is start a request in the background, but
 you're neither notified of the request nor can you poll it to check its
 status? So the only thing you CAN do is ask for a request that you will
 need in the future, and when that future moment comes, block if
 necessary to get the result?

 The standard example is downloading some content and saving it at the  
 same time.

 While your main thread saves a chunk to disk (or uploads to another  
 server) the async thread is busy buffering incoming chunks of data. This  
 means that you effectively only wait for the slowest of the two IO  
 operations. If you did it synchronously would worst case have to wait  
 for all everything to be downloaded and the wait for everything to be  
 saved or uploaded.

 foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin"))
 {
      // While writing to file in this thrad
      // new chunks are downloaded
      // in the background thread
      file.write(chunk);
 }

 I hope this makes sense.

Well, this makes sense from a theoretical / high-level perspective, but OS  
write buffers greatly reduce the practicality of this. In common use cases  
the speed difference between disk and wire will differ by orders of  
magnitude as well.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Dec 03 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

Den 03-12-2011 21:58, Vladimir Panteleev skrev:
 On Sat, 03 Dec 2011 21:17:25 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:

 Den 03-12-2011 13:10, Vladimir Panteleev skrev:
 On Sat, 03 Dec 2011 13:53:16 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:

 As mentioned the async version performs the request in another thread
 leaving the main thread available for you to do something else. I'll
 clarify in the docs that it is when you actually call empty/front on
 the returned range you will get data from the other thread and may be
 blocking.

 I can't think of a realistic use case for the current asynchronous API.
 Basically, all you can do is start a request in the background, but
 you're neither notified of the request nor can you poll it to check its
 status? So the only thing you CAN do is ask for a request that you will
 need in the future, and when that future moment comes, block if
 necessary to get the result?

 The standard example is downloading some content and saving it at the
 same time.

 While your main thread saves a chunk to disk (or uploads to another
 server) the async thread is busy buffering incoming chunks of data.
 This means that you effectively only wait for the slowest of the two
 IO operations. If you did it synchronously would worst case have to
 wait for all everything to be downloaded and the wait for everything
 to be saved or uploaded.

 foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin"))
 {
 // While writing to file in this thrad
 // new chunks are downloaded
 // in the background thread
 file.write(chunk);
 }

 I hope this makes sense.

 Well, this makes sense from a theoretical / high-level perspective, but
 OS write buffers greatly reduce the practicality of this. In common use
 cases the speed difference between disk and wire will differ by orders
 of magnitude as well.

Read/write buffers does indeed help a lot and there have been quite some 
discussions on this topic earlier in this newsgroups regarding tradeoffs 
etc. Please have a look at these threads (i dont have any links at hand 
unfortunately).

/Jonas

Dec 04 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sun, 04 Dec 2011 12:59:15 +0200, Jonas Drewsen <jdrewsen nospam.com>  
wrote:

 Den 03-12-2011 21:58, Vladimir Panteleev skrev:
 On Sat, 03 Dec 2011 21:17:25 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:
 The standard example is downloading some content and saving it at the
 same time.

 While your main thread saves a chunk to disk (or uploads to another
 server) the async thread is busy buffering incoming chunks of data.
 This means that you effectively only wait for the slowest of the two
 IO operations. If you did it synchronously would worst case have to
 wait for all everything to be downloaded and the wait for everything
 to be saved or uploaded.

 foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin"))
 {
 // While writing to file in this thrad
 // new chunks are downloaded
 // in the background thread
 file.write(chunk);
 }

 I hope this makes sense.

 Well, this makes sense from a theoretical / high-level perspective, but
 OS write buffers greatly reduce the practicality of this. In common use
 cases the speed difference between disk and wire will differ by orders
 of magnitude as well.

 Read/write buffers does indeed help a lot and there have been quite some  
 discussions on this topic earlier in this newsgroups regarding tradeoffs  
 etc. Please have a look at these threads (i dont have any links at hand  
 unfortunately).

If you're referring to the discussion which was in the context of copying  
files, then I have read it. However, it does not apply to typical use  
cases of curl. The question here is if this example makes any practical  
sense, and by the looks of everything it does not. Or do you disagree?

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Dec 04 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

Den 04-12-2011 12:05, Vladimir Panteleev skrev:
 On Sun, 04 Dec 2011 12:59:15 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:

 Den 03-12-2011 21:58, Vladimir Panteleev skrev:
 On Sat, 03 Dec 2011 21:17:25 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:
 The standard example is downloading some content and saving it at the
 same time.

 While your main thread saves a chunk to disk (or uploads to another
 server) the async thread is busy buffering incoming chunks of data.
 This means that you effectively only wait for the slowest of the two
 IO operations. If you did it synchronously would worst case have to
 wait for all everything to be downloaded and the wait for everything
 to be saved or uploaded.

 foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin"))
 {
 // While writing to file in this thrad
 // new chunks are downloaded
 // in the background thread
 file.write(chunk);
 }

 I hope this makes sense.

 Well, this makes sense from a theoretical / high-level perspective, but
 OS write buffers greatly reduce the practicality of this. In common use
 cases the speed difference between disk and wire will differ by orders
 of magnitude as well.

 Read/write buffers does indeed help a lot and there have been quite
 some discussions on this topic earlier in this newsgroups regarding
 tradeoffs etc. Please have a look at these threads (i dont have any
 links at hand unfortunately).

 If you're referring to the discussion which was in the context of
 copying files, then I have read it. However, it does not apply to
 typical use cases of curl. The question here is if this example makes
 any practical sense, and by the looks of everything it does not. Or do
 you disagree?

The same applies here because all it comes down to in the end is the 
sizes of buffers.

The async ranges simply allows you to fill a specified number buffers in 
another thread async. Most OSes also have socket buffers that serves the 
same purpose but async ranges allows you to specify the buffer size 
without being privileged.

The standard max read buffer size for a tcp connection on the ubuntu I 
have is set to 112640 which is doubled by the kernel to 225280 bytes. A 
developer may very well like buffers larger that this without needing to 
set privileged kernel variables.

/Jonas

Dec 04 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sun, 04 Dec 2011 22:52:13 +0200, Jonas Drewsen <jdrewsen nospam.com>  
wrote:

 The same applies here because all it comes down to in the end is the  
 sizes of buffers.

 The async ranges simply allows you to fill a specified number buffers in  
 another thread async. Most OSes also have socket buffers that serves the  
 same purpose but async ranges allows you to specify the buffer size  
 without being privileged.

 The standard max read buffer size for a tcp connection on the ubuntu I  
 have is set to 112640 which is doubled by the kernel to 225280 bytes. A  
 developer may very well like buffers larger that this without needing to  
 set privileged kernel variables.

I still don't see how this could ever be useful in practice without the  
ability to poll for whether data is available. Is it possible to write two  
short programs that use the synchronous and asynchronous APIs in a way  
that makes a difference?

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Dec 05 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

Den 05-12-2011 10:26, Vladimir Panteleev skrev:
 On Sun, 04 Dec 2011 22:52:13 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:

 The same applies here because all it comes down to in the end is the
 sizes of buffers.

 The async ranges simply allows you to fill a specified number buffers
 in another thread async. Most OSes also have socket buffers that
 serves the same purpose but async ranges allows you to specify the
 buffer size without being privileged.

 The standard max read buffer size for a tcp connection on the ubuntu I
 have is set to 112640 which is doubled by the kernel to 225280 bytes.
 A developer may very well like buffers larger that this without
 needing to set privileged kernel variables.

 I still don't see how this could ever be useful in practice without the
 ability to poll for whether data is available. Is it possible to write
 two short programs that use the synchronous and asynchronous APIs in a
 way that makes a difference?

It is - but as stated it all depends on buffer sizes and IO speed. It is 
for the same reason that java recommends using BufferedReader around 
sockets instead of reading directly from the socket stream.

Another example that might show another async range advantage and is 
clearer:

auto f1 = byChunkAsync("www.foo.com/file1.txt", 2^^19);
auto f2 = byChunkAsync("www.foo.com/file2.txt", 2^^19);
auto f3 = byChunkAsync("www.foo.com/file3.txt", 2^^19);

// While this iteration goes is done the file2 and file3 are downloaded
// in the background threads
foreach (l; f1) { writeln(to!string(l)); }

// f2 and f3 are probably downloaded now and no waiting is necessary
foreach (l; f2) { writeln(to!string(l)); }
foreach (l; f3) { writeln(to!string(l)); }

Anyway - I _have_ said that I will add a data availability poll 
functionality so I guess your initial concern is also covered.

/Jonas

Dec 05 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 05 Dec 2011 23:36:27 +0200, Jonas Drewsen <jdrewsen nospam.com>  
wrote:

 It is - but as stated it all depends on buffer sizes and IO speed. It is  
 for the same reason that java recommends using BufferedReader around  
 sockets instead of reading directly from the socket stream.

 Another example that might show another async range advantage and is  
 clearer:

 auto f1 = byChunkAsync("www.foo.com/file1.txt", 2^^19);
 auto f2 = byChunkAsync("www.foo.com/file2.txt", 2^^19);
 auto f3 = byChunkAsync("www.foo.com/file3.txt", 2^^19);

 // While this iteration goes is done the file2 and file3 are downloaded
 // in the background threads
 foreach (l; f1) { writeln(to!string(l)); }

 // f2 and f3 are probably downloaded now and no waiting is necessary
 foreach (l; f2) { writeln(to!string(l)); }
 foreach (l; f3) { writeln(to!string(l)); }

Thanks. Perhaps include this example in the documentation?

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Dec 05 2011

Somedude <lovelydear mailmetrash.com> writes:

Le 05/12/2011 22:36, Jonas Drewsen a écrit :
 
 Anyway - I _have_ said that I will add a data availability poll
 functionality so I guess your initial concern is also covered.
 
 /Jonas

Just an idea: would it be possible/useful to use the signals/slots
mechanism for this kind of synch ?

Dec 12 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Monday, 12 December 2011 at 10:26:52 UTC, Somedude wrote:
 Le 05/12/2011 22:36, Jonas Drewsen a écrit :
 
 Anyway - I _have_ said that I will add a data availability poll
 functionality so I guess your initial concern is also covered.
 
 /Jonas

 Just an idea: would it be possible/useful to use the 
 signals/slots
 mechanism for this kind of synch ?

This would be most useful if there were some kind of main loop. 
This is needed because the request is done in another thread and 
the response should be handled in the main thread. Without an 
main loop in the main thread or a destinct call to e.g. a poll() 
method the signal/slot s would not work.

As a side note I would very much like D to enter a main loop with 
a task scheduler as part of the runtime. That way co-routines and 
event handling would be much nicer to use out of the box. For 
example:

// The main function is running a fiber in the main loop.
// the main loop terminates when no more fibers are active
void main(string[] args) {

    fiber( { auto content= download("google.com"); write(content); 
} );
    fiber( { sleep(2.0); writeln("Hello in 2 secs"); } );

    writeln("first line written");
}

The download() yields until data is downloaded and the sleep() 
yield until the time has passed. You wouldn't notice this when 
not using fibers because you would only have the main() fiber 
which terminates the main loop (and the program) when it is done.

Optimally the runtime should support replacing the standard 
mainloop scheduler with your own.

/Jonas

Dec 12 2011

David Nadlinger <see klickverbot.at> writes:

On 12/12/11 1:18 PM, jdrewsen wrote:
 On Monday, 12 December 2011 at 10:26:52 UTC, Somedude wrote:
 Just an idea: would it be possible/useful to use the signals/slots
 mechanism for this kind of synch ?

 This would be most useful if there were some kind of main loop. This is
 needed because the request is done in another thread and the response
 should be handled in the main thread. Without an main loop in the main
 thread or a destinct call to e.g. a poll() method the signal/slot s
 would not work.

I don't know if you already have a solution in the works, but maybe the 
future interface I did for Thrift is similar to what you are looking 
for: http://klickverbot.at/code/gsoc/thrift/docs/thrift.util.future.html

David

Dec 12 2011

"dsimcha" <dsimcha yahoo.com> writes:

On Tuesday, 13 December 2011 at 00:47:26 UTC, David Nadlinger 
wrote:
 I don't know if you already have a solution in the works, but 
 maybe the future interface I did for Thrift is similar to what 
 you are looking for: 
 http://klickverbot.at/code/gsoc/thrift/docs/thrift.util.future.html

 David

Doesn't std.parallelism's task parallelism API work for this?  
(Roughly speaking a task in std.parallelism == a future in your 
Thrift API.)  If not, what can I do to fix it so that it can?

Looking briefly at your API, one thing I notice is the ability to 
cancel a future.  This would be trivial to implement in 
std.parallelism for tasks that haven't yet started executing, but 
difficult if not impossible for tasks that are already executing. 
  Does your Thrift API allow cancelling futures that are already 
executing?  If so, how is that accomplished?

The TFutureAggregatorRange could be handled by a parallel foreach 
loop if I understand correctly, though it would look a little 
different.

Dec 12 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Tuesday, 13 December 2011 at 00:47:26 UTC, David Nadlinger 
wrote:
 On 12/12/11 1:18 PM, jdrewsen wrote:
 On Monday, 12 December 2011 at 10:26:52 UTC, Somedude wrote:
 Just an idea: would it be possible/useful to use the 
 signals/slots
 mechanism for this kind of synch ?

 This would be most useful if there were some kind of main 
 loop. This is
 needed because the request is done in another thread and the 
 response
 should be handled in the main thread. Without an main loop in 
 the main
 thread or a destinct call to e.g. a poll() method the 
 signal/slot s
 would not work.

 I don't know if you already have a solution in the works, but 
 maybe the future interface I did for Thrift is similar to what 
 you are looking for: 
 http://klickverbot.at/code/gsoc/thrift/docs/thrift.util.future.html

 David

I don't have any plans for this in the curl stuff. But it is 
something like your future implementation and a builtin mainloop 
I think would make async programming much easier in D.

/Jonas

Dec 13 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 03/12/11 05.26, dsimcha wrote:
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper. After the first round it was decided that,
 after a large number of minor issues were fixed, a second round would be
 necessary.

 Significant open issues:

 1. Should libcurl be bundled with DMD on Windows?

 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but
 it was buried deep in a thread and a lot of people may have missed it:
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
 Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 For those of you on Windows, a libcurl binary built by DMC is available
 at http://gool.googlecode.com/files/libcurl_7.21.7.zip.

 Review starts now and ends on December 16, followed by one week of voting.

Thank you for stepping up again even though I know you have plenty of 
other things to do.

/Jonas

Dec 03 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 03/12/11 05.26, dsimcha wrote:
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper. After the first round it was decided that,
 after a large number of minor issues were fixed, a second round would be
 necessary.

 Significant open issues:

 1. Should libcurl be bundled with DMD on Windows?

If there are no licensing issues I think it should. At least until we 
have a proper package management tool in place that could download the 
library automatically when needed on windows.

/Jonas

Dec 03 2011

Jacob Carlborg <doob me.com> writes:

On 2011-12-03 05:26, dsimcha wrote:
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper. After the first round it was decided that,
 after a large number of minor issues were fixed, a second round would be
 necessary.

 Significant open issues:

 1. Should libcurl be bundled with DMD on Windows?

 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but
 it was buried deep in a thread and a lot of people may have missed it:
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

I vote for std.net.curl.


-- 
/Jacob Carlborg

Dec 04 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 04.12.2011, 14:02 Uhr, schrieb Jacob Carlborg <doob me.com>:

 On 2011-12-03 05:26, dsimcha wrote:
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper. After the first round it was decided that,
 after a large number of minor issues were fixed, a second round would be
 necessary.

 Significant open issues:

 1. Should libcurl be bundled with DMD on Windows?

 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but
 it was buried deep in a thread and a lot of people may have missed it:
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 I vote for std.net.curl.

Hey you std.net.curl people must be cheating. You were 50% of the etc.curl  
people in the past, now you are at 90% of them. Or did some change of mind  
happen in the community recently?

Dec 04 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sun, 04 Dec 2011 16:33:52 +0200, Marco Leise <Marco.Leise gmx.de> wrote:

 Am 04.12.2011, 14:02 Uhr, schrieb Jacob Carlborg <doob me.com>:

 On 2011-12-03 05:26, dsimcha wrote:
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper. After the first round it was decided  
 that,
 after a large number of minor issues were fixed, a second round would  
 be
 necessary.

 Significant open issues:

 1. Should libcurl be bundled with DMD on Windows?

 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but
 it was buried deep in a thread and a lot of people may have missed it:
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 I vote for std.net.curl.

 Hey you std.net.curl people must be cheating. You were 50% of the  
 etc.curl people in the past, now you are at 90% of them. Or did some  
 change of mind happen in the community recently?

That's internet polls for you. I think we should keep voting as NG  
replies, the same thing we do when voting on Phobos contributions.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Dec 04 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 04.12.2011, 15:36 Uhr, schrieb Vladimir Panteleev  
<vladimir thecybershadow.net>:

 On Sun, 04 Dec 2011 16:33:52 +0200, Marco Leise <Marco.Leise gmx.de>  
 wrote:

 Am 04.12.2011, 14:02 Uhr, schrieb Jacob Carlborg <doob me.com>:

 On 2011-12-03 05:26, dsimcha wrote:
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper. After the first round it was decided  
 that,
 after a large number of minor issues were fixed, a second round would  
 be
 necessary.

 Significant open issues:

 1. Should libcurl be bundled with DMD on Windows?

 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back  
 but
 it was buried deep in a thread and a lot of people may have missed it:
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 I vote for std.net.curl.

 Hey you std.net.curl people must be cheating. You were 50% of the  
 etc.curl people in the past, now you are at 90% of them. Or did some  
 change of mind happen in the community recently?

 That's internet polls for you. I think we should keep voting as NG  
 replies, the same thing we do when voting on Phobos contributions.

Polling on the NG has two problems:
1. Polls generate a lot of noise here (single word posts).
2. Counting the results must be done manually.

Although voting is not anonymous, you can cheat here as well, by using two  
names and email addresses. I'll have future polls allow only one vote per  
IP address in any case.

Dec 05 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 05 Dec 2011 10:12:00 +0200, Marco Leise <Marco.Leise gmx.de> wrote:

 Polling on the NG has two problems:
 1. Polls generate a lot of noise here (single word posts).
 2. Counting the results must be done manually.

These are rather minor problems.

 Although voting is not anonymous, you can cheat here as well, by using  
 two names and email addresses.

One fake vote is unlikely to swerve the result in either direction, and  
abundant abuse is obvious. Web form polls are considerably easier to  
cheat, even with IP limitations.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Dec 05 2011

"dsimcha" <dsimcha yahoo.com> writes:

Here's my review.  Remember, review ends on December 16.

Overall, this library has massively improved due to the rounds of 
review it's been put through.  I only found a few minor nitpicks. 
  However, a recurring pattern is minor grammar mistakes in the 
documentation.  Please proofread all documentation again.

Docs:

"The high level API is build" -> "The high level API is built"

"LibCurl is licensed under a MIT/X derivate license" -> "LibCurl 
is licensed under an MIT/X derivative license"

AutoConnect:  "Connection type used when the url should be used 
to auto detect protocol."  ->  "auto detect THE protocol"

Why is there a link to curl_easy_set_opt in the byLineAsync and 
byChunkAsync docs?

In onSend:  "The length of the void[] specifies the maximum 
number of bytes that can be send." -> "can be SENT"

What is the use case for exposing struct Curl?  I prefer if this 
were unexposed because we'll obviously be unable to provide a 
replacement if/when the backend to this library is rewritten in 
pure D.

Actually, that leads to another question:  Should this module 
really be named etc.curl/std.curl/std.net.curl, or should the 
fact that it currently uses Curl as a backend be relegated to an 
implementation detail?

Code:

pragma(lib) basically doesn't work on Linux because the object 
format doesn't support it.  Don't rely on it.

Should the protocol detection be case-insensitive, i.e. "ftp://" 
== "FTP://"?

Dec 11 2011

dsimcha <dsimcha yahoo.com> writes:

On 12/11/2011 7:53 PM, dsimcha wrote:
 Should the protocol detection be case-insensitive, i.e. "ftp://" ==
 "FTP://"?

Oh, one more thing:  Factor the protocol detection out into a function. 
  You have the same expression cut and pasted everywhere:

if(url.startsWith("ftp://") || url.startsWith("ftps://") ...

Dec 11 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, December 11, 2011 19:55:28 dsimcha wrote:
 On 12/11/2011 7:53 PM, dsimcha wrote:
 Should the protocol detection be case-insensitive, i.e. "ftp://" ==
 "FTP://"?

 
 Oh, one more thing:  Factor the protocol detection out into a function.
   You have the same expression cut and pasted everywhere:
 
 if(url.startsWith("ftp://") || url.startsWith("ftps://") ...

And startsWith will take multiple arguments, so the code definitely shouldn't 
have

x.startsWith(y) || x.startsWith(z) || ...

It's more verbose and less efficient that way.

- Jonathan M Davis

Dec 11 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Monday, 12 December 2011 at 01:05:25 UTC, Jonathan M Davis 
wrote:
 On Sunday, December 11, 2011 19:55:28 dsimcha wrote:
 On 12/11/2011 7:53 PM, dsimcha wrote:
 Should the protocol detection be case-insensitive, i.e. 
 "ftp://" ==
 "FTP://"?

 
 Oh, one more thing:  Factor the protocol detection out into a 
 function.
 You have the same expression cut and pasted everywhere:
 
 if(url.startsWith("ftp://") || url.startsWith("ftps://") ...

 And startsWith will take multiple arguments, so the code 
 definitely shouldn't have

 x.startsWith(y) || x.startsWith(z) || ...

 It's more verbose and less efficient that way.

 - Jonathan M Davis

I hadn't noticed that multiple arguments is accepted. Thanks.
/Jonas

Dec 12 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 12/12/11 6:25 AM, jdrewsen wrote:
 On Monday, 12 December 2011 at 01:05:25 UTC, Jonathan M Davis wrote:
 On Sunday, December 11, 2011 19:55:28 dsimcha wrote:
 On 12/11/2011 7:53 PM, dsimcha wrote:
 Should the protocol detection be case-insensitive, i.e. > "ftp://" ==
 "FTP://"?

 Oh, one more thing: Factor the protocol detection out into a function.
 You have the same expression cut and pasted everywhere:

 if(url.startsWith("ftp://") || url.startsWith("ftps://") ...

 And startsWith will take multiple arguments, so the code definitely
 shouldn't have

 x.startsWith(y) || x.startsWith(z) || ...

 It's more verbose and less efficient that way.

 - Jonathan M Davis

 I hadn't noticed that multiple arguments is accepted. Thanks.
 /Jonas

You can also pass a predicate for case insensitive comparison.

Andrei

Dec 12 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Monday, 12 December 2011 at 14:42:17 UTC, Andrei Alexandrescu 
wrote:
 On 12/12/11 6:25 AM, jdrewsen wrote:
 On Monday, 12 December 2011 at 01:05:25 UTC, Jonathan M Davis 
 wrote:
 On Sunday, December 11, 2011 19:55:28 dsimcha wrote:
 On 12/11/2011 7:53 PM, dsimcha wrote:
 Should the protocol detection be case-insensitive, i.e. > 
 "ftp://" ==
 "FTP://"?

 Oh, one more thing: Factor the protocol detection out into a 
 function.
 You have the same expression cut and pasted everywhere:

 if(url.startsWith("ftp://") || url.startsWith("ftps://") ...

 And startsWith will take multiple arguments, so the code 
 definitely
 shouldn't have

 x.startsWith(y) || x.startsWith(z) || ...

 It's more verbose and less efficient that way.

 - Jonathan M Davis

 I hadn't noticed that multiple arguments is accepted. Thanks.
 /Jonas

 You can also pass a predicate for case insensitive comparison.

 Andrei

Perfect

Dec 13 2011

Somedude <lovelydear mailmetrash.com> writes:

Le 12/12/2011 01:53, dsimcha a écrit :
 Actually, that leads to another question:  Should this module really be
 named etc.curl/std.curl/std.net.curl, or should the fact that it
 currently uses Curl as a backend be relegated to an implementation detail?
 

I think it's good to remind that this is a wrapper and not a native D
implementation. At least if I was a user, I would like to know upfront.
So I would keep curl in the name.

Dec 12 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Monday, 12 December 2011 at 00:53:14 UTC, dsimcha wrote:
 Here's my review.  Remember, review ends on December 16.

 Overall, this library has massively improved due to the rounds 
 of review it's been put through.  I only found a few minor 
 nitpicks.
 However, a recurring pattern is minor grammar mistakes in the 
 documentation.  Please proofread all documentation again.

 Docs:

 "The high level API is build" -> "The high level API is built"

 "LibCurl is licensed under a MIT/X derivate license" -> 
 "LibCurl is licensed under an MIT/X derivative license"

 AutoConnect:  "Connection type used when the url should be used 
 to auto detect protocol."  ->  "auto detect THE protocol"

ok

 Why is there a link to curl_easy_set_opt in the byLineAsync and 
 byChunkAsync docs?

will fix

 In onSend:  "The length of the void[] specifies the maximum 
 number of bytes that can be send." -> "can be SENT"

ok

 What is the use case for exposing struct Curl?  I prefer if 
 this were unexposed because we'll obviously be unable to 
 provide a replacement if/when the backend to this library is 
 rewritten in pure D.

Initially it was not exposed but there were a couple of requests 
to expose it. Thats why.

 Actually, that leads to another question:  Should this module 
 really be named etc.curl/std.curl/std.net.curl, or should the 
 fact that it currently uses Curl as a backend be relegated to 
 an implementation detail?

I think it should have curl in the name. I do not believe that a 
native D network library should have the same API as this module. 
It is limited by the design of libcurl and should be improved by 
a native D net library.

 Code:

 pragma(lib) basically doesn't work on Linux because the object 
 format doesn't support it.  Don't rely on it.

ok

 Should the protocol detection be case-insensitive, i.e. 
 "ftp://" == "FTP://"?

yes


Thanks for the feedback
/Jonas

Dec 12 2011

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Monday, 12 December 2011 at 12:24:26 UTC, jdrewsen wrote:
 I think it should have curl in the name. I do not believe that 
 a native D network library should have the same API as this 
 module. It is limited by the design of libcurl and should be 
 improved by a native D net library.

I agree with this, it's pointless to limit ourselves to an 
interface that was designed with the limitations of curl in mind. 
Using a generic name for this module gains us absolutely nothing.

Dec 12 2011

Jacob Carlborg <doob me.com> writes:

On 2011-12-12 01:53, dsimcha wrote:
 Here's my review. Remember, review ends on December 16.

 What is the use case for exposing struct Curl? I prefer if this were
 unexposed because we'll obviously be unable to provide a replacement
 if/when the backend to this library is rewritten in pure D.

 Actually, that leads to another question: Should this module really be
 named etc.curl/std.curl/std.net.curl, or should the fact that it
 currently uses Curl as a backend be relegated to an implementation detail?

That's a good question. When I think about it, I think it should be 
considered an implementation detail.

 Code:

 pragma(lib) basically doesn't work on Linux because the object format
 doesn't support it. Don't rely on it.

 Should the protocol detection be case-insensitive, i.e. "ftp://" ==
 "FTP://"?

I think so.

-- 
/Jacob Carlborg

Dec 12 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, December 12, 2011 01:53:13 dsimcha wrote:
 Actually, that leads to another question:  Should this module
 really be named etc.curl/std.curl/std.net.curl, or should the
 fact that it currently uses Curl as a backend be relegated to an
 implementation detail?

It's too specific to curl for curl to be an implementation detail. I expect 
that if we were designing a general API or one of our own which didn't use 
curl at all, I wouldn't expect the API to be the same.

- Jonathan M Davis

Dec 14 2011

Somedude <lovelydear mailmetrash.com> writes:

Le 03/12/2011 05:26, dsimcha a �crit :
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper.  After the first round it was decided
 that, after a large number of minor issues were fixed, a second round
 would be necessary.
 
 Significant open issues:
 
 1.  Should libcurl be bundled with DMD on Windows?

Yes. This is a must.

Dec 12 2011

bearophile <bearophileHUGS lycos.com> writes:

Somedude:

 1.  Should libcurl be bundled with DMD on Windows?

 
 Yes. This is a must.

I agree it's better to have this battery included.

Bye,
bearophile

Dec 12 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday, December 02, 2011 23:26:10 dsimcha wrote:
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper.  After the first round it was decided
 that, after a large number of minor issues were fixed, a second round
 would be necessary.
 
 Significant open issues:
 
 1.  Should libcurl be bundled with DMD on Windows?

I'd argue that it should be a separate download but that it should definitely 
be provided.

 2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a while back
 but it was buried deep in a thread and a lot of people may have missed
 it:  http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

If we're going to follow a policy that wrappers around 3rd party libraries go 
in etc, then it should be etc.curl. Otherwise, I think that it should be 
std.net.curl.

I'd argue that AutoConnection should be AutoConnect, because it's shorter and 
just as informative.

I'd argue that the acronyms should be in all caps if camelcasing would require 
that the first letter of the acronym be capitalized and all lower case if 
camelcasing would require that the first letter of the acronym be lower case. 
We're not completely consistent in how we using acronyms in names in Phobos, 
but I believe that we primarily follow that rule when putting them in symbol 
names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, 
Ftp, and Smtp.

I'd rename T to C or Char in the template parameters, since it's a character 
type rather than a generic type. Not a huge deal, but I think that it makes 
the code quicker to understand.

Why are some parameters const(char)[] and others string? Why not make them all 
const(char)[] or even actually templatize them on character type (one per 
parameter which is a string)?

Shorten the URL in your examples. Its long length increases the risk of making 
the examples too wide. It's a made up URL, so there's no need for it to be 
that long. Something like foo.bar.com is probably more than enough.

Why does byLine use '\x0a' instead of '\n'? '\n' is clear, whereas most people 
will have to look up what '\x0a' is, so I really think that it should be '\n'. 
Also, if byLine is going to return an internal range type instead of having an 
externally defined one, its documentation needs to explain what Char is used 
for. As it stands, it looks like a pointless template parameter. To do that, 
the return section should probably say something like "A range of Char[]..." 
All of this goes for byLineAsync as well.

In general, the functions should do a better job of documenting their 
parameters. Many of them don't document their parameters at all. For instance, 
while it's fairly easy to guess what keepTerminator and terminator are used 
for, there is no explanation about them whatsoever in either byLine or 
byLineAsync's documentation.

If any of the range return types of these functions have non-standard 
functions on them, they need to be properly documented. e.g. wait(Duration) is 
mentioned in passing in byLineAsync's documentation, but no explanation for 
its purpose is given. The documentation needs to explain everything the 
programmer needs to know about the return types. For most ranges, it's enough 
to say that the return type is a range of X, but if the return type has 
anything other than the standard range functions, then just saying that the 
function returns a range of X is not enough. The extra functions must be 
listed and explained.

I'd warn you that the fact that the documentation for Protocol is showing up 
in the documentation is likely a bug, since Protocol is private - probably the 
bug that all templates are currently public even if they're marked private. 
So, either Protocol needs to be public (or maybe protected), or it's going to 
disappear from the documentation eventually. Another option would be to do 
something similar to std.container.TotalContainer and declare a struct which 
is intended only for documentation (probably in a version(StdDdoc) block), and 
put all of the protocol documenation on _it_ instead of the mixin.

Also, Protocol needs a better explanation as to what it's actually for. "Mixin 
template for all supported curl protocols" doesn't really tell me anything 
other than the fact that it relates to the supported protocols somehow. 
Looking at the source, it seems clear enough, but not from the documentation - 
especially when the documentation seems to imply that it's specific to the Http 
struct.

onReceiveHeader has const(char)[] parameters but mentions char[] in its 
documentation. The meaning is clear enough, but it would be more correct to 
say const(char)[] or to just mention the variable names explicitly.

It would be nice if Smtp.mailTo was variadic  - i.e. mailTo(string[] 
recipients...). All of the current use cases for it would be exactly the same 
with the addition of being able to do something like mailTo(str1, str2, str3).

The exception type's constructors should look like this:

    nothrow this(string msg, string file = __FILE__, size_t line = __LINE__, 
Throwable next = null)
    {
        super(msg, file, line, next);
    }

As it stands, the line number will be completely wrong. You can just copy 
std.string.StringException and rename it. And please use that exact signature. 
Don't make it take a const(char)[]. Exception requires a string, and with the 
constructor taking const(char)[], it's going to needlessly idup every string 
that gets passed into it. The code which calls it can idup if it has a char[] 
instead of an immutable(int)[]. Having it take a const(char)[] adds no value 
and is less efficient. The same goes for any other function which is always 
going to idup its const(char)[] argument. Just make it take a string.

Why do CURLoption and CURLcode  have CURL in uppercase? That doesn't match the 
rest of the module. I'd argue that they should be CurlOption and CurlCode. To 
make matters even weirder, your examples appear to use CurlOption but the 
declarations use CURLoption. They should be consistent.

- Jonathan M Davis

Dec 14 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis 
wrote:
 On Friday, December 02, 2011 23:26:10 dsimcha wrote:
 I volunteered ages ago to manage the review for the second 
 round of
 Jonas Drewsen's CURL wrapper.  After the first round it was 
 decided
 that, after a large number of minor issues were fixed, a 
 second round
 would be necessary.
 
 Significant open issues:
 
 1.  Should libcurl be bundled with DMD on Windows?

 I'd argue that it should be a separate download but that it 
 should definitely be provided.

 2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a 
 while back
 but it was buried deep in a thread and a lot of people may 
 have missed
 it:  
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 If we're going to follow a policy that wrappers around 3rd 
 party libraries go in etc, then it should be etc.curl. 
 Otherwise, I think that it should be std.net.curl.

 I'd argue that AutoConnection should be AutoConnect, because 
 it's shorter and just as informative.

AutoConnect sounds like a command to connection automatically 
which might be confusing since that is not what it does. 
Therefore I went with AutoConnection which I still believe is 
better.

 I'd argue that the acronyms should be in all caps if 
 camelcasing would require that the first letter of the acronym 
 be capitalized and all lower case if camelcasing would require 
 that the first letter of the acronym be lower case. We're not 
 completely consistent in how we using acronyms in names in 
 Phobos, but I believe that we primarily follow that rule when 
 putting them in symbol names. So, for instance, it would be 
 HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.

I like Http better as personal taste. But if HTTP is preferred 
then I'll do that. Actually I think I will make a pull request to 
extend the dlang.org/dstyle.html doc with and example that we can 
point to when someone asks about styling. I've spend too much 
time restyling because there is no such style doc already and 
people are complaining about style in reviews anyway.

 I'd rename T to C or Char in the template parameters, since 
 it's a character type rather than a generic type. Not a huge 
 deal, but I think that it makes the code quicker to understand.

I've named it T because it can also be ubyte. In that case C for 
Char is confusing.

 Why are some parameters const(char)[] and others string? Why 
 not make them all const(char)[] or even actually templatize 
 them on character type (one per parameter which is a string)?

The parameters that are const(char)[] will accept both string and 
char[] which is what I want because libcurl make an internal copy 
of it anyway. If I typed it as string then you would have to idup 
a char[] for the parameter without any reason.
The parameters that are string is internally passed to functions 
(in other std modules) that only accept strings. I could accept 
const(char)[] and idup it in order to have consistant parameter 
types but that gives an unnecessary overhead.

 Shorten the URL in your examples. Its long length increases the 
 risk of making the examples too wide. It's a made up URL, so 
 there's no need for it to be that long. Something like 
 foo.bar.com is probably more than enough.

Actually the long urls point to a google appspot app that fits 
the example. I've done it this way so that people can just 
copy/paste the example and have something working. Not too many 
people have a test HTTP POST server setup for a quick test AFAIK.

 Why does byLine use '\x0a' instead of '\n'? '\n' is clear, 
 whereas most people will have to look up what '\x0a' is, so I 
 really think that it should be '\n'.

I looked at how it was done elsewhere in phobos and did like that:
http://dlang.org/phobos/std_stdio.html#byLine

But I agree with you that \n is better and will change it.

 Also, if byLine is going to return an internal range type 
 instead of having an externally defined one, its documentation 
 needs to explain what Char is used for. As it stands, it looks 
 like a pointless template parameter. To do that, the return 
 section should probably say something like "A range of 
 Char[]..." All of this goes for byLineAsync as well.

Ok.

 In general, the functions should do a better job of documenting 
 their parameters. Many of them don't document their parameters 
 at all. For instance, while it's fairly easy to guess what 
 keepTerminator and terminator are used for, there is no 
 explanation about them whatsoever in either byLine or 
 byLineAsync's documentation.

Again I looked at phobos to see what was current the style:
http://dlang.org/phobos/std_stdio.html#byLine

I'll include some more detail though.

 If any of the range return types of these functions have 
 non-standard functions on them, they need to be properly 
 documented. e.g. wait(Duration) is mentioned in passing in 
 byLineAsync's documentation, but no explanation for its purpose 
 is given. The documentation needs to explain everything the 
 programmer needs to know about the return types. For most 
 ranges, it's enough to say that the return type is a range of 
 X, but if the return type has anything other than the standard 
 range functions, then just saying that the function returns a 
 range of X is not enough. The extra functions must be listed 
 and explained.

This is what is currently documented: "If no data is available 
and the main thread access the range it will block
until data becomes available. An exception to this is the $(D 
wait(Duration))
method on the range. This method will wait at maximum for the 
specified
duration and return true if data is available."

I guess that explains it?

 I'd warn you that the fact that the documentation for Protocol 
 is showing up in the documentation is likely a bug, since 
 Protocol is private - probably the bug that all templates are 
 currently public even if they're marked private. So, either 
 Protocol needs to be public (or maybe protected), or it's going 
 to disappear from the documentation eventually. Another option 
 would be to do something similar to 
 std.container.TotalContainer and declare a struct which is 
 intended only for documentation (probably in a version(StdDdoc) 
 block), and put all of the protocol documenation on _it_ 
 instead of the mixin.

Ok. didn't know about the private template bug. I really think it 
is a bug in ddoc that it should be able to insert mixed in docs 
ie. the Protocol documentation should be mixed into the Http 
documentation.

Anyway - I don't see the TotalContainer docs showing in the 
generated html?

Another workaround is to manually copy the Protocol docs into the 
HTTP/FTP/SMTP structs inside a version(StdDdoc).

 Also, Protocol needs a better explanation as to what it's 
 actually for. "Mixin template for all supported curl protocols" 
 doesn't really tell me anything other than the fact that it 
 relates to the supported protocols somehow. Looking at the 
 source, it seems clear enough, but not from the documentation - 
 especially when the documentation seems to imply that it's 
 specific to the Http struct.

I'll remove the HTTP hint and improve the doc.

 onReceiveHeader has const(char)[] parameters but mentions 
 char[] in its documentation. The meaning is clear enough, but 
 it would be more correct to say const(char)[] or to just 
 mention the variable names explicitly.

ok.

 It would be nice if Smtp.mailTo was variadic  - i.e. 
 mailTo(string[] recipients...). All of the current use cases 
 for it would be exactly the same with the addition of being 
 able to do something like mailTo(str1, str2, str3).

ok

 The exception type's constructors should look like this:

   nothrow this(string msg, string file = __FILE__, size_t line 
 = __LINE__, Throwable next = null)
   {
       super(msg, file, line, next);
   }

 As it stands, the line number will be completely wrong. You can 
 just copy std.string.StringException and rename it. And please 
 use that exact signature. Don't make it take a const(char)[]. 
 Exception requires a string, and with the constructor taking 
 const(char)[], it's going to needlessly idup every string that 
 gets passed into it. The code which calls it can idup if it has 
 a char[] instead of an immutable(int)[]. Having it take a 
 const(char)[] adds no value and is less efficient. The same 
 goes for any other function which is always going to idup its 
 const(char)[] argument. Just make it take a string.

ok

 Why do CURLoption and CURLcode  have CURL in uppercase? That 
 doesn't match the rest of the module. I'd argue that they 
 should be CurlOption and CurlCode. To make matters even 
 weirder, your examples appear to use CurlOption but the 
 declarations use CURLoption. They should be consistent.

ok

Thanks for the comments.

-Jonas

Dec 17 2011

Justin C Calvarese <technocrat7+d gmail.com> writes:

On Sat, Dec 17, 2011 at 5:56 AM, jdrewsen <jdrewsen nospam.com> wrote:

 On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis wrote:

 ...



  I'd argue that the acronyms should be in all caps if camelcasing would
 require that the first letter of the acronym be capitalized and all lower
 case if camelcasing would require that the first letter of the acronym be
 lower case. We're not completely consistent in how we using acronyms in
 names in Phobos, but I believe that we primarily follow that rule when
 putting them in symbol names. So, for instance, it would be HTTP, FTP, and
 SMTP rather than Http, Ftp, and Smtp.

 I like Http better as personal taste. But if HTTP is preferred then I'll
 do that. Actually I think I will make a pull request to extend the
 dlang.org/dstyle.html doc with and example that we can point to when
 someone asks about styling. I've spend too much time restyling because
 there is no such style doc already and people are complaining about style
 in reviews anyway.


I absolutely agree that the "D Style" page should be revised and extended
to reflect the current standards that new Phobos modules are expected to
adhere to.

I found an open pull request that indicates a past effort in this regard:
https://github.com/D-Programming-Language/d-programming-language.org/pull/16

(I thought that another document was also floating around that was a
proposed new "Style Guide for Phobos", but I don't recall who authored it
and I don't remember where it was located.)

jcc7

Dec 17 2011

Justin C Calvarese <jccalvarese gmail.com> writes:

== Quote from jdrewsen (jdrewsen nospam.com)'s article
 On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis
 wrote:

...
 I'd argue that the acronyms should be in all caps if
 camelcasing would require that the first letter of the acronym
 be capitalized and all lower case if camelcasing would require
 that the first letter of the acronym be lower case. We're not
 completely consistent in how we using acronyms in names in
 Phobos, but I believe that we primarily follow that rule when
 putting them in symbol names. So, for instance, it would be
 HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.

 I like Http better as personal taste. But if HTTP is preferred
 then I'll do that. Actually I think I will make a pull request to
 extend the dlang.org/dstyle.html doc with and example that we can
 point to when someone asks about styling. I've spend too much
 time restyling because there is no such style doc already and
 people are complaining about style in reviews anyway.

(I'm sorry if this comes across as a repost, but I guess I don't know how to
post using the mailing list.)

I absolutely agree that the "D Style" page should be revised and extended to
reflect the current standards
that new Phobos modules are expected to adhere to.

I found an open pull request that indicated a past effort in this regard:
https://github.com/D-Programming-Language/d-programming-language.org/pull/16

(I thought that another document was also floating around that was a proposed
new "Style Guide for Phobos",
but I don't recall who authored it and I don't remember where it was located.)

jcc7

Dec 17 2011

Justin C Calvarese <jccalvarese gmail.com> writes:

== Quote from jdrewsen (jdrewsen nospam.com)'s article
 On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis
 wrote:

...
 I'd argue that the acronyms should be in all caps if
 camelcasing would require that the first letter of the acronym
 be capitalized and all lower case if camelcasing would require
 that the first letter of the acronym be lower case. We're not
 completely consistent in how we using acronyms in names in
 Phobos, but I believe that we primarily follow that rule when
 putting them in symbol names. So, for instance, it would be
 HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.

 I like Http better as personal taste. But if HTTP is preferred
 then I'll do that. Actually I think I will make a pull request to
 extend the dlang.org/dstyle.html doc with and example that we can
 point to when someone asks about styling. I've spend too much
 time restyling because there is no such style doc already and
 people are complaining about style in reviews anyway.

(I'm sorry if this comes across as a repost, but I guess I don't know how to
post using the mailing list.)

I absolutely agree that the "D Style" page should be revised and extended to
reflect the current standards
that new Phobos modules are expected to adhere to.

I found an open pull request that indicated a past effort in this regard:
https://github.com/D-Programming-Language/d-programming-language.org/pull/16

(I thought that another document was also floating around that was a proposed
new "Style Guide for Phobos",
but I don't recall who authored it and I don't remember where it was located.)

jcc7

Dec 17 2011

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Saturday, 17 December 2011 at 11:56:04 UTC, jdrewsen wrote:
 AutoConnect sounds like a command to connection automatically 
 which might be confusing since that is not what it does. 
 Therefore I went with AutoConnection which I still believe is 
 better.

How about something completely different then, like URLInfer?

Also, some instances in the docs of "url" in lowercase should 
probably be changed to uppercase.

Dec 17 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Saturday, 17 December 2011 at 20:38:46 UTC, Jakob Ovrum wrote:
 On Saturday, 17 December 2011 at 11:56:04 UTC, jdrewsen wrote:
 AutoConnect sounds like a command to connection automatically 
 which might be confusing since that is not what it does. 
 Therefore I went with AutoConnection which I still believe is 
 better.

 How about something completely different then, like URLInfer?

 Also, some instances in the docs of "url" in lowercase should 
 probably be changed to uppercase.

Will change case.

Don't know if the URLInfer name is better though.

/Jonas

Dec 17 2011

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Saturday, 17 December 2011 at 22:25:10 UTC, jdrewsen wrote:
 On Saturday, 17 December 2011 at 20:38:46 UTC, Jakob Ovrum 
 wrote:
 On Saturday, 17 December 2011 at 11:56:04 UTC, jdrewsen wrote:
 AutoConnect sounds like a command to connection automatically 
 which might be confusing since that is not what it does. 
 Therefore I went with AutoConnection which I still believe is 
 better.

 How about something completely different then, like URLInfer?

 Also, some instances in the docs of "url" in lowercase should 
 probably be changed to uppercase.

 Will change case.

 Don't know if the URLInfer name is better though.

 /Jonas

Andrei is right, a noun is better. I'm with InferredProtocol.

Dec 17 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, December 17, 2011 23:51:08 Jakob Ovrum wrote:
 On Saturday, 17 December 2011 at 22:25:10 UTC, jdrewsen wrote:
 On Saturday, 17 December 2011 at 20:38:46 UTC, Jakob Ovrum
 
 wrote:
 On Saturday, 17 December 2011 at 11:56:04 UTC, jdrewsen wrote:
 AutoConnect sounds like a command to connection automatically
 which might be confusing since that is not what it does.
 Therefore I went with AutoConnection which I still believe is
 better.

 
 How about something completely different then, like URLInfer?
 
 Also, some instances in the docs of "url" in lowercase should
 probably be changed to uppercase.

 
 Will change case.
 
 Don't know if the URLInfer name is better though.
 
 /Jonas

 
 Andrei is right, a noun is better. I'm with InferredProtocol.

You inferred protocol is better. Still annoyingly long, but it's a clearer 
name.

- JonathnM m Davis

Dec 17 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, December 17, 2011 12:56:02 jdrewsen wrote:
 On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis
 
 I'd argue that the acronyms should be in all caps if
 camelcasing would require that the first letter of the acronym
 be capitalized and all lower case if camelcasing would require
 that the first letter of the acronym be lower case. We're not
 completely consistent in how we using acronyms in names in
 Phobos, but I believe that we primarily follow that rule when
 putting them in symbol names. So, for instance, it would be
 HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.

 
 I like Http better as personal taste. But if HTTP is preferred
 then I'll do that. Actually I think I will make a pull request to
 extend the dlang.org/dstyle.html doc with and example that we can
 point to when someone asks about styling. I've spend too much
 time restyling because there is no such style doc already and
 people are complaining about style in reviews anyway.

I actually have such a pull request, but it's been languishing because there 
are some items in it that need to be discussed. I keep forgetting to bring 
them up in the Phobos news group so that they can be decided and we can move 
on with that.

https://github.com/D-Programming-Language/d-programming-language.org/pull/16

 I'd rename T to C or Char in the template parameters, since
 it's a character type rather than a generic type. Not a huge
 deal, but I think that it makes the code quicker to understand.

 
 I've named it T because it can also be ubyte. In that case C for
 Char is confusing.

Yeah. I got that after reading the source, but the needs to be clearer in the 
documentation. Normally, when something is templated the way that that is, 
it's going to be templated on character type. And something which can be 
either ubyte or char but nothing else is rather abnormal. It needs to be 
clearer. And it's possible that at some point, it really should be changed to 
be any character type and ubyte rather than just char or ubyte (depending on 
how that affect efficiency - basically if it's more efficient to make it work
with 
dchar from the get-go rather then get a string and convert it, then it's more 
desirable to make it work with wchar and dchar, but if you're just going to 
have to duplicate it or convert it anyway, then you might as well just 
restrict it to char and ubyte).

 Why are some parameters const(char)[] and others string? Why
 not make them all const(char)[] or even actually templatize
 them on character type (one per parameter which is a string)?

 
 The parameters that are const(char)[] will accept both string and
 char[] which is what I want because libcurl make an internal copy
 of it anyway. If I typed it as string then you would have to idup
 a char[] for the parameter without any reason.
 The parameters that are string is internally passed to functions
 (in other std modules) that only accept strings. I could accept
 const(char)[] and idup it in order to have consistant parameter
 types but that gives an unnecessary overhead.

Okay. In general, functions should either take string or take a range of 
dchar. Taking a range of dchar is the most flexible, so it's generally the best 
(though this often results in specializations within the template for narrow 
strings - std.algorithm does that a lot). If you need a string spefically (e.g. 
all of the std.string functions specifically operate on strings), then taking 
an array that's templated on character type is generally better, because it's 
more flexible.

If you need a string internally, then do _not_ use const(char)[]. or const(C)
[]. When you do that, you're forced to idup the string (or to!string is forced 
to idup it). It's far better to either take string (in which case the caller 
will idup if it's necessary) or to not use const. That way, the string is 
copied only when it actually needs to be copied.

That does sometimes result in functions which take string for some parameters 
and const(char)[] for others, which looks odd to the programmer calling it, 
but that's life I guess. It's less of an issue when ranges are used, however, 
because then they're all ranges and it's just how theyre treated internally, I 
guess.

I don't know how well you've done with this in general without looking over 
the code again. I know that you've generally taken const(char)[] instead of 
string, and sometimes you've iduped that. And those parameters needs to be 
fixed to take string and use to!string so that the iduping isn't always 
necessary. Also, since in most cases, you're ultimately passing the strings to 
the C curl API, the benefit of operating on ranges of dchar instead of strings 
is debatable, so it's not necessarily a problem that you're not taking ranges 
of dchar. You do, however, need to make sure that you use string rather than 
const(char)[] in places where you're iduping so that we can avoid such 
unnecessary allocations.

 Shorten the URL in your examples. Its long length increases the
 risk of making the examples too wide. It's a made up URL, so
 there's no need for it to be that long. Something like
 foo.bar.com is probably more than enough.

 
 Actually the long urls point to a google appspot app that fits
 the example. I've done it this way so that people can just
 copy/paste the example and have something working. Not too many
 people have a test HTTP POST server setup for a quick test AFAIK.

I thought that it was fake. Well, if you can, use a shorter URL. Like dlang 
instead of d-programming-language now that we havethat domain.

 Why does byLine use '\x0a' instead of '\n'? '\n' is clear,
 whereas most people will have to look up what '\x0a' is, so I
 really think that it should be '\n'.

 
 I looked at how it was done elsewhere in phobos and did like that:
 http://dlang.org/phobos/std_stdio.html#byLine
 
 But I agree with you that \n is better and will change it.

std.stdio should be changed then IMHO.

 In general, the functions should do a better job of documenting
 their parameters. Many of them don't document their parameters
 at all. For instance, while it's fairly easy to guess what
 keepTerminator and terminator are used for, there is no
 explanation about them whatsoever in either byLine or
 byLineAsync's documentation.

 
 Again I looked at phobos to see what was current the style:
 http://dlang.org/phobos/std_stdio.html#byLine
 
 I'll include some more detail though.

Phobos' documentation can definitely be improved in some cases, and I think 
that this is one of them.

 If any of the range return types of these functions have
 non-standard functions on them, they need to be properly
 documented. e.g. wait(Duration) is mentioned in passing in
 byLineAsync's documentation, but no explanation for its purpose
 is given. The documentation needs to explain everything the
 programmer needs to know about the return types. For most
 ranges, it's enough to say that the return type is a range of
 X, but if the return type has anything other than the standard
 range functions, then just saying that the function returns a
 range of X is not enough. The extra functions must be listed
 and explained.

 
 This is what is currently documented: "If no data is available
 and the main thread access the range it will block
 until data becomes available. An exception to this is the $(D
 wait(Duration))
 method on the range. This method will wait at maximum for the
 specified
 duration and return true if data is available."
 
 I guess that explains it?

I really think that you need to state specifically that the returned range has 
an the additional wait(Duration) function on it and explain directly what it's 
for and how it's used. Also, put it in the example. The way it states it by 
saying "An exception to this is..." makes the fact that wait(Duration) is 
mentioned at all seem like an afterthought, and while it does give some 
explanation for it, I really don't think that it's clear enough how it's used 
and what it does. It's enough to start to get the idea, but not enough to 
actually use it. An example would probably be the biggest help in that regard, 
but I do think that the function needs to be explicitly documented with a 
description of what it does rather than seemingly mentioning it as an 
afterthought - especially since it also raises the question of what other non-
standard functions the range has that aren't mentioned.

 I'd warn you that the fact that the documentation for Protocol
 is showing up in the documentation is likely a bug, since
 Protocol is private - probably the bug that all templates are
 currently public even if they're marked private. So, either
 Protocol needs to be public (or maybe protected), or it's going
 to disappear from the documentation eventually. Another option
 would be to do something similar to
 std.container.TotalContainer and declare a struct which is
 intended only for documentation (probably in a version(StdDdoc)
 block), and put all of the protocol documenation on _it_
 instead of the mixin.

 
 Ok. didn't know about the private template bug. I really think it
 is a bug in ddoc that it should be able to insert mixed in docs
 ie. the Protocol documentation should be mixed into the Http
 documentation.
 
 Anyway - I don't see the TotalContainer docs showing in the
 generated html?
 
 Another workaround is to manually copy the Protocol docs into the
 HTTP/FTP/SMTP structs inside a version(StdDdoc).

We have at least 2 bugs here. One, that templates are always public, and two, 
that the documentation isn't mixed in. As for TotalContainer, you appear to be 
right. I'm in such a habit of reading the code rather than the docs, that I 
missed that. It's using /* rather than /**, so it doesn't end up in the ddoc.

The best would be if the functions were on their approriate types with StdDoc, 
but at present, that means duplicating all of that documentation, which would 
kind of suck. So, it's up to you. From the perspective of those reading the 
documentation, it would definitely be better if it were on the appropriate 
types though, so I would be inclined to argue that they should go on the types 
and that we can remove them when the mixin bug is fixed, but I don't know what 
others would think about that. At minimum, it needs to be more clearly 
explained that Protocol is a template mixin that is being mixed into Http, 
Ftp, and Smtp, and those have all of Protocol's functions.

- Jonathan M Davis

Dec 17 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, December 17, 2011 09:21:42 Justin C Calvarese wrote:
 On Sat, Dec 17, 2011 at 5:56 AM, jdrewsen <jdrewsen nospam.com> wrote:
 On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis wrote:
 ...


  
  I'd argue that the acronyms should be in all caps if camelcasing would
  
 require that the first letter of the acronym be capitalized and all
 lower case if camelcasing would require that the first letter of the
 acronym be lower case. We're not completely consistent in how we
 using acronyms in names in Phobos, but I believe that we primarily
 follow that rule when putting them in symbol names. So, for instance,
 it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.

 
 I like Http better as personal taste. But if HTTP is preferred then I'll
 do that. Actually I think I will make a pull request to extend the
 dlang.org/dstyle.html doc with and example that we can point to when
 someone asks about styling. I've spend too much time restyling because
 there is no such style doc already and people are complaining about
 style
 in reviews anyway.

 
 I absolutely agree that the "D Style" page should be revised and extended
 to reflect the current standards that new Phobos modules are expected to
 adhere to.
 
 I found an open pull request that indicates a past effort in this regard:
 https://github.com/D-Programming-Language/d-programming-language.org/pull/16
 
 (I thought that another document was also floating around that was a
 proposed new "Style Guide for Phobos", but I don't recall who authored it
 and I don't remember where it was located.)

Yeah. I need to get back to that. Portions of it were completely agreed upon 
and others need further discussion, and I keep forgetting to get those 
discussed among the Phobos devs and get those questions settled.

- Jonathan m Davis

Dec 17 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Sunday, 18 December 2011 at 01:30:10 UTC, Jonathan M Davis 
wrote:
 On Saturday, December 17, 2011 09:21:42 Justin C Calvarese 
 wrote:
 On Sat, Dec 17, 2011 at 5:56 AM, jdrewsen 
 <jdrewsen nospam.com> wrote:
 On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M 
 Davis wrote:
 ...


 
 I'd argue that the acronyms should be in all caps if 
 camelcasing would
 
 require that the first letter of the acronym be capitalized 
 and all
 lower case if camelcasing would require that the first 
 letter of the
 acronym be lower case. We're not completely consistent in 
 how we
 using acronyms in names in Phobos, but I believe that we 
 primarily
 follow that rule when putting them in symbol names. So, for 
 instance,
 it would be HTTP, FTP, and SMTP rather than Http, Ftp, and 
 Smtp.

 
 I like Http better as personal taste. But if HTTP is 
 preferred then I'll
 do that. Actually I think I will make a pull request to 
 extend the
 dlang.org/dstyle.html doc with and example that we can point 
 to when
 someone asks about styling. I've spend too much time 
 restyling because
 there is no such style doc already and people are 
 complaining about
 style
 in reviews anyway.

 
 I absolutely agree that the "D Style" page should be revised 
 and extended
 to reflect the current standards that new Phobos modules are 
 expected to
 adhere to.
 
 I found an open pull request that indicates a past effort in 
 this regard:
 https://github.com/D-Programming-Language/d-programming-language.org/pull/16
 
 (I thought that another document was also floating around that 
 was a
 proposed new "Style Guide for Phobos", but I don't recall who 
 authored it
 and I don't remember where it was located.)

 Yeah. I need to get back to that. Portions of it were 
 completely agreed upon and others need further discussion, and 
 I keep forgetting to get those discussed among the Phobos devs 
 and get those questions settled.

 - Jonathan m Davis

Nice to see that something is in progress. Please also add a note 
about acronyms in names then.

/Jonas

Dec 18 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:


object, not an Ftp object? That mistake definitely makes it look like download 
hasn't been properly tested.

You should create template similar to

template isCurlConn(Conn)
{
    auto isCurlConn = is(Conn : Http) || is(Conn : Ftp) || is(Conn : 
AutoConnection);
}

It would reduce code duplication.

The functions which have a template parameter T = char really need to have 
that parameter properly explained in the documentation. From the 
documentation, I would have thought that it was any character type, but it's 
char or ubyte. There is no hint whatsoever in the documentation that ubyte 
would work. All of the parameters need to be made clear. That goes for all 
functions.

Is there a reason that the functions which have a template argument T which 
can be either a char or a ubyte can't work with immutable char? Glancing at 
_basicHttp, I don't see any reason why T couldn't be immutable char. Yes, it 
would require casting to immutable(char)[], but you're already casting to 
char[], and the data being returned appears to be unique such that it could be 
safely cast to immutable. That being the case, I'd encourage you to not only 
make it work with immutable char but to make immutable char the default 
instead of char.

Once you've fixed the exception types like I requested in my first review, you 
can and should use enforceEx!CurlException(cond, msg) instead of enforce(cond, 
new CurlException(msg)).

it may not matter, since we're dealing with strings here, but I'd argue that 

than strings it definitely can be more efficient, and even with strings, it may 
be, since I think that checking whether the length is 0 (as empty does) is 
slightly more efficient than checking whether it's greater than 0.

Please make sure that opening braces on on their own line. That's the way that 
the rest of Phobos does it and it's one of the few formatting rules that we've 
been trying to use consistently throughout Phobos. For the most part, you get 
it right, but not everywhere - nested functions in particular seem to have 
braces on the same line for some reason.




I have no idea why you keep putting parens around uses of the in operator, but 
it's not necessary and makes the code harder to read IMHO. It's certainly not 
required that you change that, but I'd appreciate it if you did.

I see that regexes are used in the module. Please make sure that they still 
work correctly with the new std.regex. They probably do, but it's not 100% 
backwards compatible.

byLine's template constraint needs to verify that the types of Terminator and 
Char are valid. As it is, I could try and pass it something like an Http 
struct if I wanted to. In general, _all_ templates need to verify that their 
arguments are of the appropriate type for _all_ of their arguments.


findSplit or findSplitAfter prevent it?

With a name like Char on byLine, I'd expect it to take any char type, but not 
only do you not verify the type (as previously mentioned), but you instantiate 
_basicHttp with it, which works with char and ubyte, not any char. If Char is 
going to take char and ubyte specifically, then Char is a bad name for it.



auto result = get(url, isFtpUrl(url) ? FTP() : Http());

It would save several lines of code.



This would be easier if I could comment on the code directly, but I don't seem 
to be able to with the link that dsimcha provided (probably because it's blob; 
there's probably another link which would allow direct commenting, but oh 
well). I'll comment more on the implementation later, but bed beckons...

- Jonathan M Davis

Dec 15 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Thursday, 15 December 2011 at 08:51:29 UTC, Jonathan M Davis 
wrote:

 creating an Http object, not an Ftp object? That mistake 
 definitely makes it look like download hasn't been properly 
 tested.

It has been tested as you can see by the unittest just below the 
function. It just did not fail because libcurl requires the same 
setup for this download situation for both ftp and http. Will fix 
it of course.

 You should create template similar to

 template isCurlConn(Conn)
 {
   auto isCurlConn = is(Conn : Http) || is(Conn : Ftp) || 
 is(Conn : AutoConnection);
 }

 It would reduce code duplication.

ok

 The functions which have a template parameter T = char really 
 need to have that parameter properly explained in the 
 documentation. From the documentation, I would have thought 
 that it was any character type, but it's char or ubyte. There 
 is no hint whatsoever in the documentation that ubyte would 
 work. All of the parameters need to be made clear. That goes 
 for all functions.

ok

 Is there a reason that the functions which have a template 
 argument T which can be either a char or a ubyte can't work 
 with immutable char? Glancing at _basicHttp, I don't see any 
 reason why T couldn't be immutable char. Yes, it would require 
 casting to immutable(char)[], but you're already casting to 
 char[], and the data being returned appears to be unique such 
 that it could be safely cast to immutable. That being the case, 
 I'd encourage you to not only make it work with immutable char 
 but to make immutable char the default instead of char.

Is is indeed unique and can be cast to immutable. I'll add that 
option.
Why do you think immutable char as the default is better than 
char? I know that the return type in that case would be string 
and not char[] - but why is that better?

 Once you've fixed the exception types like I requested in my 
 first review, you can and should use 
 enforceEx!CurlException(cond, msg) instead of enforce(cond, new 
 CurlException(msg)).

ok

 it may not matter, since we're dealing with strings here, but 
 I'd argue that !empty should be used rather than length > 0 

 be more efficient, and even with strings, it may be, since I 
 think that checking whether the length is 0 (as empty does) is 
 slightly more efficient than checking whether it's greater than 
 0.

ok

 Please make sure that opening braces on on their own line. 
 That's the way that the rest of Phobos does it and it's one of 
 the few formatting rules that we've been trying to use 
 consistently throughout Phobos. For the most part, you get it 
 right, but not everywhere - nested functions in particular seem 
 to have braces on the same line for some reason.

When I look through the code it seems ok with regards to braces 
on own line. Maybe by nested functions you mean delegates? The 
delegates I use do indeed have braces on same line which is ok 
afaik.

 It's not a big deal, but you should use auto more. For 

 could.

I'm a bit in doubt. On one hand it is great to make everything 
auto in order to make it easy to change type and to remove 
redundancy. One the other hand it is very convenient to be able 
to see the type when you read the code. I know that a clever 
editor would be able to figure out the type for me. But I also 
read code in normal editors for diffs etc. or on github.

Anyway - I've changed as much as possible to auto now.

 I have no idea why you keep putting parens around uses of the 
 in operator, but it's not necessary and makes the code harder 
 to read IMHO. It's certainly not required that you change that, 
 but I'd appreciate it if you did.

Bad habit. Will remove.

 I see that regexes are used in the module. Please make sure 
 that they still work correctly with the new std.regex. They 
 probably do, but it's not 100% backwards compatible.

ok

 byLine's template constraint needs to verify that the types of 
 Terminator and Char are valid. As it is, I could try and pass 
 it something like an Http struct if I wanted to. In general, 
 _all_ templates need to verify that their arguments are of the 
 appropriate type for _all_ of their arguments.

ok


 front are. Does findSplit or findSplitAfter prevent it?

exactly

 With a name like Char on byLine, I'd expect it to take any char 
 type, but not only do you not verify the type (as previously 
 mentioned), but you instantiate _basicHttp with it, which works 
 with char and ubyte, not any char. If Char is going to take 
 char and ubyte specifically, then Char is a bad name for it.

I'll verify the type. For byLine it will only accept char types.



 auto result = get(url, isFtpUrl(url) ? FTP() : Http());

Not possible. get() is a template function where the template 
parameter is connection type ie. one of FTP or HTTP. 
isFtpUrl(url) is evaluated at runtime and not compile time.

 It would save several lines of code.



ok

 This would be easier if I could comment on the code directly, 
 but I don't seem to be able to with the link that dsimcha 
 provided (probably because it's blob; there's probably another 
 link which would allow direct commenting, but oh well). I'll 
 comment more on the implementation later, but bed beckons...

 - Jonathan M Davis

Thanks for this second round of comments

/Jonas

Dec 17 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, December 17, 2011 23:10:00 jdrewsen wrote:
 On Thursday, 15 December 2011 at 08:51:29 UTC, Jonathan M Davis
 
 wrote:

 creating an Http object, not an Ftp object? That mistake
 definitely makes it look like download hasn't been properly
 tested.

 
 It has been tested as you can see by the unittest just below the
 function. It just did not fail because libcurl requires the same
 setup for this download situation for both ftp and http. Will fix
 it of course.

Well, when there's code like that that is clearly wrong, it makes it look like 
it's either not being tested by the unit tests or that the unit tests haven't 
been run. I guess that this is just a case where the unit tests are unable to 
catch the problem.

 Is there a reason that the functions which have a template
 argument T which can be either a char or a ubyte can't work
 with immutable char? Glancing at _basicHttp, I don't see any
 reason why T couldn't be immutable char. Yes, it would require
 casting to immutable(char)[], but you're already casting to
 char[], and the data being returned appears to be unique such
 that it could be safely cast to immutable. That being the case,
 I'd encourage you to not only make it work with immutable char
 but to make immutable char the default instead of char.

 
 Is is indeed unique and can be cast to immutable. I'll add that
 option.
 Why do you think immutable char as the default is better than
 char? I know that the return type in that case would be string
 and not char[] - but why is that better?

strings can't be sliced with impunity, because their elements are immutable. 
With char[] and const(char)[], you have to worry about the elements changing, 
so you're often forced to duplicate the string rather than simply slice it. As 
such, string is far preferrable to char[] or const(char)[] in the general 
case. It's even better when there's a choice between them, but if you have to 
choose, you should almost always choose string (there are exceptions though - 
e.g. buffers like in std.stdio.ByLine where it reuses its char[] buffer rather 
than allocating a new one).

 Please make sure that opening braces on on their own line.
 That's the way that the rest of Phobos does it and it's one of
 the few formatting rules that we've been trying to use
 consistently throughout Phobos. For the most part, you get it
 right, but not everywhere - nested functions in particular seem
 to have braces on the same line for some reason.

 When I look through the code it seems ok with regards to braces
 on own line. Maybe by nested functions you mean delegates? The
 delegates I use do indeed have braces on same line which is ok
 afaik.

In general, just make sure that braces are on their own line unless you're 
dealing with a one line lambda or something similar. You do have nested 
functions or delegates which don't do that.
 
 It's not a big deal, but you should use auto more. For

 could.

 
 I'm a bit in doubt. On one hand it is great to make everything
 auto in order to make it easy to change type and to remove
 redundancy. One the other hand it is very convenient to be able
 to see the type when you read the code. I know that a clever
 editor would be able to figure out the type for me. But I also
 read code in normal editors for diffs etc. or on github.
 
 Anyway - I've changed as much as possible to auto now.

Just in general, it's best practice to use auto unless you can't. There _are_ 
exceptions to that rule, but that's almost always the way that it should be.


 
 auto result = get(url, isFtpUrl(url) ? FTP() : Http());

 
 Not possible. get() is a template function where the template
 parameter is connection type ie. one of FTP or HTTP.
 isFtpUrl(url) is evaluated at runtime and not compile time.

Yeah. I missed that. Stuff like that makes me wish that we had some kind of 
static ternary operator, but it would probably complicate the language too 
much. In some cases though, you might be able to use alias to fix the problem 
(static if the alias and then use the alias in the function call so that the 
entire function call isn't duplicated).

- Jonathan M Davis

Dec 17 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Sunday, 18 December 2011 at 01:27:45 UTC, Jonathan M Davis 
wrote:
 On Saturday, December 17, 2011 23:10:00 jdrewsen wrote:
 On Thursday, 15 December 2011 at 08:51:29 UTC, Jonathan M Davis
 
 wrote:

 creating an Http object, not an Ftp object? That mistake
 definitely makes it look like download hasn't been properly
 tested.

 
 It has been tested as you can see by the unittest just below 
 the
 function. It just did not fail because libcurl requires the 
 same
 setup for this download situation for both ftp and http. Will 
 fix
 it of course.

 Well, when there's code like that that is clearly wrong, it 
 makes it look like it's either not being tested by the unit 
 tests or that the unit tests haven't been run. I guess that 
 this is just a case where the unit tests are unable to catch 
 the problem.

 Is there a reason that the functions which have a template
 argument T which can be either a char or a ubyte can't work
 with immutable char? Glancing at _basicHttp, I don't see any
 reason why T couldn't be immutable char. Yes, it would 
 require
 casting to immutable(char)[], but you're already casting to
 char[], and the data being returned appears to be unique such
 that it could be safely cast to immutable. That being the 
 case,
 I'd encourage you to not only make it work with immutable 
 char
 but to make immutable char the default instead of char.

 
 Is is indeed unique and can be cast to immutable. I'll add that
 option.
 Why do you think immutable char as the default is better than
 char? I know that the return type in that case would be string
 and not char[] - but why is that better?

 strings can't be sliced with impunity, because their elements 
 are immutable. With char[] and const(char)[], you have to worry 
 about the elements changing, so you're often forced to 
 duplicate the string rather than simply slice it. As such, 
 string is far preferrable to char[] or const(char)[] in the 
 general case. It's even better when there's a choice between 
 them, but if you have to choose, you should almost always 
 choose string (there are exceptions though - e.g. buffers like 
 in std.stdio.ByLine where it reuses its char[] buffer rather 
 than allocating a new one).

 Please make sure that opening braces on on their own line.
 That's the way that the rest of Phobos does it and it's one 
 of
 the few formatting rules that we've been trying to use
 consistently throughout Phobos. For the most part, you get it
 right, but not everywhere - nested functions in particular 
 seem
 to have braces on the same line for some reason.

 When I look through the code it seems ok with regards to braces
 on own line. Maybe by nested functions you mean delegates? The
 delegates I use do indeed have braces on same line which is ok
 afaik.

 In general, just make sure that braces are on their own line 
 unless you're dealing with a one line lambda or something 
 similar. You do have nested functions or delegates which don't 
 do that.

 It's not a big deal, but you should use auto more. For

 could.

 
 I'm a bit in doubt. On one hand it is great to make everything
 auto in order to make it easy to change type and to remove
 redundancy. One the other hand it is very convenient to be able
 to see the type when you read the code. I know that a clever
 editor would be able to figure out the type for me. But I also
 read code in normal editors for diffs etc. or on github.
 
 Anyway - I've changed as much as possible to auto now.

 Just in general, it's best practice to use auto unless you 
 can't. There _are_ exceptions to that rule, but that's almost 
 always the way that it should be.


 
 auto result = get(url, isFtpUrl(url) ? FTP() : Http());

 
 Not possible. get() is a template function where the template
 parameter is connection type ie. one of FTP or HTTP.
 isFtpUrl(url) is evaluated at runtime and not compile time.

 Yeah. I missed that. Stuff like that makes me wish that we had 
 some kind of static ternary operator, but it would probably 
 complicate the language too much. In some cases though, you 
 might be able to use alias to fix the problem (static if the 
 alias and then use the alias in the function call so that the 
 entire function call isn't duplicated).

A static ternary operator wouldn't work in this case since 
isFtpUrl(url) cannot be evaluated at compile time.

/Jonas

Dec 18 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, December 18, 2011 21:49:16 jdrewsen wrote:
 A static ternary operator wouldn't work in this case since
 isFtpUrl(url) cannot be evaluated at compile time.

Ah. That would be true. Still, it's the sort of thing which begs for a ternary 
operator but which just can't use it due to technical limitations in the 
language. It may be completely impractical to try and overcome those technical 
limitations, but it's still frustrating sometimes. Still, at least we _have_ 
static if, unlike C++.

- Jonathan M Davis

Dec 18 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

Please make sure that you remove trailing whitespace from the file. A lot of 
the lines have trailing whitespace. Also, make sure that you don't have any 
tabs in the file. There are a few places where you used tabs.


so please fix it before it goes into Phobos (assuming that it's voted in).



auto arr = new Char[](bufferSize);

There's no reason to create it and then assign to its length property.


wrong on some of them, and not all of them are on their own line.

You should see if the cast(void[])'s can be removed from byLineAsync. They 
might still be necessary, but null was given a type with the last release, so 
it might work without the cast now. The same goes with byChunkAsync.


declaring an array and the assigning to its length variable. The code is 
probably slightly less efficient that way too. Just allocate a new array of the 
correct length.


Just check over your braces in general to make sure that they're consistent 
and are on their own line unless you're dealing with something like a one line 
lambda.

And as I mentioned before, all of these enforces which take a new 
CurlException, should be changed to use enforceEx after you've fixed 
CurlException's constructor.

The documentation on Protocol's isValid is wrong. It claims that isValid is 
true if the instance is stoppend and invalid. Wouldn't that mean that the 
object _wasn't_ valid?

There are quite a few places where you're concatenating several strings and 
format would make the code much cleaner (e.g. Protocol's netInterface 
function). Unless you're avoiding format in an attempt to allow functions to 
be pure (since unfortunately, format can't currently be pure), you really 
should be using format.

I'd argue that it's better to use empty than check whether a string is equal 


There bodies of Http's constructor, static opCall, and dup functions are all 
very similar - especially the constructor and static opCall. You should look 
at refactoring those so that they don't have to duplicate so much code. Maybe 
it can't be reasonably done easily, but it would certainly be better if that 
code duplication could be reduced.

Change setTimeCondition to take a SysTime. DateTime is not intended for 
timestamps. It's intended for calendar-based time, not the system time. Also, 
SysTime makes it easier to convert to time_t. In general, the fact that you're 
looking to deal with a time_t is a sign that you should be using SysTime. This 
is especially true, since DateTime is going to give the wrong time_t in 
general, since it has no time zone. As it stands, the timestamp is assumed to 
be in UTC. That's just begging for bugs. Users won't expect that. Definitely 

2017 would become

p.curl.set(CurlOption.timevalue, timestamp.toUnixTime());


~ or to!string in a call to format. It's just creating unnecessary memory 
allocations. It should be something more like

format("%s%s(%s.", code, reason, majorVersion, minorVersion);

Though I find that '(' to be rather odd, since it has no matching closing 
paren, so one may need to be added. Regardless, that line shouldn't be 
concatenating strings or use conv.to. format takes care of all of that and 
should do it with fewer memory allocations.

Pointers really should have their * next to the type not floating in space like 

C/C++, the * is clearly associated with the type. e.g.

int* a, b;

creates an int* and an int in C, but it creates int* and int* in D. Separating 
the * from the type has a tendancy to make the code harder to understand.

Ftp's consturctor and static opCall have a code duplication problem similar to 
that of Http.

Make encoding take a string. Any time that you are _always_ going to idup a 
parameter which is a character array, make it a string, _not_ const. That way, 
if it's a string, no idup is necessary, and if it's char[], then it can be 
iduped when it's passed in, and iduping only occurs when it's actually 
necessary.

In Smtp's constructor, yo ushould probably create a variable for the toLowered 
url. e.g.

auto lowered = url.toLower();

That way, you avoid having to lower it twice if the else branch of the if-else 
is taken.


needs parens. The curl module should be able to be built with -property. I 
believe that Phobos as a whole is currently being built with tha flag. If not, 
it will be soon, and this module will need to do that if it's merged into 
Phobos.

If you can, please use a static foreach with EnumMembers!CurlOption in Curl's 
dup. And if you can't, because you're not clearing all of them, then put all 
of the ones that you _are_ clearing in an array (or TypeTuple if you want to 
avoid the allocation for the array) and iterate over them with a foreach. e.g.

foreach(option; TypeTuple!(CurlOption.file, CurlOptions.writefunction))
    copy.clear(option);

You should be able to cut down on the number of lines by doing that 
(_especially_ if you're actually clearing all of them and can use 
EnumMembers!CurlOption).


doing more than one ~, consider using format unless you're trying to make a 
function pure.

I'd suggest renaming Message to something like CurlMessage. The odds of a name 
clash with Message are likely high. And while they won't be able to use 
Message, since it's private, it will still affect whether the full path for 
Message must be given (e.g. my.module.Message instead of Message). CurlMessage 
is far less likely to clash.

Rename DATA to Data. DATA does not follow Phobos' naming conventions. Granted, 
it's private, but it's completely off. Type names should be pascal cased.


more consistent with everything else if it were  a property.

In general, you should use when converting between TickDuration and Duration, 

shouldn't need to do any converting. Duration's opOpAssign should be able to 
handle a TickDuration.

Overall, the design looks fairly solid, and the code looks pretty good, but 
there are definitely a number of minor issues which should be addressed before 
this code makes it into Phobos. The biggest involve the functions' parameters 
(such as using SysTime, not DateTime, and using string in some places where 
you're using const(char)[] in order to avoid unnecessary idups). And those 
definitely need to be sorted out sooner rather than later, since they affect
the 
API and will break code if they're changed later.

- Jonathan M Davis

Dec 16 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Friday, 16 December 2011 at 09:42:50 UTC, Jonathan M Davis 
wrote:
 Please make sure that you remove trailing whitespace from the 
 file. A lot of the lines have trailing whitespace. Also, make 
 sure that you don't have any tabs in the file. There are a few 
 places where you used tabs.

ok


 needs to be fixed, so please fix it before it goes into Phobos 
 (assuming that it's voted in).

ok



 auto arr = new Char[](bufferSize);

 There's no reason to create it and then assign to its length 
 property.

ok


 indentation is wrong on some of them, and not all of them are 
 on their own line.

ok

 You should see if the cast(void[])'s can be removed from 
 byLineAsync. They might still be necessary, but null was given 
 a type with the last release, so it might work without the cast 
 now. The same goes with byChunkAsync.

Still doesn't work.


 line of code by declaring an array and the assigning to its 
 length variable. The code is probably slightly less efficient 
 that way too. Just allocate a new array of the correct length.

ok



 sure that they're consistent and are on their own line unless 
 you're dealing with something like a one line lambda.

ok

 And as I mentioned before, all of these enforces which take a 
 new CurlException, should be changed to use enforceEx after 
 you've fixed CurlException's constructor.

ok

 The documentation on Protocol's isValid is wrong. It claims 
 that isValid is true if the instance is stoppend and invalid. 
 Wouldn't that mean that the object _wasn't_ valid?

ok

 There are quite a few places where you're concatenating several 
 strings and format would make the code much cleaner (e.g. 
 Protocol's netInterface function). Unless you're avoiding 
 format in an attempt to allow functions to be pure (since 
 unfortunately, format can't currently be pure), you really 
 should be using format.

ok

 I'd argue that it's better to use empty than check whether a 
 string is equal to "". e.g. domain.empty instead of domain != 


ok

 There bodies of Http's constructor, static opCall, and dup 
 functions are all very similar - especially the constructor and 
 static opCall. You should look at refactoring those so that 
 they don't have to duplicate so much code. Maybe it can't be 
 reasonably done easily, but it would certainly be better if 
 that code duplication could be reduced.

ok

 Change setTimeCondition to take a SysTime. DateTime is not 
 intended for timestamps. It's intended for calendar-based time, 
 not the system time. Also, SysTime makes it easier to convert 
 to time_t. In general, the fact that you're looking to deal 
 with a time_t is a sign that you should be using SysTime. This 
 is especially true, since DateTime is going to give the wrong 
 time_t in general, since it has no time zone. As it stands, the 
 timestamp is assumed to be in UTC. That's just begging for 
 bugs. Users won't expect that. Definitely make it a SysTime. 

 2017 would become

 p.curl.set(CurlOption.timevalue, timestamp.toUnixTime());

ok


 There should be no ~ or to!string in a call to format. It's 
 just creating unnecessary memory allocations. It should be 
 something more like

 format("%s%s(%s.", code, reason, majorVersion, minorVersion);

 Though I find that '(' to be rather odd, since it has no 
 matching closing paren, so one may need to be added. 
 Regardless, that line shouldn't be concatenating strings or use 
 conv.to. format takes care of all of that and should do it with 
 fewer memory allocations.

ok - ')' was missing

 Pointers really should have their * next to the type not 

 like a multiplication. Unlike in C/C++, the * is clearly 
 associated with the type. e.g.

 int* a, b;

 creates an int* and an int in C, but it creates int* and int* 
 in D. Separating the * from the type has a tendancy to make the 
 code harder to understand.

If that's the D style I'll do that.

 Ftp's consturctor and static opCall have a code duplication 
 problem similar to that of Http.

ok

 Make encoding take a string. Any time that you are _always_ 
 going to idup a parameter which is a character array, make it a 
 string, _not_ const. That way, if it's a string, no idup is 
 necessary, and if it's char[], then it can be iduped when it's 
 passed in, and iduping only occurs when it's actually necessary.

ok

 In Smtp's constructor, yo ushould probably create a variable 
 for the toLowered url. e.g.

?
 auto lowered = url.toLower();

 That way, you avoid having to lower it twice if the else branch 
 of the if-else is taken.

ok


 If not, then it needs parens. The curl module should be able to 
 be built with -property. I believe that Phobos as a whole is 
 currently being built with tha flag. If not, it will be soon, 
 and this module will need to do that if it's merged into Phobos.

ok

 If you can, please use a static foreach with 
 EnumMembers!CurlOption in Curl's dup. And if you can't, because 
 you're not clearing all of them, then put all of the ones that 
 you _are_ clearing in an array (or TypeTuple if you want to 
 avoid the allocation for the array) and iterate over them with 
 a foreach. e.g.

 foreach(option; TypeTuple!(CurlOption.file, 
 CurlOptions.writefunction))
   copy.clear(option);

 You should be able to cut down on the number of lines by doing 
 that (_especially_ if you're actually clearing all of them and 
 can use EnumMembers!CurlOption).

Nice.

I actually found out that you can do:

with (CurlOption)
  foreach(option; TypeTuple!(file, writefunction))
    copy.clear(option);

Which is pretty neat when you have to list as many options as in 
the curl wrapper.


 general, if you're doing more than one ~, consider using format 
 unless you're trying to make a function pure.

ok

 I'd suggest renaming Message to something like CurlMessage. The 
 odds of a name clash with Message are likely high. And while 
 they won't be able to use Message, since it's private, it will 
 still affect whether the full path for Message must be given 
 (e.g. my.module.Message instead of Message). CurlMessage is far 
 less likely to clash.

ok

 Rename DATA to Data. DATA does not follow Phobos' naming 
 conventions. Granted, it's private, but it's completely off. 
 Type names should be pascal cased.

ok


 but it would be more consistent with everything else if it were 
 a property.

ok

 In general, you should use when converting between TickDuration 
 and Duration, but in some cases, no conversion should be 

 converting. Duration's opOpAssign should be able to handle a 
 TickDuration.

ok

 Overall, the design looks fairly solid, and the code looks 
 pretty good, but there are definitely a number of minor issues 
 which should be addressed before this code makes it into 
 Phobos. The biggest involve the functions' parameters (such as 
 using SysTime, not DateTime, and using string in some places 
 where you're using const(char)[] in order to avoid unnecessary 
 idups). And those definitely need to be sorted out sooner 
 rather than later, since they affect the API and will break 
 code if they're changed later.

 - Jonathan M Davis

Thanks for the comments
/Jonas

Dec 30 2011

"Jakob Ovrum" <jakobovrum gmail.com> writes:

The docs for onReceiveHeader,

http://freeze.steamwinter.com/D/web/phobos/etc_curl.html#onReceiveHeader

explicitly says that the const string parameters are not valid 
after the function returns. This is all well and good; but a 
minor improvement would be to change the signature to:

 property void onReceiveHeader(void delegate(in char[] key, in 
char[] value) callback);

to add `scope` to the parameters. This documents the fact that 
the constant strings shouldn't be escaped as-is in both code and 
documentation. I suspect many other instances of `const(char)[]` 
could be changed to `in char[]` to better document how they use 
the parameters (I also personally think it looks better as it is 
more concise).

Dec 17 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Saturday, 17 December 2011 at 20:01:04 UTC, Jakob Ovrum wrote:
 The docs for onReceiveHeader,

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html#onReceiveHeader

 explicitly says that the const string parameters are not valid 
 after the function returns. This is all well and good; but a 
 minor improvement would be to change the signature to:

  property void onReceiveHeader(void delegate(in char[] key, in 
 char[] value) callback);

 to add `scope` to the parameters. This documents the fact that 
 the constant strings shouldn't be escaped as-is in both code 
 and documentation. I suspect many other instances of 
 `const(char)[]` could be changed to `in char[]` to better 
 document how they use the parameters (I also personally think 
 it looks better as it is more concise).

I agree. Actually I thought that 'in' was just an alias of 
'const' but now I see that it is actually an alias of 'const 
scope'.

Dec 30 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 12/2/11 10:26 PM, dsimcha wrote:
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper. After the first round it was decided that,
 after a large number of minor issues were fixed, a second round would be
 necessary.

 Significant open issues:

 1. Should libcurl be bundled with DMD on Windows?

Yes.

 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but
 it was buried deep in a thread and a lot of people may have missed it:
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

std.net.curl

 Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
 Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

I have a few comments. They are not critical for library's admission 
(apologies I didn't make the deadline) and could be integrated before 
the commit.

1. Tables must be centered (this is a note to self)

2. Replace d-p-l.org in examples with dlang.org

3. "Compared to using libcurl directly this module provides a simpler 
API for performing common tasks." -> "Compared to using libcurl directly 
this module allows simpler client code for common uses, requires no 
unsafe operations, and integrates better with the rest of the language."

4. From the high level ops it seems there's no way to issue a PUT/POST 
and then direct the result to a file, or read the result with a foreach.

5. Also there seems to be no way to issue a PUT/POST from a range.

6. s/Examples:/Example:/

7. s/HTTP http = HTTP("www.d-p-l.org");/auto http = HTTP("www.d-p-l.org");/

8. "receives a header or a data buffer" -> "receives a header and a data 
buffer, respectively"

9. "In this simple example, the headers are writting to stdout and the 
data is ignored." How can we make it to stop the request? A sentence 
about that would be great.

10. "Finally the HTTP request is started by calling perform()." -> 
"Finally the HTTP request is effected by calling perform(), which is 
synchronous."

11. Use InferredProtocol instead of AutoConnect(ion). The type is 
exactly what it claims. If "inferred" is too long, "AutoProtocol" would 
be fine too. But it's not the connection that is automated. The word 
must not be a verb, i.e. InferProtocol/DeduceProtocol would not be good.

12. The example for InferredProtocol nee AutoConnection does not show 
any interesting example, e.g. ftp.

13. I think download() and upload() are good convenience functions but 
are very limited. Such operations can last arbitrarily long and it's 
impossible to stop them. Therefore, they will foster poor program design 
(e.g. program hangs until you kill it etc). We should keep these for 
convenience in scripts, and we already have byXxx() for downloading, but 
we don't have anything for uploading. We should offer output ranges for 
uploading sync/async. Maybe in the next minor release.

14. Parameter doc for put(), post(), del() etc. is messed up.

15. I don't think "connection" is the right term. For example we have

T[] get(Conn = AutoConnection, T = char)(const(char)[] url, Conn conn = 
Conn());

The connection is established only during the call to get(), so conn is 
not really a connection - more like a sort of a bundle of connection 
parameters. But then it does clean up the connection once established, 
so I'm unsure...

16. For someone who'd not versed in HTTP options, the example

string content = options("d-programming-language.appspot.com/testUrl2", 
"something");

isn't terribly informative.

17. In byLine: "A range of lines is returned when the request is 
complete." This suggests the last byte has been read as the client gets 
to the first, which is not the case. "Data will be read on demand as the 
foreach loop makes progress." Similar for byChunk.

18. In byXxxAsync wait(Duration) should be cross-referenced to its 
definition.

=========================

Overall: I think this is a valuable addition to Phobos, but I have the 
feeling we don't have a good scenario for interrupting connections. For 
example, if someone wants to offer a "Cancel" button, the current API 
does not give them a robust option to do so: all functions that transfer 
data are potentially blocking, and there's no thread-shared way to 
interrupt a connection in a thread from another thread.

Such functionality may be a pure addition later, so I think we can 
commit this as is. Please use std.net.curl.


Thanks,

Andrei

Dec 17 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Saturday, 17 December 2011 at 22:25:07 UTC, Andrei 
Alexandrescu wrote:
 On 12/2/11 10:26 PM, dsimcha wrote:
 I volunteered ages ago to manage the review for the second 
 round of
 Jonas Drewsen's CURL wrapper. After the first round it was 
 decided that,
 after a large number of minor issues were fixed, a second 
 round would be
 necessary.

 Significant open issues:

 1. Should libcurl be bundled with DMD on Windows?

 Yes.

 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while 
 back but
 it was buried deep in a thread and a lot of people may have 
 missed it:
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 std.net.curl

ok

 Code: 
 https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
 Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 I have a few comments. They are not critical for library's 
 admission (apologies I didn't make the deadline) and could be 
 integrated before the commit.

 1. Tables must be centered (this is a note to self)

 2. Replace d-p-l.org in examples with dlang.org

ok

 3. "Compared to using libcurl directly this module provides a 
 simpler API for performing common tasks." -> "Compared to using 
 libcurl directly this module allows simpler client code for 
 common uses, requires no unsafe operations, and integrates 
 better with the rest of the language."

ok

 4. From the high level ops it seems there's no way to issue a 
 PUT/POST and then direct the result to a file, or read the 
 result with a foreach.

Use byLine/Chunk with a connection initialized with post/put data:

auto http = HTTP();
http.postData = "hello world";
foreach (line; byLine("dlang.org/there", KeepTerminator.no, '\n', 
http))
   writeln(line);

Same deal for directing to file using download()

 5. Also there seems to be no way to issue a PUT/POST from a 
 range.

Correct. It will be a later extension.

 6. s/Examples:/Example:/

ok

 7. s/HTTP http = HTTP("www.d-p-l.org");/auto http = 
 HTTP("www.d-p-l.org");/

ok

 8. "receives a header or a data buffer" -> "receives a header 
 and a data buffer, respectively"

ok

 9. "In this simple example, the headers are writting to stdout 
 and the data is ignored." How can we make it to stop the 
 request? A sentence about that would be great.

ok

 10. "Finally the HTTP request is started by calling perform()." 
 -> "Finally the HTTP request is effected by calling perform(), 
 which is synchronous."

ok

 11. Use InferredProtocol instead of AutoConnect(ion). The type 
 is exactly what it claims. If "inferred" is too long, 
 "AutoProtocol" would be fine too. But it's not the connection 
 that is automated. The word must not be a verb, i.e. 
 InferProtocol/DeduceProtocol would not be good.

ok

 12. The example for InferredProtocol nee AutoConnection does 
 not show any interesting example, e.g. ftp.

improved

 13. I think download() and upload() are good convenience 
 functions but are very limited. Such operations can last 
 arbitrarily long and it's impossible to stop them. Therefore, 
 they will foster poor program design (e.g. program hangs until 
 you kill it etc). We should keep these for convenience in 
 scripts, and we already have byXxx() for downloading, but we 
 don't have anything for uploading. We should offer output 
 ranges for uploading sync/async. Maybe in the next minor 
 release.

Using the same method as for post it is possible to set timeouts.

auto http = HTTP();
http.dataTimeout = dur!"minutes"(2);
foreach (line; byLine("dlang.org/there", KeepTerminator.no, '\n', 
http))
   writeln(line);

Regarding uploading ranges this will be handled by the same 
extension as
for POST/PUT output ranges.

 14. Parameter doc for put(), post(), del() etc. is messed up.

ok

 15. I don't think "connection" is the right term. For example 
 we have

 T[] get(Conn = AutoConnection, T = char)(const(char)[] url, 
 Conn conn = Conn());

 The connection is established only during the call to get(), so 
 conn is not really a connection - more like a sort of a bundle 
 of connection parameters. But then it does clean up the 
 connection once established, so I'm unsure...

But you can provide an existing connection to use has that is 
already connected to the server. Hence the name connection.


 16. For someone who'd not versed in HTTP options, the example

 string content = 
 options("d-programming-language.appspot.com/testUrl2", 
 "something");

 isn't terribly informative.

ok - improved

 17. In byLine: "A range of lines is returned when the request 
 is complete." This suggests the last byte has been read as the 
 client gets to the first, which is not the case. "Data will be 
 read on demand as the foreach loop makes progress." Similar for 
 byChunk.

This will change when multi API is supported as well.

 18. In byXxxAsync wait(Duration) should be cross-referenced to 
 its definition.

ok

 =========================

 Overall: I think this is a valuable addition to Phobos, but I 
 have the feeling we don't have a good scenario for interrupting 
 connections. For example, if someone wants to offer a "Cancel" 
 button, the current API does not give them a robust option to 
 do so: all functions that transfer data are potentially 
 blocking, and there's no thread-shared way to interrupt a 
 connection in a thread from another thread.

 Such functionality may be a pure addition later, so I think we 
 can commit this as is. Please use std.net.curl.

They can abort from the callback handlers - but otherwise it is a 
limitation of the libcurl easy API.
To actually do this the multi extension would have to be added.

Thank you for the comments and sorry for the late reply.

/Jonas

Dec 30 2011

Somedude <lovelydear mailmetrash.com> writes:

Le 03/12/2011 05:26, dsimcha a �crit :
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper.  After the first round it was decided
 that, after a large number of minor issues were fixed, a second round
 would be necessary.
 
 Significant open issues:
 
 1.  Should libcurl be bundled with DMD on Windows?
 
 2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a while back
 but it was buried deep in a thread and a lot of people may have missed
 it:  http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
 
 Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
 Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
 
 For those of you on Windows, a libcurl binary built by DMC is available
 at http://gool.googlecode.com/files/libcurl_7.21.7.zip.
 
 Review starts now and ends on December 16, followed by one week of voting.

A bit late, but following Andrei's important remark that some operations
are blocking, I also noticed that curl-multi API is not covered.

http://curl.haxx.se/libcurl/c/libcurl-multi.html

Is there a reason why ? Would it break the current API to add it later ?

Dec 19 2011

Somedude <lovelydear mailmetrash.com> writes:

Le 19/12/2011 19:05, Somedude a �crit :
 Le 03/12/2011 05:26, dsimcha a �crit :
 I volunteered ages ago to manage the review for the second round of
 Jonas Drewsen's CURL wrapper.  After the first round it was decided
 that, after a large number of minor issues were fixed, a second round
 would be necessary.

 Significant open issues:

 1.  Should libcurl be bundled with DMD on Windows?

 2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a while back
 but it was buried deep in a thread and a lot of people may have missed
 it:  http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )

 Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
 Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 For those of you on Windows, a libcurl binary built by DMC is available
 at http://gool.googlecode.com/files/libcurl_7.21.7.zip.

 Review starts now and ends on December 16, followed by one week of voting.

 
 A bit late, but following Andrei's important remark that some operations
 are blocking, I also noticed that curl-multi API is not covered.
 
 http://curl.haxx.se/libcurl/c/libcurl-multi.html
 
 Is there a reason why ? Would it break the current API to add it later ?
 

It looks like curl_multi_perform's behavior is more interesting than
curl_easy_perform. *
From the tutorial:

http://curl.haxx.se/libcurl/c/libcurl-tutorial.html

"curl_multi_perform(3) is asynchronous. It will only execute as little
as possible and then return back control to your program. It is designed
to never block. If it returns CURLM_CALL_MULTI_PERFORM you better call
it again soon, as that is a signal that it still has local data to send
or remote data to receive. "

Dec 19 2011

"jdrewsen" <jdrewsen nospam.com> writes:

On Monday, 19 December 2011 at 18:14:29 UTC, Somedude wrote:
 Le 19/12/2011 19:05, Somedude a écrit :
 Le 03/12/2011 05:26, dsimcha a écrit :
 I volunteered ages ago to manage the review for the second 
 round of
 Jonas Drewsen's CURL wrapper.  After the first round it was 
 decided
 that, after a large number of minor issues were fixed, a 
 second round
 would be necessary.

 Significant open issues:

 1.  Should libcurl be bundled with DMD on Windows?

 2.  etc.curl, std.curl, or std.net.curl?  (We had a vote a 
 while back
 but it was buried deep in a thread and a lot of people may 
 have missed
 it:  
 http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab 
 )

 Code: 
 https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
 Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 For those of you on Windows, a libcurl binary built by DMC is 
 available
 at http://gool.googlecode.com/files/libcurl_7.21.7.zip.

 Review starts now and ends on December 16, followed by one 
 week of voting.

 
 A bit late, but following Andrei's important remark that some 
 operations
 are blocking, I also noticed that curl-multi API is not 
 covered.
 
 http://curl.haxx.se/libcurl/c/libcurl-multi.html
 
 Is there a reason why ? Would it break the current API to add 
 it later ?
 

 It looks like curl_multi_perform's behavior is more interesting 
 than
 curl_easy_perform. *
 From the tutorial:

 http://curl.haxx.se/libcurl/c/libcurl-tutorial.html

 "curl_multi_perform(3) is asynchronous. It will only execute as 
 little
 as possible and then return back control to your program. It is 
 designed
 to never block. If it returns CURLM_CALL_MULTI_PERFORM you 
 better call
 it again soon, as that is a signal that it still has local data 
 to send
 or remote data to receive. "

I am aware of the multi_perform API. The multi API actually uses 
easy handles and therefore it would be natural to extend the 
wrapper with multi support later on.

It will allow for read() and write() methods on the HTTP/FTP 
structs as well.

For now I want to limit the scope and stick to the easy handles.

/Jonas

Dec 30 2011

D Programming

C/C++ Programming

Other

digitalmars.D - Second Round CURL Wrapper Review