digitalmars.D - Curl wrapper

jdrewsen (7/7) May 16 2011 Hi,

dsimcha (14/21) May 16 2011 This looks very good and I'd definitely be in favor of including it in t...

Jimmy Cao (2/2) May 16 2011 It looks nice. I plan to use this soon.
Daniel Gibson (3/24) May 16 2011 Maybe it's common sense, but it doesn't hurt to have that information.
Jonas Drewsen (9/30) May 17 2011 I included it because the libcurl handle used in a HTTP instance is not

Andrei Alexandrescu (54/61) May 16 2011 Now that's one sight for sore eyes. Nice work. A few comments and nits:

Vladimir Panteleev (7/8) May 16 2011 I was the one who suggested using void[]. If data on wire is ubyte[], wh...

Jonas Drewsen (3/8) May 17 2011 I believe we decided on ubyte[] for reading and void[] for write in an
Andrei Alexandrescu (3/8) May 17 2011 Bug in std.file.

Vladimir Panteleev (10/19) May 17 2011 I remember suggesting a long time ago that all types that do not have

Andrei Alexandrescu (6/23) May 17 2011 Yes. The problem is that converting a type implicitly to ubyte[] allows

Vladimir Panteleev (7/31) May 17 2011 Ah, of course.

Andrei Alexandrescu (4/34) May 17 2011 Yah, const(ubyte)[] is much more reasonable. We may as well add a

Andrej Mitrovic (3/5) May 17 2011 Would this mean that e.g. you could save the state of an object to

Andrei Alexandrescu (4/9) May 17 2011 If it embeds no pointers, sure. Otherwise, a more involved serialization...

KennyTM~ (3/13) May 17 2011 I suppose that applies to std.mmfile too? MmFile.opSlice() currently

Jonas Drewsen (46/110) May 17 2011 The code actually defined the Http/Ftp classes as final. But somehow it

Andrei Alexandrescu (31/83) May 17 2011 Thanks for the response! A few more answers and comments within

jdrewsen (19/66) May 17 2011 I guess I should just remove that comment because other defaults has

Johannes Pfau (16/73) May 18 2011 I'm not sure if it'd be that easy. Using select to implement a blocking

Jonas Drewsen (25/105) May 18 2011 _Doable_ - not easy :)

Johannes Pfau (17/127) May 18 2011 I think it's not necessary to do multiple calls if the first didn't

jdrewsen (2/139) May 18 2011

Andrei Alexandrescu (22/27) May 18 2011 Perhaps this would be too complicated. In any case the core

jdrewsen (34/59) May 18 2011 If byChunk is using a hidden thread to download into buffers, then how

Andrei Alexandrescu (16/68) May 18 2011 Sorry, byChunkAsync and byLineAsync (which I wrongly denoted as

jdrewsen (5/77) May 19 2011 It see your point. By buffering data asynchronously the reads and writes...

Masahiro Nakagawa (12/34) May 23 2011 Good!

jdrewsen (8/48) May 23 2011 If you have any code for inspiration or that could be used as a base I

Masahiro Nakagawa (6/55) May 29 2011 My old design is based on Boost/Asio. After I rewrote std.socket(includi...

Jimmy Cao (4/4) May 29 2011 Is there any reason why this code (using SMTP + Curl):

jdrewsen (6/10) May 30 2011 etc.curl is currently a moving target. The last version I pushed to

Jimmy Cao (2/14) May 30 2011 Uncomment line 19 and it works.

Robert Jacques (93/100) May 16 2011 I think the API looks good with regard to the core functionality it

Jonas Drewsen (56/160) May 17 2011 Please see my reply to Andrei

Robert Jacques (28/83) May 17 2011 [snip]

jdrewsen (3/94) May 17 2011 I think I'll go for the toString() version.

Jacob Carlborg (4/11) May 17 2011 I like the API, seems simple.
bls (16/23) May 23 2011 You really think this is good for your mental health ?

jdrewsen <jdrewsen nospam.com> writes:

Hi,

    I've been working on a wrapper for the etc.c.curl module. It is now 
pretty stable and I would very much like some feedback on the API.

http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

BTW I use template mixins which doesn't seem to get included in the 
generated docs. Is there any way I can make this work?

/Jonas

May 16 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from jdrewsen (jdrewsen nospam.com)'s article
 Hi,
     I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.
 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?
 /Jonas

This looks very good and I'd definitely be in favor of including it in the next
release.  However, there are a few small issues:

1.  In the docs for Http, I don't think you need to say "Do not use the same
instance in two threads simultaneously."  I think this is pretty much common
sense.  Same with the Ftp class.

2.  A short overview of the asynch stuff at the top of the file would be nice.
I'm a little confused about what it's for and when to use it instead of the
regular synchronous functions.  Does this allow you to process the beginning of
a
request while the end is still being received?

3.  I don't think you need to say how all the convenience functions are
implemented, i.e. "Internally this is implemented using an instance of the Http
class."  This is both likely to be assumed and an irrelevant implementation
detail.

Overall, nice work.

May 16 2011

Jimmy Cao <jcao219 gmail.com> writes:

It looks nice.  I plan to use this soon.
I hope it can be in phobos soon.

May 16 2011

Daniel Gibson <metalcaedes gmail.com> writes:

Am 17.05.2011 02:43, schrieb dsimcha:
 == Quote from jdrewsen (jdrewsen nospam.com)'s article
 Hi,
      I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.
 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?
 /Jonas

 This looks very good and I'd definitely be in favor of including it in the next
 release.  However, there are a few small issues:

 1.  In the docs for Http, I don't think you need to say "Do not use the same
 instance in two threads simultaneously."  I think this is pretty much common
 sense.  Same with the Ftp class.

Maybe it's common sense, but it doesn't hurt to have that information. 
Someone will certainly try that and ask why it gives strange results ;)

 2.  A short overview of the asynch stuff at the top of the file would be nice.
 I'm a little confused about what it's for and when to use it instead of the
 regular synchronous functions.  Does this allow you to process the beginning
of a
 request while the end is still being received?

 3.  I don't think you need to say how all the convenience functions are
 implemented, i.e. "Internally this is implemented using an instance of the Http
 class."  This is both likely to be assumed and an irrelevant implementation
detail.

 Overall, nice work.

May 16 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 17/05/11 02.43, dsimcha wrote:
 == Quote from jdrewsen (jdrewsen nospam.com)'s article
 Hi,
      I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.
 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?
 /Jonas

 This looks very good and I'd definitely be in favor of including it in the next
 release.  However, there are a few small issues:

 1.  In the docs for Http, I don't think you need to say "Do not use the same
 instance in two threads simultaneously."  I think this is pretty much common
 sense.  Same with the Ftp class.

I included it because the libcurl handle used in a HTTP instance is not 
thread safe itself.

What if I instantiated a shared instance of this class. Wouldn't that 
make the comment relevant in that access to the internal libcurl handle 
is not protected by mutexes?

 2.  A short overview of the asynch stuff at the top of the file would be nice.
 I'm a little confused about what it's for and when to use it instead of the
 regular synchronous functions.  Does this allow you to process the beginning
of a
 request while the end is still being received?

Exactly. I'll add the missing overview.

 3.  I don't think you need to say how all the convenience functions are
 implemented, i.e. "Internally this is implemented using an instance of the Http
 class."  This is both likely to be assumed and an irrelevant implementation
detail.

You're right.

 Overall, nice work.

Thank you for the feedback!

May 17 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 05/16/2011 04:07 PM, jdrewsen wrote:
 Hi,

 I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?

 /Jonas

Now that's one sight for sore eyes. Nice work. A few comments and nits:

1. Http can't be a class. Network resources are the poster child of 
expensive resources that are NOT memory and don't work well with garbage 
collection. RAII is the poster child of deterministic resource 
management. So...

2. Furthermore I see customization via delegates (onReceive etc). That 
should be a good proxy for customization instead of using inheritance.

3. Data on wire is ubyte[], not void[].

4. Shouldn't onSend take a ref to its argument in case it wants to 
expand it?

5. Why HttpMethod and not Http.Method?

6. "Do not use the same instance of this class in two threads 
simultanously." -> remove. A D program isn't supposed to do that unless 
you offer shared methods.

7. HttpStatusLine -> Http.StatusLine. One up for encapsulation.

8. HttpMethod -> Http.Method. Ditto.

9. "this(in const(char)[] url);" -> "this(string url);" I assume you 
store a copy of the URL inside by calling e.g. to!string. If the client 
has a string, you refuse to tap into that nice information and share the 
string safely and efficiently, and make a copy of it for no good reason. 
If the client doesn't have a string, no worry, the compiler will tell 
her she needs to do a copy.

10. addHeader etc. Same comment. Use string whenever you want to keep a 
copy anyway.

11. setTimeCondition -> use core.Duration for time representation 
throughout.

12. AsyncHttpResult -> Http.AsyncResult :o)

13. HttpResult -> Http.Result

14. Isn't the max redirect configurable via a parameter?

15. Typo: "Callbacks is not supported..."

16. Documentation should point to descriptions of the HTTP methods 
wrapped (e.g. "post", "del" etc).

17. There should be an example with login/password.

18. See onReceiveHeader">std.curl.Curl.onReceiveHeader and others look 
like doc macros gone wrong.

19. "chuncked" -> "chunked"

20. "Max allowed redirs. -1 for infinite." -> use uint and uint.max for 
infinite.

21. "short isRunning()" -> what's the semantics of the short?

22. byLine/byChunk should not expose a string or generally data that can 
be shared safely by the client. That's because it would need to create a 
new string with each iteration, which is very inefficient. Instead, they 
should expose char[]/ubyte[] just like File.byLine does, and reuse the 
buffer with each call. Essentially byLine/byChunk should be 
near-zero-overhead means to transfer and inspect arbitrarily large streams.

23. Ftp should have get in-memory and streaming byLine/byChunk, not only 
get to file.

24. The shutdown mechanism should be handled properly. Shutting down 
libcurl would have all pending transfers instantly throw a special 
exception. Without a shutdown API, applications won't be able to 
implement e.g. a Quit button.

Again, great work!


Andrei

May 16 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 3. Data on wire is ubyte[], not void[].

I was the one who suggested using void[]. If data on wire is ubyte[], why  
is data on disk void[] (according to std.file.read/write)?

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

May 16 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 17/05/11 07.23, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 3. Data on wire is ubyte[], not void[].

 I was the one who suggested using void[]. If data on wire is ubyte[],
 why is data on disk void[] (according to std.file.read/write)?

I believe we decided on ubyte[] for reading and void[] for write in an 
earlier thread. And std.file should be fixed.

May 17 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/17/11 12:23 AM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 3. Data on wire is ubyte[], not void[].

 I was the one who suggested using void[]. If data on wire is ubyte[],
 why is data on disk void[] (according to std.file.read/write)?

Bug in std.file.

Andrei

May 17 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tue, 17 May 2011 17:45:00 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/17/11 12:23 AM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 3. Data on wire is ubyte[], not void[].

 I was the one who suggested using void[]. If data on wire is ubyte[],
 why is data on disk void[] (according to std.file.read/write)?

 Bug in std.file.

I remember suggesting a long time ago that all types that do not have  
pointers be implicitly convertible to ubyte[]. That way, write functions  
can safely use ubyte[] without casts (and even provide a bit of safety  
against accidentally writing pointers to disk or network). Are you still  
against the idea?

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

May 17 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/17/11 12:39 PM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 17:45:00 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/17/11 12:23 AM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 3. Data on wire is ubyte[], not void[].

 I was the one who suggested using void[]. If data on wire is ubyte[],
 why is data on disk void[] (according to std.file.read/write)?

 Bug in std.file.

 I remember suggesting a long time ago that all types that do not have
 pointers be implicitly convertible to ubyte[]. That way, write functions
 can safely use ubyte[] without casts (and even provide a bit of safety
 against accidentally writing pointers to disk or network). Are you still
 against the idea?

Yes. The problem is that converting a type implicitly to ubyte[] allows 
anyone to write raw bytes into an object, making a hash of any type 
system guarantees without a cast in sight. That is, granted, memory-safe 
(because there are no indirections), but is not recommendable.

Andrei

May 17 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Tue, 17 May 2011 20:43:11 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/17/11 12:39 PM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 17:45:00 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/17/11 12:23 AM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 3. Data on wire is ubyte[], not void[].

 I was the one who suggested using void[]. If data on wire is ubyte[],
 why is data on disk void[] (according to std.file.read/write)?

 Bug in std.file.

 I remember suggesting a long time ago that all types that do not have
 pointers be implicitly convertible to ubyte[]. That way, write functions
 can safely use ubyte[] without casts (and even provide a bit of safety
 against accidentally writing pointers to disk or network). Are you still
 against the idea?

 Yes. The problem is that converting a type implicitly to ubyte[] allows  
 anyone to write raw bytes into an object, making a hash of any type  
 system guarantees without a cast in sight. That is, granted, memory-safe  
 (because there are no indirections), but is not recommendable.

Ah, of course.

const(ubyte)[] ? :)

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

May 17 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/17/11 1:00 PM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 20:43:11 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/17/11 12:39 PM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 17:45:00 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/17/11 12:23 AM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 3. Data on wire is ubyte[], not void[].

 I was the one who suggested using void[]. If data on wire is ubyte[],
 why is data on disk void[] (according to std.file.read/write)?

 Bug in std.file.

 I remember suggesting a long time ago that all types that do not have
 pointers be implicitly convertible to ubyte[]. That way, write functions
 can safely use ubyte[] without casts (and even provide a bit of safety
 against accidentally writing pointers to disk or network). Are you still
 against the idea?

 Yes. The problem is that converting a type implicitly to ubyte[]
 allows anyone to write raw bytes into an object, making a hash of any
 type system guarantees without a cast in sight. That is, granted,
 memory-safe (because there are no indirections), but is not
 recommendable.

 Ah, of course.

 const(ubyte)[] ? :)

Yah, const(ubyte)[] is much more reasonable. We may as well add a 
function representation() that returns it for any object.

Andrei

May 17 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 5/17/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Yah, const(ubyte)[] is much more reasonable. We may as well add a
 function representation() that returns it for any object.

Would this mean that e.g. you could save the state of an object to
disk via repr(), and then retrieve the state at a later point?

May 17 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/17/11 1:35 PM, Andrej Mitrovic wrote:
 On 5/17/11, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 Yah, const(ubyte)[] is much more reasonable. We may as well add a
 function representation() that returns it for any object.

 Would this mean that e.g. you could save the state of an object to
 disk via repr(), and then retrieve the state at a later point?

If it embeds no pointers, sure. Otherwise, a more involved serialization 
mechanism must be used.

Andrei

May 17 2011

KennyTM~ <kennytm gmail.com> writes:

On May 17, 11 22:45, Andrei Alexandrescu wrote:
 On 5/17/11 12:23 AM, Vladimir Panteleev wrote:
 On Tue, 17 May 2011 07:43:06 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 3. Data on wire is ubyte[], not void[].

 I was the one who suggested using void[]. If data on wire is ubyte[],
 why is data on disk void[] (according to std.file.read/write)?

 Bug in std.file.

 Andrei

I suppose that applies to std.mmfile too? MmFile.opSlice() currently 
returns a void[].

May 17 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 17/05/11 06.43, Andrei Alexandrescu wrote:
 On 05/16/2011 04:07 PM, jdrewsen wrote:
 Hi,

 I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?

 /Jonas

 Now that's one sight for sore eyes. Nice work. A few comments and nits:

 1. Http can't be a class. Network resources are the poster child of
 expensive resources that are NOT memory and don't work well with garbage
 collection. RAII is the poster child of deterministic resource
 management. So...

I see what you mean. Will fix.

 2. Furthermore I see customization via delegates (onReceive etc). That
 should be a good proxy for customization instead of using inheritance.

The code actually defined the Http/Ftp classes as final. But somehow it 
is not reflected in the generated docs. Anyway, I will change it to 
struct and inheritance is not an issue anyway.


 3. Data on wire is ubyte[], not void[].

Data received is ubyte[] and a convenience method for conversion to 
string is provided.

Data send is void[] since all arrays implicitely converts to void[]. 
This makes e.g [1,2,3,4,5] works as input. Convenience methods are 
provided for chars in order to set HTTP headers correctly.

For onSend callback it is simply void[] in order to use it as an out 
parameter.

For Http.post/put it is const(void)[]
For Http.postAsync/putAsync it is immutable(void)[] because internally I 
need to send the data to another thread using message passing.

Hopes this makes sense?


 4. Shouldn't onSend take a ref to its argument in case it wants to
 expand it?

It is not allowed to expand it. The callback originates from libcurl 
which determines the size of the array.

 5. Why HttpMethod and not Http.Method?

I had the thought myself but decided on the former. By looking at the 
replies to my post it seems that it should be changed. I'll do that.

 6. "Do not use the same instance of this class in two threads
 simultanously." -> remove. A D program isn't supposed to do that unless
 you offer shared methods.

Ok. So even though I declare:

shared Http http = new Http();

I cannot have multiple threads call http.perform without that method 
being marked as shared as well?

If thats the case I'll remove the comment.

 7. HttpStatusLine -> Http.StatusLine. One up for encapsulation.

 8. HttpMethod -> Http.Method. Ditto.

 9. "this(in const(char)[] url);" -> "this(string url);" I assume you
 store a copy of the URL inside by calling e.g. to!string. If the client
 has a string, you refuse to tap into that nice information and share the
 string safely and efficiently, and make a copy of it for no good reason.
 If the client doesn't have a string, no worry, the compiler will tell
 her she needs to do a copy.

I do not store a copy but calls libcurl which in turn does a copy by 
itself. So I believe the current signature is appropriate?

 10. addHeader etc. Same comment. Use string whenever you want to keep a
 copy anyway.

At 9. the string is copied by libcurl itself.

 11. setTimeCondition -> use core.Duration for time representation
 throughout.

Will fix.

 12. AsyncHttpResult -> Http.AsyncResult :o)

 13. HttpResult -> Http.Result

 14. Isn't the max redirect configurable via a parameter?

Yes. It is called Http.followLocation from libcurls followLocation 
option. I will rename it to maxRedirections for clearity.

 15. Typo: "Callbacks is not supported..."

ok.

 16. Documentation should point to descriptions of the HTTP methods
 wrapped (e.g. "post", "del" etc).

Do you mean point to the relevant RFC?

 17. There should be an example with login/password.

ok.

 18. See onReceiveHeader">std.curl.Curl.onReceiveHeader and others look
 like doc macros gone wrong.

 19. "chuncked" -> "chunked"

 20. "Max allowed redirs. -1 for infinite." -> use uint and uint.max for
 infinite.

Will fix.

 21. "short isRunning()" -> what's the semantics of the short?

This is a bug. It returns an internal bool value.

 22. byLine/byChunk should not expose a string or generally data that can
 be shared safely by the client. That's because it would need to create a
 new string with each iteration, which is very inefficient. Instead, they
 should expose char[]/ubyte[] just like File.byLine does, and reuse the
 buffer with each call. Essentially byLine/byChunk should be
 near-zero-overhead means to transfer and inspect arbitrarily large streams.

byLine/byChunk is currenly only available for the async methods which is 
implemented using message passing. This means that for passing the 
message a copy has to be made because a message is by-value or immutable 
data. Is there another way for me to pass the message without doing a 
copy... some kind of move semantic?

 23. Ftp should have get in-memory and streaming byLine/byChunk, not only
 get to file.

Definitely. I wanted to get feedback on HTTP API before doing the same 
to FTP.

 24. The shutdown mechanism should be handled properly. Shutting down
 libcurl would have all pending transfers instantly throw a special
 exception. Without a shutdown API, applications won't be able to
 implement e.g. a Quit button.

If you're only using this API this should be handled. But I see that it 
is not documented.

 Again, great work!

Thank you for the feedback.

May 17 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Thanks for the response! A few more answers and comments within 
(everything deleted counts as "sounds great").

On 5/17/11 3:50 AM, Jonas Drewsen wrote:
 3. Data on wire is ubyte[], not void[].

 Data received is ubyte[] and a convenience method for conversion to
 string is provided.

 Data send is void[] since all arrays implicitely converts to void[].
 This makes e.g [1,2,3,4,5] works as input. Convenience methods are
 provided for chars in order to set HTTP headers correctly.

 For onSend callback it is simply void[] in order to use it as an out
 parameter.
 For Http.post/put it is const(void)[]
 For Http.postAsync/putAsync it is immutable(void)[] because internally I
 need to send the data to another thread using message passing.

 Hopes this makes sense?

I think it's best to systematically use void[] when sending data out 
(i.e. you start with structured data in memory and you let it implicitly 
"decay" as you pass it on the wire), and ubyte[] when receiving data 
(i.e. you start with unstructured data and force the user to explicitly 
build structure on it). As noted the current File API is not to be 
considered an example to follow.

 6. "Do not use the same instance of this class in two threads
 simultanously." -> remove. A D program isn't supposed to do that unless
 you offer shared methods.

 Ok. So even though I declare:

 shared Http http = new Http();

 I cannot have multiple threads call http.perform without that method
 being marked as shared as well?

 If thats the case I'll remove the comment.

Yes, shared limits access to only shared fields and methods.

 9. "this(in const(char)[] url);" -> "this(string url);" I assume you
 store a copy of the URL inside by calling e.g. to!string. If the client
 has a string, you refuse to tap into that nice information and share the
 string safely and efficiently, and make a copy of it for no good reason.
 If the client doesn't have a string, no worry, the compiler will tell
 her she needs to do a copy.

 I do not store a copy but calls libcurl which in turn does a copy by
 itself. So I believe the current signature is appropriate?

Got it, thanks. I sense a little inefficiency here that we can't quite 
get rid of - the user passes a string, we must convert it to a 
zero-terminated char* (one allocation) and then libcurl copies it again 
(two allocations). Oh well.

 14. Isn't the max redirect configurable via a parameter?

 Yes. It is called Http.followLocation from libcurls followLocation
 option. I will rename it to maxRedirections for clearity.

You mention this about many higher-level functions: "Maximum 
redirections are set to 10." Then I'm thinking, if it's an important 
parameter, make it a defaulted parameter for the function.

 16. Documentation should point to descriptions of the HTTP methods
 wrapped (e.g. "post", "del" etc).

 Do you mean point to the relevant RFC?

Yah, or a good tutorial. (I didn't know DEL existed!)

 22. byLine/byChunk should not expose a string or generally data that can
 be shared safely by the client. That's because it would need to create a
 new string with each iteration, which is very inefficient. Instead, they
 should expose char[]/ubyte[] just like File.byLine does, and reuse the
 buffer with each call. Essentially byLine/byChunk should be
 near-zero-overhead means to transfer and inspect arbitrarily large
 streams.

 byLine/byChunk is currenly only available for the async methods which is
 implemented using message passing. This means that for passing the
 message a copy has to be made because a message is by-value or immutable
 data. Is there another way for me to pass the message without doing a
 copy... some kind of move semantic?

A library element is planned but not on the short list yet. You can work 
around that by internally doing a cast. As long as you observe the 
commitment that buffers are not accessed simultaneously by more than one 
thread you're well within defined behavior.

One more suggestion - you may want to also provide classic boring 
synchronous read/write methods that take a user-provided buffer. I'm 
sure people could use such e.g. when they want to stream data inside 
their own threads, or even in the main thread if they don't mind blocking.

 24. The shutdown mechanism should be handled properly. Shutting down
 libcurl would have all pending transfers instantly throw a special
 exception. Without a shutdown API, applications won't be able to
 implement e.g. a Quit button.

 If you're only using this API this should be handled. But I see that it
 is not documented.

Yah, in the future we need to put an onShutdown protocol inside core, 
similar to atexit() in C. Each library that may block would plant a 
hook. That way an application will be able to exit quickly and 
gracefully even if it has blocked threads.


Andrei

May 17 2011

jdrewsen <jdrewsen nospam.com> writes:

Please see comments below.

Den 17-05-2011 16:42, Andrei Alexandrescu skrev:
 Thanks for the response! A few more answers and comments within
 (everything deleted counts as "sounds great").

 On 5/17/11 3:50 AM, Jonas Drewsen wrote:
 14. Isn't the max redirect configurable via a parameter?

 Yes. It is called Http.followLocation from libcurls followLocation
 option. I will rename it to maxRedirections for clearity.

 You mention this about many higher-level functions: "Maximum
 redirections are set to 10." Then I'm thinking, if it's an important
 parameter, make it a defaulted parameter for the function.

I guess I should just remove that comment because other defaults has 
been selected as well e.g. timeouts.

 16. Documentation should point to descriptions of the HTTP methods
 wrapped (e.g. "post", "del" etc).

 Do you mean point to the relevant RFC?

 Yah, or a good tutorial. (I didn't know DEL existed!)

Well DEL is actually called DELETE in the HTTP RFC. But delete is a 
reserved word i D so I used del() instead. delete() wouldn't work in 
following code in think:

with(auto http = ...) {
	delete(...);
}

 22. byLine/byChunk should not expose a string or generally data that can
 be shared safely by the client. That's because it would need to create a
 new string with each iteration, which is very inefficient. Instead, they
 should expose char[]/ubyte[] just like File.byLine does, and reuse the
 buffer with each call. Essentially byLine/byChunk should be
 near-zero-overhead means to transfer and inspect arbitrarily large
 streams.

 byLine/byChunk is currenly only available for the async methods which is
 implemented using message passing. This means that for passing the
 message a copy has to be made because a message is by-value or immutable
 data. Is there another way for me to pass the message without doing a
 copy... some kind of move semantic?

 A library element is planned but not on the short list yet. You can work
 around that by internally doing a cast. As long as you observe the
 commitment that buffers are not accessed simultaneously by more than one
 thread you're well within defined behavior.

ok nice to know.

 One more suggestion - you may want to also provide classic boring
 synchronous read/write methods that take a user-provided buffer. I'm
 sure people could use such e.g. when they want to stream data inside
 their own threads, or even in the main thread if they don't mind blocking.

This would mean hooking into libcurl and selecting on its sockets. 
Totally doable. But I would rather stop here feature wise and wrap this 
thing up. I would like to focus my efforts on a candidate for 
std.net.event/reactor/proactor module.

Anyone who wants to implement this read/write in the curl wrapper are 
ofcourse very welcome.


 24. The shutdown mechanism should be handled properly. Shutting down
 libcurl would have all pending transfers instantly throw a special
 exception. Without a shutdown API, applications won't be able to
 implement e.g. a Quit button.

 If you're only using this API this should be handled. But I see that it
 is not documented.

 Yah, in the future we need to put an onShutdown protocol inside core,
 similar to atexit() in C. Each library that may block would plant a
 hook. That way an application will be able to exit quickly and
 gracefully even if it has blocked threads.

That would be neat.

/Jonas

May 17 2011

Johannes Pfau <spam example.com> writes:

jdrewsen wrote:
Please see comments below.

Den 17-05-2011 16:42, Andrei Alexandrescu skrev:
 Thanks for the response! A few more answers and comments within
 (everything deleted counts as "sounds great").

 On 5/17/11 3:50 AM, Jonas Drewsen wrote:
 14. Isn't the max redirect configurable via a parameter?

 Yes. It is called Http.followLocation from libcurls followLocation
 option. I will rename it to maxRedirections for clearity.

 You mention this about many higher-level functions: "Maximum
 redirections are set to 10." Then I'm thinking, if it's an important
 parameter, make it a defaulted parameter for the function.

I guess I should just remove that comment because other defaults has=20
been selected as well e.g. timeouts.

 16. Documentation should point to descriptions of the HTTP methods
 wrapped (e.g. "post", "del" etc).

 Do you mean point to the relevant RFC?

 Yah, or a good tutorial. (I didn't know DEL existed!)

Well DEL is actually called DELETE in the HTTP RFC. But delete is a=20
reserved word i D so I used del() instead. delete() wouldn't work in=20
following code in think:

with(auto http =3D ...) {
	delete(...);
}

 22. byLine/byChunk should not expose a string or generally data
 that can be shared safely by the client. That's because it would
 need to create a new string with each iteration, which is very
 inefficient. Instead, they should expose char[]/ubyte[] just like
 File.byLine does, and reuse the buffer with each call. Essentially
 byLine/byChunk should be near-zero-overhead means to transfer and
 inspect arbitrarily large streams.

 byLine/byChunk is currenly only available for the async methods
 which is implemented using message passing. This means that for
 passing the message a copy has to be made because a message is
 by-value or immutable data. Is there another way for me to pass the
 message without doing a copy... some kind of move semantic?

 A library element is planned but not on the short list yet. You can
 work around that by internally doing a cast. As long as you observe
 the commitment that buffers are not accessed simultaneously by more
 than one thread you're well within defined behavior.

ok nice to know.

 One more suggestion - you may want to also provide classic boring
 synchronous read/write methods that take a user-provided buffer. I'm
 sure people could use such e.g. when they want to stream data inside
 their own threads, or even in the main thread if they don't mind
 blocking.

This would mean hooking into libcurl and selecting on its sockets.=20
Totally doable.

I'm not sure if it'd be that easy. Using select to implement a blocking
api is possible, but select is only used to wait until data is ready,
reading and writing the data is still done by curl. And as curl does
not allow you to supply the data buffer, having user-provided buffers
is afaik impossible.

But I would rather stop here feature wise and wrap
this thing up. I would like to focus my efforts on a candidate for=20
std.net.event/reactor/proactor module.

Have you already started working on std.net.event? Do you plan to base
that on libev / libevent?

I've been trying to get familiar with libev lately, so I have bindings
for livev 3.9 / 4.04 (combined, version can be selected with version
statements): http://dl.dropbox.com/u/24218791/d/src/libev.html
(comments in the c header have been turned into doc comments. The
binding is partially based on Leandro Lucarella bindings:
http://git.llucax.com.ar/?p=3Dsoftware/ev.d.git)

--=20
Johannes Pfau

May 18 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 18/05/11 10.09, Johannes Pfau wrote:
 jdrewsen wrote:
 Please see comments below.

 Den 17-05-2011 16:42, Andrei Alexandrescu skrev:
 Thanks for the response! A few more answers and comments within
 (everything deleted counts as "sounds great").

 On 5/17/11 3:50 AM, Jonas Drewsen wrote:
 14. Isn't the max redirect configurable via a parameter?

 Yes. It is called Http.followLocation from libcurls followLocation
 option. I will rename it to maxRedirections for clearity.

 You mention this about many higher-level functions: "Maximum
 redirections are set to 10." Then I'm thinking, if it's an important
 parameter, make it a defaulted parameter for the function.

 I guess I should just remove that comment because other defaults has
 been selected as well e.g. timeouts.

 16. Documentation should point to descriptions of the HTTP methods
 wrapped (e.g. "post", "del" etc).

 Do you mean point to the relevant RFC?

 Yah, or a good tutorial. (I didn't know DEL existed!)

 Well DEL is actually called DELETE in the HTTP RFC. But delete is a
 reserved word i D so I used del() instead. delete() wouldn't work in
 following code in think:

 with(auto http = ...) {
 	delete(...);
 }

 22. byLine/byChunk should not expose a string or generally data
 that can be shared safely by the client. That's because it would
 need to create a new string with each iteration, which is very
 inefficient. Instead, they should expose char[]/ubyte[] just like
 File.byLine does, and reuse the buffer with each call. Essentially
 byLine/byChunk should be near-zero-overhead means to transfer and
 inspect arbitrarily large streams.

 byLine/byChunk is currenly only available for the async methods
 which is implemented using message passing. This means that for
 passing the message a copy has to be made because a message is
 by-value or immutable data. Is there another way for me to pass the
 message without doing a copy... some kind of move semantic?

 A library element is planned but not on the short list yet. You can
 work around that by internally doing a cast. As long as you observe
 the commitment that buffers are not accessed simultaneously by more
 than one thread you're well within defined behavior.

 ok nice to know.

 One more suggestion - you may want to also provide classic boring
 synchronous read/write methods that take a user-provided buffer. I'm
 sure people could use such e.g. when they want to stream data inside
 their own threads, or even in the main thread if they don't mind
 blocking.

 This would mean hooking into libcurl and selecting on its sockets.
 Totally doable.

 I'm not sure if it'd be that easy. Using select to implement a blocking
 api is possible, but select is only used to wait until data is ready,
 reading and writing the data is still done by curl. And as curl does
 not allow you to supply the data buffer, having user-provided buffers
 is afaik impossible.

_Doable_ - not easy :)

Select will wait for data to be ready and ask curl to handle the data 
chunk. Curl in turn calls back to a registered callback handler with the
data read. That handler fills the buffer provided by the user. If not 
enough data has been receive an new select is performed until the 
requested amount of data is read. Then the blocking method can return.

 But I would rather stop here feature wise and wrap
 this thing up. I would like to focus my efforts on a candidate for
 std.net.event/reactor/proactor module.

 Have you already started working on std.net.event? Do you plan to base
 that on libev / libevent?

I'll finish the curl wrapper before starting on it.

I have had a look at libev/event before and they are very nice libs. But 
I think I'll have to implement it from scratch for two reasons:

1, AFAIK we cannot include code in std phobos that is not boost licensed 
or a license as liberal. libev for example requires you to distribute 
the license with your software.

2, It introduces a dependency to an external project in phobos. 
Currently no dependencies are present. This makes phobos very easy to 
install and use out of the box.

The optimal solution to these problems from my point of view would be to 
get that "much discussed" package tool in place soon (CPAN,apt alike).

Heck, now I think about it maybe I should do that before any std.net 
stuff. Let's see how the wind blows - many interesting projects :)

 I've been trying to get familiar with libev lately, so I have bindings
 for livev 3.9 / 4.04 (combined, version can be selected with version
 statements): http://dl.dropbox.com/u/24218791/d/src/libev.html
 (comments in the c header have been turned into doc comments. The
 binding is partially based on Leandro Lucarella bindings:
 http://git.llucax.com.ar/?p=software/ev.d.git)

Very nice. I've also stumbled upon some other async D libs out there 
that could be a good starting point.

It seems that Leandro has both bindings and a wrapper in place. And a 
wrapper is really nice to have to make it more D'ish.

/Jonas

May 18 2011

Johannes Pfau <spam example.com> writes:

Jonas Drewsen wrote:
On 18/05/11 10.09, Johannes Pfau wrote:
 jdrewsen wrote:
 Please see comments below.

 Den 17-05-2011 16:42, Andrei Alexandrescu skrev:
 Thanks for the response! A few more answers and comments within
 (everything deleted counts as "sounds great").

 On 5/17/11 3:50 AM, Jonas Drewsen wrote:
 14. Isn't the max redirect configurable via a parameter?

 Yes. It is called Http.followLocation from libcurls followLocation
 option. I will rename it to maxRedirections for clearity.

 You mention this about many higher-level functions: "Maximum
 redirections are set to 10." Then I'm thinking, if it's an
 important parameter, make it a defaulted parameter for the
 function.

 I guess I should just remove that comment because other defaults has
 been selected as well e.g. timeouts.

 16. Documentation should point to descriptions of the HTTP
 methods wrapped (e.g. "post", "del" etc).

 Do you mean point to the relevant RFC?

 Yah, or a good tutorial. (I didn't know DEL existed!)

 Well DEL is actually called DELETE in the HTTP RFC. But delete is a
 reserved word i D so I used del() instead. delete() wouldn't work in
 following code in think:

 with(auto http =3D ...) {
 	delete(...);
 }

 22. byLine/byChunk should not expose a string or generally data
 that can be shared safely by the client. That's because it would
 need to create a new string with each iteration, which is very
 inefficient. Instead, they should expose char[]/ubyte[] just like
 File.byLine does, and reuse the buffer with each call.
 Essentially byLine/byChunk should be near-zero-overhead means to
 transfer and inspect arbitrarily large streams.

 byLine/byChunk is currenly only available for the async methods
 which is implemented using message passing. This means that for
 passing the message a copy has to be made because a message is
 by-value or immutable data. Is there another way for me to pass
 the message without doing a copy... some kind of move semantic?

 A library element is planned but not on the short list yet. You can
 work around that by internally doing a cast. As long as you observe
 the commitment that buffers are not accessed simultaneously by more
 than one thread you're well within defined behavior.

 ok nice to know.

 One more suggestion - you may want to also provide classic boring
 synchronous read/write methods that take a user-provided buffer.
 I'm sure people could use such e.g. when they want to stream data
 inside their own threads, or even in the main thread if they don't
 mind blocking.

 This would mean hooking into libcurl and selecting on its sockets.
 Totally doable.

 I'm not sure if it'd be that easy. Using select to implement a
 blocking api is possible, but select is only used to wait until data
 is ready, reading and writing the data is still done by curl. And as
 curl does not allow you to supply the data buffer, having
 user-provided buffers is afaik impossible.

_Doable_ - not easy :)

Select will wait for data to be ready and ask curl to handle the data=20
chunk. Curl in turn calls back to a registered callback handler with
the data read. That handler fills the buffer provided by the user. If
not enough data has been receive an new select is performed until the=20
requested amount of data is read. Then the blocking method can return.

I think it's not necessary to do multiple calls if the first didn't
return enough data. Most read calls are only guaranteed to return a
maximum of data, less is always possible. It's a bigger problem if
the user supplied buffer is smaller than the curl buffer. And you
always end up copying data. So it might be better
to just return a reference to the buffer allocated by curl, or just
forget about this api, curl just doesn't work well this way.

 But I would rather stop here feature wise and wrap
 this thing up. I would like to focus my efforts on a candidate for
 std.net.event/reactor/proactor module.

 Have you already started working on std.net.event? Do you plan to
 base that on libev / libevent?

I'll finish the curl wrapper before starting on it.

I have had a look at libev/event before and they are very nice libs.
But I think I'll have to implement it from scratch for two reasons:

1, AFAIK we cannot include code in std phobos that is not boost
licensed or a license as liberal. libev for example requires you to
distribute the license with your software.

That could indeed be a problem. I think libev/libevent even have the
same license. Probably it's really best to implement it from scratch,
but it will be quite some work. I'd still have a look at the libev
documentation though, it contains some details about OS quirks. And try
to avoid Linux AIO ;-)

2, It introduces a dependency to an external project in phobos.=20
Currently no dependencies are present. This makes phobos very easy to=20
install and use out of the box.

The optimal solution to these problems from my point of view would be
to get that "much discussed" package tool in place soon (CPAN,apt
alike).

Heck, now I think about it maybe I should do that before any std.net=20
stuff. Let's see how the wind blows - many interesting projects :)

 I've been trying to get familiar with libev lately, so I have
 bindings for livev 3.9 / 4.04 (combined, version can be selected
 with version statements):
 http://dl.dropbox.com/u/24218791/d/src/libev.html (comments in the c
 header have been turned into doc comments. The binding is partially
 based on Leandro Lucarella bindings:
 http://git.llucax.com.ar/?p=3Dsoftware/ev.d.git)

Very nice. I've also stumbled upon some other async D libs out there=20
that could be a good starting point.

It seems that Leandro has both bindings and a wrapper in place. And a=20
wrapper is really nice to have to make it more D'ish.

Yes, I have a higher level wrapper for libev as well, but the new
'shared' stuff made it a lot more complicated to get that right.
/Jonas

--=20
Johannes Pfau

May 18 2011

jdrewsen <jdrewsen nospam.com> writes:

Den 18-05-2011 14:52, Johannes Pfau skrev:
 Jonas Drewsen wrote:
 On 18/05/11 10.09, Johannes Pfau wrote:
 jdrewsen wrote:
 Please see comments below.

 Den 17-05-2011 16:42, Andrei Alexandrescu skrev:
 Thanks for the response! A few more answers and comments within
 (everything deleted counts as "sounds great").

 On 5/17/11 3:50 AM, Jonas Drewsen wrote:
 14. Isn't the max redirect configurable via a parameter?

 Yes. It is called Http.followLocation from libcurls followLocation
 option. I will rename it to maxRedirections for clearity.

 You mention this about many higher-level functions: "Maximum
 redirections are set to 10." Then I'm thinking, if it's an
 important parameter, make it a defaulted parameter for the
 function.

 I guess I should just remove that comment because other defaults has
 been selected as well e.g. timeouts.

 16. Documentation should point to descriptions of the HTTP
 methods wrapped (e.g. "post", "del" etc).

 Do you mean point to the relevant RFC?

 Yah, or a good tutorial. (I didn't know DEL existed!)

 Well DEL is actually called DELETE in the HTTP RFC. But delete is a
 reserved word i D so I used del() instead. delete() wouldn't work in
 following code in think:

 with(auto http = ...) {
 	delete(...);
 }

 22. byLine/byChunk should not expose a string or generally data
 that can be shared safely by the client. That's because it would
 need to create a new string with each iteration, which is very
 inefficient. Instead, they should expose char[]/ubyte[] just like
 File.byLine does, and reuse the buffer with each call.
 Essentially byLine/byChunk should be near-zero-overhead means to
 transfer and inspect arbitrarily large streams.

 byLine/byChunk is currenly only available for the async methods
 which is implemented using message passing. This means that for
 passing the message a copy has to be made because a message is
 by-value or immutable data. Is there another way for me to pass
 the message without doing a copy... some kind of move semantic?

 A library element is planned but not on the short list yet. You can
 work around that by internally doing a cast. As long as you observe
 the commitment that buffers are not accessed simultaneously by more
 than one thread you're well within defined behavior.

 ok nice to know.

 One more suggestion - you may want to also provide classic boring
 synchronous read/write methods that take a user-provided buffer.
 I'm sure people could use such e.g. when they want to stream data
 inside their own threads, or even in the main thread if they don't
 mind blocking.

 This would mean hooking into libcurl and selecting on its sockets.
 Totally doable.

 I'm not sure if it'd be that easy. Using select to implement a
 blocking api is possible, but select is only used to wait until data
 is ready, reading and writing the data is still done by curl. And as
 curl does not allow you to supply the data buffer, having
 user-provided buffers is afaik impossible.

 _Doable_ - not easy :)

 Select will wait for data to be ready and ask curl to handle the data
 chunk. Curl in turn calls back to a registered callback handler with
 the data read. That handler fills the buffer provided by the user. If
 not enough data has been receive an new select is performed until the
 requested amount of data is read. Then the blocking method can return.

 I think it's not necessary to do multiple calls if the first didn't
 return enough data. Most read calls are only guaranteed to return a
 maximum of data, less is always possible. It's a bigger problem if
 the user supplied buffer is smaller than the curl buffer. And you
 always end up copying data. So it might be better
 to just return a reference to the buffer allocated by curl, or just
 forget about this api, curl just doesn't work well this way.

Yeah. That's part of the reason why I'm not interested in implementing it.

 But I would rather stop here feature wise and wrap
 this thing up. I would like to focus my efforts on a candidate for
 std.net.event/reactor/proactor module.

 Have you already started working on std.net.event? Do you plan to
 base that on libev / libevent?

 I'll finish the curl wrapper before starting on it.

 I have had a look at libev/event before and they are very nice libs.
 But I think I'll have to implement it from scratch for two reasons:

 1, AFAIK we cannot include code in std phobos that is not boost
 licensed or a license as liberal. libev for example requires you to
 distribute the license with your software.

 That could indeed be a problem. I think libev/libevent even have the
 same license. Probably it's really best to implement it from scratch,
 but it will be quite some work. I'd still have a look at the libev
 documentation though, it contains some details about OS quirks. And try
 to avoid Linux AIO ;-)
 2, It introduces a dependency to an external project in phobos.
 Currently no dependencies are present. This makes phobos very easy to
 install and use out of the box.

 The optimal solution to these problems from my point of view would be
 to get that "much discussed" package tool in place soon (CPAN,apt
 alike).

 Heck, now I think about it maybe I should do that before any std.net
 stuff. Let's see how the wind blows - many interesting projects :)

 I've been trying to get familiar with libev lately, so I have
 bindings for livev 3.9 / 4.04 (combined, version can be selected
 with version statements):
 http://dl.dropbox.com/u/24218791/d/src/libev.html (comments in the c
 header have been turned into doc comments. The binding is partially
 based on Leandro Lucarella bindings:
 http://git.llucax.com.ar/?p=software/ev.d.git)

 Very nice. I've also stumbled upon some other async D libs out there
 that could be a good starting point.

 It seems that Leandro has both bindings and a wrapper in place. And a
 wrapper is really nice to have to make it more D'ish.

 Yes, I have a higher level wrapper for libev as well, but the new
 'shared' stuff made it a lot more complicated to get that right.
 /Jonas

May 18 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/18/11 6:07 AM, Jonas Drewsen wrote:
 Select will wait for data to be ready and ask curl to handle the data
 chunk. Curl in turn calls back to a registered callback handler with the
 data read. That handler fills the buffer provided by the user. If not
 enough data has been receive an new select is performed until the
 requested amount of data is read. Then the blocking method can return.

Perhaps this would be too complicated. In any case the core 
functionality must be paid top attention. And the core functionality is 
streaming.

Currently there are two proposed ways to stream data from an HTTP 
address: (a) by using the onReceive callback, and (b) by using 
byLine/byChunk. If either of these perform slower than the 
best-of-the-breed streaming using libcurl, we have failed.

The onReceive method is not particularly appealing because the client 
and libcurl block each other: the client is blocked while libcurl is 
waiting for data, and the client blocks libcurl while inside the 
callback. (Please correct me if I'm wrong.)

To make byLine/byChunk fast, the basic setup should include a hidden 
thread that does the download in separation from the client's thread. 
There should be K buffers allocated (K = 2 to e.g. 10), and a simple 
protocol for passing the buffers back and forth between the client 
thread and the hidden thread. That way, in the quiescent state, there is 
no memory allocation and either both client and libcurl are busy doing 
work, or one is much slower than the other, which waits.

The same mechanism should be used in byChunkAsync or byFileAsync.


Thanks,

Andrei

May 18 2011

jdrewsen <jdrewsen nospam.com> writes:

Den 18-05-2011 16:53, Andrei Alexandrescu skrev:
 On 5/18/11 6:07 AM, Jonas Drewsen wrote:
 Select will wait for data to be ready and ask curl to handle the data
 chunk. Curl in turn calls back to a registered callback handler with the
 data read. That handler fills the buffer provided by the user. If not
 enough data has been receive an new select is performed until the
 requested amount of data is read. Then the blocking method can return.

 Perhaps this would be too complicated. In any case the core
 functionality must be paid top attention. And the core functionality is
 streaming.

 Currently there are two proposed ways to stream data from an HTTP
 address: (a) by using the onReceive callback, and (b) by using
 byLine/byChunk. If either of these perform slower than the
 best-of-the-breed streaming using libcurl, we have failed.

 The onReceive method is not particularly appealing because the client
 and libcurl block each other: the client is blocked while libcurl is
 waiting for data, and the client blocks libcurl while inside the
 callback. (Please correct me if I'm wrong.)

 To make byLine/byChunk fast, the basic setup should include a hidden
 thread that does the download in separation from the client's thread.
 There should be K buffers allocated (K = 2 to e.g. 10), and a simple
 protocol for passing the buffers back and forth between the client
 thread and the hidden thread. That way, in the quiescent state, there is
 no memory allocation and either both client and libcurl are busy doing
 work, or one is much slower than the other, which waits.

 The same mechanism should be used in byChunkAsync or byFileAsync.

If byChunk is using a hidden thread to download into buffers, then how 
does it differ from the byChunkAsync that you mention?

The current curl wrapper actually does the hidden thread trick (based on 
a hint you gave me a while ago). It does not reuse buffers because I 
thought that all data had to be immutable or by value to go through the 
message passing system. I'll fix this since it is a good place to do 
some type casting to allow passing the buffers for reuse.

I think that we have to consider the context of the streaming before we 
can tell the best solution. I do not have any number to back the 
following up, but this is how I see it:

If data that is read is going to be processed (e.g. compressed) in some 
way it is most likely a benefit to spawn a thread to handle the data 
buffering.

If no processing is done (e.g. a simple copy from net to disk) I believe 
keeping things in the same thread and simply select on sockets (disk or 
net) is fastest. This way no message passing and context switching is 
taking place and does cause any overhead. libcurl can give you access to 
the file descriptors for this exact purpose but it does have some 
drawbacks: you are not in control of the buffers used by libcurl. This 
means that reading from one curl connection and sending on another you 
would have to copy the data. libcurl does in fact provide even simpler 
methods where you can provide your own buffers for read/writes. 
Unfortunately this is only supported for HTTP and a lot of the 
convenience features such as redirections are lost. The more you want to 
control to get the last drop of performance, the more you have to 
manually handle yourself.

In my opinion I think that providing the performance of the standard 
libcurl API in the D wrapper is the way to go (as done in the current 
curl wrapper). Generic and efficient streaming across protocols is best 
done in std.net where buffers can be handled entirely in D. I know this 
is not a small task which is why I started out with wrapping libcurl.

Thanks
Jonas

May 18 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/18/11 5:29 PM, jdrewsen wrote:
 Den 18-05-2011 16:53, Andrei Alexandrescu skrev:
 On 5/18/11 6:07 AM, Jonas Drewsen wrote:
 Select will wait for data to be ready and ask curl to handle the data
 chunk. Curl in turn calls back to a registered callback handler with the
 data read. That handler fills the buffer provided by the user. If not
 enough data has been receive an new select is performed until the
 requested amount of data is read. Then the blocking method can return.

 Perhaps this would be too complicated. In any case the core
 functionality must be paid top attention. And the core functionality is
 streaming.

 Currently there are two proposed ways to stream data from an HTTP
 address: (a) by using the onReceive callback, and (b) by using
 byLine/byChunk. If either of these perform slower than the
 best-of-the-breed streaming using libcurl, we have failed.

 The onReceive method is not particularly appealing because the client
 and libcurl block each other: the client is blocked while libcurl is
 waiting for data, and the client blocks libcurl while inside the
 callback. (Please correct me if I'm wrong.)

 To make byLine/byChunk fast, the basic setup should include a hidden
 thread that does the download in separation from the client's thread.
 There should be K buffers allocated (K = 2 to e.g. 10), and a simple
 protocol for passing the buffers back and forth between the client
 thread and the hidden thread. That way, in the quiescent state, there is
 no memory allocation and either both client and libcurl are busy doing
 work, or one is much slower than the other, which waits.

 The same mechanism should be used in byChunkAsync or byFileAsync.

 If byChunk is using a hidden thread to download into buffers, then how
 does it differ from the byChunkAsync that you mention?

Sorry, byChunkAsync and byLineAsync (which I wrongly denoted as 
byFileAsync) would be methods File.

 The current curl wrapper actually does the hidden thread trick (based on
 a hint you gave me a while ago). It does not reuse buffers because I
 thought that all data had to be immutable or by value to go through the
 message passing system. I'll fix this since it is a good place to do
 some type casting to allow passing the buffers for reuse.

Great, thanks. Don't forget there's great motivation to do so.

 I think that we have to consider the context of the streaming before we
 can tell the best solution. I do not have any number to back the
 following up, but this is how I see it:

 If data that is read is going to be processed (e.g. compressed) in some
 way it is most likely a benefit to spawn a thread to handle the data
 buffering.

 If no processing is done (e.g. a simple copy from net to disk) I believe
 keeping things in the same thread and simply select on sockets (disk or
 net) is fastest.

Not at all. If operating with the network and operating with the disk 
block each other, you're guaranteed to be slower than the slowest of them.

Consider that disk speed is V1 MB/s and network speed is V2 MB/s, and 
that they're independent of each other. If you do one thing at a time, 
you need to take 1/V1 + 1/V2 seconds to transfer one MB. The speed of 
the process is therefore 1/(1/V1 + 1/V2) = V1 * V2 / (V1 + V2).

If the two devices have comparable speeds, you're halving the speed. As 
soon as you do buffering with two threads you can easily reach close to 
the minimum of the two speeds, which is the theoretical best.

 In my opinion I think that providing the performance of the standard
 libcurl API in the D wrapper is the way to go (as done in the current
 curl wrapper). Generic and efficient streaming across protocols is best
 done in std.net where buffers can be handled entirely in D. I know this
 is not a small task which is why I started out with wrapping libcurl.

Sounds reasonable, although if you take care of recycling the buffers in 
the implementation you have, your wrapper may as well be the best of breed.


Andrei

May 18 2011

jdrewsen <jdrewsen nospam.com> writes:

Den 19-05-2011 00:54, Andrei Alexandrescu skrev:
 On 5/18/11 5:29 PM, jdrewsen wrote:
 Den 18-05-2011 16:53, Andrei Alexandrescu skrev:
 On 5/18/11 6:07 AM, Jonas Drewsen wrote:
 Select will wait for data to be ready and ask curl to handle the data
 chunk. Curl in turn calls back to a registered callback handler with
 the
 data read. That handler fills the buffer provided by the user. If not
 enough data has been receive an new select is performed until the
 requested amount of data is read. Then the blocking method can return.

 Perhaps this would be too complicated. In any case the core
 functionality must be paid top attention. And the core functionality is
 streaming.

 Currently there are two proposed ways to stream data from an HTTP
 address: (a) by using the onReceive callback, and (b) by using
 byLine/byChunk. If either of these perform slower than the
 best-of-the-breed streaming using libcurl, we have failed.

 The onReceive method is not particularly appealing because the client
 and libcurl block each other: the client is blocked while libcurl is
 waiting for data, and the client blocks libcurl while inside the
 callback. (Please correct me if I'm wrong.)

 To make byLine/byChunk fast, the basic setup should include a hidden
 thread that does the download in separation from the client's thread.
 There should be K buffers allocated (K = 2 to e.g. 10), and a simple
 protocol for passing the buffers back and forth between the client
 thread and the hidden thread. That way, in the quiescent state, there is
 no memory allocation and either both client and libcurl are busy doing
 work, or one is much slower than the other, which waits.

 The same mechanism should be used in byChunkAsync or byFileAsync.

 If byChunk is using a hidden thread to download into buffers, then how
 does it differ from the byChunkAsync that you mention?

 Sorry, byChunkAsync and byLineAsync (which I wrongly denoted as
 byFileAsync) would be methods File.

 The current curl wrapper actually does the hidden thread trick (based on
 a hint you gave me a while ago). It does not reuse buffers because I
 thought that all data had to be immutable or by value to go through the
 message passing system. I'll fix this since it is a good place to do
 some type casting to allow passing the buffers for reuse.

 Great, thanks. Don't forget there's great motivation to do so.

 I think that we have to consider the context of the streaming before we
 can tell the best solution. I do not have any number to back the
 following up, but this is how I see it:

 If data that is read is going to be processed (e.g. compressed) in some
 way it is most likely a benefit to spawn a thread to handle the data
 buffering.

 If no processing is done (e.g. a simple copy from net to disk) I believe
 keeping things in the same thread and simply select on sockets (disk or
 net) is fastest.

 Not at all. If operating with the network and operating with the disk
 block each other, you're guaranteed to be slower than the slowest of them.

 Consider that disk speed is V1 MB/s and network speed is V2 MB/s, and
 that they're independent of each other. If you do one thing at a time,
 you need to take 1/V1 + 1/V2 seconds to transfer one MB. The speed of
 the process is therefore 1/(1/V1 + 1/V2) = V1 * V2 / (V1 + V2).

 If the two devices have comparable speeds, you're halving the speed. As
 soon as you do buffering with two threads you can easily reach close to
 the minimum of the two speeds, which is the theoretical best.

It see your point. By buffering data asynchronously the reads and writes 
don't block each other and this increases performance. The thing is that 
the OS already does buffering for us. So while we're writing data to 
disk the OS is buffering incoming data from the network asynchronously.


 In my opinion I think that providing the performance of the standard
 libcurl API in the D wrapper is the way to go (as done in the current
 curl wrapper). Generic and efficient streaming across protocols is best
 done in std.net where buffers can be handled entirely in D. I know this
 is not a small task which is why I started out with wrapping libcurl.

 Sounds reasonable, although if you take care of recycling the buffers in
 the implementation you have, your wrapper may as well be the best of breed.


 Andrei

May 19 2011

"Masahiro Nakagawa" <repeatedly gmail.com> writes:

On Wed, 18 May 2011 20:07:01 +0900, Jonas Drewsen <jdrewsen nospam.com>  
wrote:

 On 18/05/11 10.09, Johannes Pfau wrote:
 jdrewsen wrote:
 But I would rather stop here feature wise and wrap
 this thing up. I would like to focus my efforts on a candidate for
 std.net.event/reactor/proactor module.

 Have you already started working on std.net.event? Do you plan to base
 that on libev / libevent?

 I'll finish the curl wrapper before starting on it.

 I have had a look at libev/event before and they are very nice libs. But  
 I think I'll have to implement it from scratch for two reasons:

 1, AFAIK we cannot include code in std phobos that is not boost licensed  
 or a license as liberal. libev for example requires you to distribute  
 the license with your software.

 2, It introduces a dependency to an external project in phobos.  
 Currently no dependencies are present. This makes phobos very easy to  
 install and use out of the box.

 The optimal solution to these problems from my point of view would be to  
 get that "much discussed" package tool in place soon (CPAN,apt alike).

 Heck, now I think about it maybe I should do that before any std.net  
 stuff. Let's see how the wind blows - many interesting projects :)

Good!

Last year, I tried to implement std.event. In Windows, using IOCP with  
Overlapped-IO for proactor.
Others, IO multiplexing with worker-thread for proactor emulation.
But Johannes already started implementing event module using libev,
so I deleted this task from my tasks.

My pending projects need event module. I expect your efforts!


Masahiro

P.S.
cURL wrapper is nice. I wait this :)

May 23 2011

jdrewsen <jdrewsen nospam.com> writes:

Den 23-05-2011 21:02, Masahiro Nakagawa skrev:
 On Wed, 18 May 2011 20:07:01 +0900, Jonas Drewsen <jdrewsen nospam.com> 
 wrote:
 
 On 18/05/11 10.09, Johannes Pfau wrote:
 jdrewsen wrote:
 But I would rather stop here feature wise and wrap
 this thing up. I would like to focus my efforts on a candidate for
 std.net.event/reactor/proactor module.

 Have you already started working on std.net.event? Do you plan to base
 that on libev / libevent?

 I'll finish the curl wrapper before starting on it.

 I have had a look at libev/event before and they are very nice libs. 
 But I think I'll have to implement it from scratch for two reasons:

 1, AFAIK we cannot include code in std phobos that is not boost 
 licensed or a license as liberal. libev for example requires you to 
 distribute the license with your software.

 2, It introduces a dependency to an external project in phobos. 
 Currently no dependencies are present. This makes phobos very easy to 
 install and use out of the box.

 The optimal solution to these problems from my point of view would be 
 to get that "much discussed" package tool in place soon (CPAN,apt alike).

 Heck, now I think about it maybe I should do that before any std.net 
 stuff. Let's see how the wind blows - many interesting projects :)

 
 Good!
 
 Last year, I tried to implement std.event. In Windows, using IOCP with 
 Overlapped-IO for proactor.
 Others, IO multiplexing with worker-thread for proactor emulation.
 But Johannes already started implementing event module using libev,
 so I deleted this task from my tasks.
 

If you have any code for inspiration or that could be used as a base I
would very much like to see/use it. Especially if it is under Boost license.

I would like to create reactor initially and proceed to build the
proactor on top of that (not on top for windows and IOCP of course).
Reactor and proactor designs each have their forces and therefore I do
not think targeting a proactor only design is the way to go.

/Jonas

May 23 2011

"Masahiro Nakagawa" <repeatedly gmail.com> writes:

On Tue, 24 May 2011 04:50:10 +0900, jdrewsen <jdrewsen nospam.com> wrote:

 Den 23-05-2011 21:02, Masahiro Nakagawa skrev:
 On Wed, 18 May 2011 20:07:01 +0900, Jonas Drewsen <jdrewsen nospam.com>
 wrote:

 On 18/05/11 10.09, Johannes Pfau wrote:
 jdrewsen wrote:
 But I would rather stop here feature wise and wrap
 this thing up. I would like to focus my efforts on a candidate for
 std.net.event/reactor/proactor module.

 Have you already started working on std.net.event? Do you plan to base
 that on libev / libevent?

 I'll finish the curl wrapper before starting on it.

 I have had a look at libev/event before and they are very nice libs.
 But I think I'll have to implement it from scratch for two reasons:

 1, AFAIK we cannot include code in std phobos that is not boost
 licensed or a license as liberal. libev for example requires you to
 distribute the license with your software.

 2, It introduces a dependency to an external project in phobos.
 Currently no dependencies are present. This makes phobos very easy to
 install and use out of the box.

 The optimal solution to these problems from my point of view would be
 to get that "much discussed" package tool in place soon (CPAN,apt  
 alike).

 Heck, now I think about it maybe I should do that before any std.net
 stuff. Let's see how the wind blows - many interesting projects :)

 Good!

 Last year, I tried to implement std.event. In Windows, using IOCP with
 Overlapped-IO for proactor.
 Others, IO multiplexing with worker-thread for proactor emulation.
 But Johannes already started implementing event module using libev,
 so I deleted this task from my tasks.

 If you have any code for inspiration or that could be used as a base I
 would very much like to see/use it. Especially if it is under Boost  
 license.

My old design is based on Boost/Asio. After I rewrote std.socket(including
other utilities such as DNS, IPAddresses), I stopped implementing task.

 I would like to create reactor initially and proceed to build the
 proactor on top of that (not on top for windows and IOCP of course).
 Reactor and proactor designs each have their forces and therefore I do
 not think targeting a proactor only design is the way to go.

I agree. General library should provides two approaches.
My main purpose is high-performance RPC. In such cases,
Proactor is better(Java's Netty provides good abstraction APIs).

May 29 2011

Jimmy Cao <jcao219 gmail.com> writes:

Is there any reason why this code (using SMTP + Curl):
https://gist.github.com/998214 segfaults on Windows but not on Fedora 15?

I found an easy fix, but it looks like a bug somewhere that I can't figure
out.

May 29 2011

jdrewsen <jdrewsen nospam.com> writes:

Den 30-05-2011 01:16, Jimmy Cao skrev:
 Is there any reason why this code (using SMTP + Curl):
 https://gist.github.com/998214 segfaults on Windows but not on Fedora 15?

 I found an easy fix, but it looks like a bug somewhere that I can't
 figure out.

etc.curl is currently a moving target. The last version I pushed to 
github is outdated. Currently the focus is on getting http and ftp 
working and the other protocol hasn't been tested.

Out of curiosity I would very much like to known what your easy fix was?

/Jonas

May 30 2011

Jimmy Cao <jcao219 gmail.com> writes:

On Mon, May 30, 2011 at 12:38 PM, jdrewsen <jdrewsen nospam.com> wrote:

 Den 30-05-2011 01:16, Jimmy Cao skrev:

  Is there any reason why this code (using SMTP + Curl):
 https://gist.github.com/998214 segfaults on Windows but not on Fedora 15?

 I found an easy fix, but it looks like a bug somewhere that I can't
 figure out.

 etc.curl is currently a moving target. The last version I pushed to github
 is outdated. Currently the focus is on getting http and ftp working and the
 other protocol hasn't been tested.

 Out of curiosity I would very much like to known what your easy fix was?

 /Jonas


Uncomment line 19 and it works.

May 30 2011

"Robert Jacques" <sandford jhu.edu> writes:

On Mon, 16 May 2011 17:07:57 -0400, jdrewsen <jdrewsen nospam.com> wrote:

 Hi,

     I've been working on a wrapper for the etc.c.curl module. It is now  
 pretty stable and I would very much like some feedback on the API.

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 BTW I use template mixins which doesn't seem to get included in the  
 generated docs. Is there any way I can make this work?

 /Jonas

I think the API looks good with regard to the core functionality it  
provides, but the documentation is sub-par, an certain structs should be  
defined inside the Http class. I've listed a bunch of comments I had while  
reading the documentation below.

One minor question regarding the actual implementation: do you define a  
pragma(lib, ...) to ease setup/tool chain pains, or do you intend  
something different?

Regarding the First example:
  * Use string instead of const(char)[]
  * Don't waste vertical space with empty comment lines like this
//
// This comment style
//
  * Comment lines should be grammatically correct, i.e. '// GET using async  
range' should be '// GET using an asynchronous range' or '// Perform a GET  
using an asynchronous range', etc
  * Don't use 'l' as a variable. It looks like a 1. Also, using 'line'  
would make the meaning clearer
  * Example code should not exceed 80 chars in width. People's browser  
setups vary, and wrapped lines are bad.
  * You can drop the 'delegate size_t' from the PUT example by making the  
return 0u, 0U or size_t.init.
  * The last example is missing a description. (Also, it goes over 80 chars)

The license section should include a mention of libcurl's license, etc in  
addition to the wrapper's license. In general, for non-Boost libraries  
like this, should we include the license/copyright as a string/tuple enum?  
(i.e. for use in about boxes / command line options) Should said  
license/copyright be register to a global repository upon module import?  
(i.e. for auto-magical about boxes, etc)

'Do not use the same instance of this class in two threads simultanously.'  
First, you need to spell check all your documentation (i.e. simultaneously  
vs simultanously). Second, this should be assumed by most programmers to  
be true of all non-shared classes. Third, if you do want to leave this  
line in, use something short and sweet, like 'Http is not thread safe',  
instead.

You need to look at/think about the order in which things are declared.  
Having things like HttpMethod declared long after Http doesn't make any  
sense. Also, it looks like you're not using '///ditto' for HttpMethod.

On that note, why is HttpMethod a free-standing enum and not declared  
inside http? Http.Method, Http.Result, etc. make a lot more sense.

The setTimeCondition documentation seems a bit confusing.

I'd recommend stream-lining the convenience function documentation. For  
example, http.head only requires a /// Returns: An HttpResult containing  
the url's headers. Alternatively, you could use ditto and merge the  
documentation and examples for head/get/etc into a single block. Also,  
headAsync seems to document a data argument, but not take one, while post  
has an undocumented parameter, etc. In general, you might want to consider  
removing some of the redundant/self-obvious parameter documentation  
blocks. On second thought, use ditto, slim down the docs and move this set  
of functions either to the beginning or end of Http. It would make sense  
to put these right after the definition of Http.Method/method. You also  
might want to consider grouping all the xxxAsync functions together,  
instead of interleaving them.

Check the grammar in http.postData

http.onReceiveHeader has issues with it's 'See' link. It's example is too  
wide. You don't need really need the parameter documentation, particularly  
if you improve the example, i.e.:
string[string] headerInformation;
with(auto http = new Http("http://www.google.com")) {
     onReceiveHeader = (string key, string value) {
         if ( value.length <= 10 )
             headerInformation[key.idup] = value.idup;
     };
     onReceive = (ubyte[] data) { };
     perform;
}


Also, an tab/indent should be 4 spaces not 5 and no example should start  
indented. Remember to comment like this:
          /** My regular comment
           *  Example:
---
// Example code
---
           * Other Stuff:
           */

You need to fix/complete onSend's documentation.

contentLength(size_t len); => contentLength(size_t length);

'Perform http request' => 'Perform an http request'

AuthMethod/CurlAuth need to be part of docs.

Why is setAuthenticationMethod not authenticationMethod or authentication?  
(And why is AuthMethod not AuthenticationMethod?)
Similarly, why is setTimeCondition not timeCondition and CurlTimeCond not  
Http.TimeCondition?

Why is followLocation(int maxRedirs) not named maxRedirections?

HttpResult.text should have huge warning flags with regard to use and/or  
have its design re-thought. Is there some reason you're not at least  
caching the result and throwing errors on invalid uses of content, etc.  
Calling text multiple times should be okay. And calling content after text  
should be either logically valid, or throw.

AsyncHttpResult.byLine/byChunk examples is improperly indented and too  
wide.

Why is HttpStatusLine not Http.Status?

May 16 2011

Jonas Drewsen <jdrewsen nospam.com> writes:

On 17/05/11 08.03, Robert Jacques wrote:
 On Mon, 16 May 2011 17:07:57 -0400, jdrewsen <jdrewsen nospam.com> wrote:

 Hi,

 I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?

 /Jonas

 I think the API looks good with regard to the core functionality it
 provides, but the documentation is sub-par, an certain structs should be
 defined inside the Http class. I've listed a bunch of comments I had
 while reading the documentation below.

 One minor question regarding the actual implementation: do you define a
 pragma(lib, ...) to ease setup/tool chain pains, or do you intend
 something different?

I define a pragma yes.

 Regarding the First example:
 * Use string instead of const(char)[]

Please see my reply to Andrei

 * Don't waste vertical space with empty comment lines like this

ok

 //
 // This comment style
 //
 * Comment lines should be grammatically correct, i.e. '// GET using
 async range' should be '// GET using an asynchronous range' or '//
 Perform a GET using an asynchronous range', etc

ok

 * Don't use 'l' as a variable. It looks like a 1. Also, using 'line'
 would make the meaning clearer

ok

 * Example code should not exceed 80 chars in width. People's browser
 setups vary, and wrapped lines are bad.

ok

 * You can drop the 'delegate size_t' from the PUT example by making the
 return 0u, 0U or size_t.init.

nice. will do.

 * The last example is missing a description. (Also, it goes over 80 chars)

ok

 The license section should include a mention of libcurl's license, etc
 in addition to the wrapper's license. In general, for non-Boost
 libraries like this, should we include the license/copyright as a
 string/tuple enum? (i.e. for use in about boxes / command line options)
 Should said license/copyright be register to a global repository upon
 module import? (i.e. for auto-magical about boxes, etc)

I'll mention what license libcurl is under. Until a standard for a 
special string/tuple has been decided upon I'll leave this out.

 'Do not use the same instance of this class in two threads
 simultanously.' First, you need to spell check all your documentation
 (i.e. simultaneously vs simultanously). Second, this should be assumed
 by most programmers to be true of all non-shared classes. Third, if you
 do want to leave this line in, use something short and sweet, like 'Http
 is not thread safe', instead.

I'll do a spell check. See reply to Andrei on the other issue.

 You need to look at/think about the order in which things are declared.
 Having things like HttpMethod declared long after Http doesn't make any
 sense. Also, it looks like you're not using '///ditto' for HttpMethod.

I don't know what ditto does. I'll try to look at the ddoc documentation.

 On that note, why is HttpMethod a free-standing enum and not declared
 inside http? Http.Method, Http.Result, etc. make a lot more sense.

See reply to Andrei.

 The setTimeCondition documentation seems a bit confusing.

I'll try to improve.

 I'd recommend stream-lining the convenience function documentation. For
 example, http.head only requires a /// Returns: An HttpResult containing
 the url's headers. Alternatively, you could use ditto and merge the
 documentation and examples for head/get/etc into a single block. Also,
 headAsync seems to document a data argument, but not take one, while
 post has an undocumented parameter, etc. In general, you might want to
 consider removing some of the redundant/self-obvious parameter
 documentation blocks. On second thought, use ditto, slim down the docs
 and move this set of functions either to the beginning or end of Http.
 It would make sense to put these right after the definition of
 Http.Method/method. You also might want to consider grouping all the
 xxxAsync functions together, instead of interleaving them.

As mentioned I need to figure out what ditto does.

headAsync comment does not match its signature. I'll fix.

I've noticed that trying out examples live in the docs may be activated 
at some point. If that is the case I think it is nice to have running 
examples of each method.

 Check the grammar in http.postData

ok

 http.onReceiveHeader has issues with it's 'See' link. It's example is
 too wide. You don't need really need the parameter documentation,
 particularly if you improve the example, i.e.:
 string[string] headerInformation;
 with(auto http = new Http("http://www.google.com")) {
 onReceiveHeader = (string key, string value) {
 if ( value.length <= 10 )
 headerInformation[key.idup] = value.idup;
 };
 onReceive = (ubyte[] data) { };
 perform;
 }

I see your point. The question is how we would like phobos docs to be:

1, consistent with all parameters documented for each method and each 
method explained and exemplified
2, minimal with examples and explanations where absolutely needed.

or somewhere in between. The current style if the curl wrapper is closer 
to 1 than 2.

Anything official about this?


 Also, an tab/indent should be 4 spaces not 5 and no example should start
 indented. Remember to comment like this:
 /** My regular comment
 * Example:
 ---
 // Example code
 ---
 * Other Stuff:
 */

ok

 You need to fix/complete onSend's documentation.

Will give it a look

 contentLength(size_t len); => contentLength(size_t length);

 'Perform http request' => 'Perform an http request'

ok

 AuthMethod/CurlAuth need to be part of docs.

I guess it I should alias the enum in etc.curl directly

 Why is setAuthenticationMethod not authenticationMethod or
 authentication? (And why is AuthMethod not AuthenticationMethod?)
 Similarly, why is setTimeCondition not timeCondition and CurlTimeCond
 not Http.TimeCondition?

authenticationMethod will be made into a property.
AuthMethod because that's what it is called in libcurl. I like the short 
form better and have done the same for AsyncHttpResult (not 
AsynchronousHttpResult).

setTimeCondition cannot be made into a property since it takes two 
arguments.

CurlTimeCond is not calld Http.TimeCondition because it comes from a 
etc.c.curl. I could alias it in Http ofcourse.

 Why is followLocation(int maxRedirs) not named maxRedirections?

This is what it is called in libcurl. I think I'll rename it to 
maxRedirections anyway since it seems to confuse.

 HttpResult.text should have huge warning flags with regard to use and/or
 have its design re-thought. Is there some reason you're not at least
 caching the result and throwing errors on invalid uses of content, etc.
 Calling text multiple times should be okay. And calling content after
 text should be either logically valid, or throw.

It is mostly based on storage concerns. Initially the data is stored as 
ubyte[] and can be returned by content(). When calling text() to get a 
string it will be decoded and returned. I now have a coule of options:

1, to store the string in the result for future calls to text(). I can 
do that and keep the original data for future calls to content(). This 
will double the amount of memory needed.
2, generated the string on each call to text(). This way no copy of the 
data is kept in HttpResult.
3, generated and store the string and then throw away the original data. 
Again no copy is kept in HttpResult.

I've chosen the 2. version since I guessed that once you have called 
text() it is more likely that you'll call text again and not content().

 AsyncHttpResult.byLine/byChunk examples is improperly indented and too
 wide.

ok

 Why is HttpStatusLine not Http.Status?


Thank you for your feedback!

May 17 2011

"Robert Jacques" <sandford jhu.edu> writes:

On Tue, 17 May 2011 07:24:16 -0400, Jonas Drewsen <jdrewsen nospam.com>  
wrote:
 On 17/05/11 08.03, Robert Jacques wrote:
 On Mon, 16 May 2011 17:07:57 -0400, jdrewsen <jdrewsen nospam.com>  
 wrote:

 Hi,

 I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?

 /Jonas



[snip]

 I don't know what ditto does. I'll try to look at the ddoc documentation.

"If a documentation comment for a declaration consists only of the  
identifier ditto then the documentation comment for the previous  
declaration at the same declaration scope is applied to this declaration  
as well."

[snip]

 http.onReceiveHeader has issues with it's 'See' link. It's example is
 too wide. You don't need really need the parameter documentation,
 particularly if you improve the example, i.e.:
 string[string] headerInformation;
 with(auto http = new Http("http://www.google.com")) {
 onReceiveHeader = (string key, string value) {
 if ( value.length <= 10 )
 headerInformation[key.idup] = value.idup;
 };
 onReceive = (ubyte[] data) { };
 perform;
 }

 I see your point. The question is how we would like phobos docs to be:

 1, consistent with all parameters documented for each method and each  
 method explained and exemplified
 2, minimal with examples and explanations where absolutely needed.

 or somewhere in between. The current style if the curl wrapper is closer  
 to 1 than 2.

 Anything official about this?

Official? I don't know. But looking over std.algorithm, Param: is only  
used 3 times, for each of the find variants, in order to clarify what  
haystack and needle are and in one case make the limitations/requirements  
on haystack/needle explicit. Looking over the rest of phobos, I see a mix  
of styles, depending on the author, and some things (like datetime) seem  
to do both.

[snip]

 CurlTimeCond is not calld Http.TimeCondition because it comes from a  
 etc.c.curl. I could alias it in Http ofcourse.

I think where it's defined is less of an issue than where the  
documentation is.

[snip]

 HttpResult.text should have huge warning flags with regard to use and/or
 have its design re-thought. Is there some reason you're not at least
 caching the result and throwing errors on invalid uses of content, etc.
 Calling text multiple times should be okay. And calling content after
 text should be either logically valid, or throw.

 It is mostly based on storage concerns. Initially the data is stored as  
 ubyte[] and can be returned by content(). When calling text() to get a  
 string it will be decoded and returned. I now have a coule of options:

 1, to store the string in the result for future calls to text(). I can  
 do that and keep the original data for future calls to content(). This  
 will double the amount of memory needed.
 2, generated the string on each call to text(). This way no copy of the  
 data is kept in HttpResult.
 3, generated and store the string and then throw away the original data.  
 Again no copy is kept in HttpResult.

 I've chosen the 2. version since I guessed that once you have called  
 text() it is more likely that you'll call text again and not content().

Hmm... I take it that content and text are going to be large enough, and  
result long lived enough to warrant not having the array cached inside  
result. In that case you might want to rename text to toString or decode.  
'text' feels like it should be cheaply reusable, while 'toString'/'decode'  
doesn't. (this is a verb/noun thing) Also, 'toString' tends to compose  
better with the rest of D while 'decode' is similar to the std.utf  
functions. Also, if memory re-use is a concern, a 'toStringInPlace' or  
'decodeInPlace' with the old behavior might be appropriate, although I  
might recommend setting content to the utf-8 string and updating the  
encodingSchemeName as appropriate.

May 17 2011

jdrewsen <jdrewsen nospam.com> writes:

Den 17-05-2011 16:57, Robert Jacques skrev:
 On Tue, 17 May 2011 07:24:16 -0400, Jonas Drewsen <jdrewsen nospam.com>
 wrote:
 On 17/05/11 08.03, Robert Jacques wrote:
 On Mon, 16 May 2011 17:07:57 -0400, jdrewsen <jdrewsen nospam.com>
 wrote:

 Hi,

 I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?

 /Jonas



 [snip]

 I don't know what ditto does. I'll try to look at the ddoc documentation.

 "If a documentation comment for a declaration consists only of the
 identifier ditto then the documentation comment for the previous
 declaration at the same declaration scope is applied to this declaration
 as well."

 [snip]

 http.onReceiveHeader has issues with it's 'See' link. It's example is
 too wide. You don't need really need the parameter documentation,
 particularly if you improve the example, i.e.:
 string[string] headerInformation;
 with(auto http = new Http("http://www.google.com")) {
 onReceiveHeader = (string key, string value) {
 if ( value.length <= 10 )
 headerInformation[key.idup] = value.idup;
 };
 onReceive = (ubyte[] data) { };
 perform;
 }

 I see your point. The question is how we would like phobos docs to be:

 1, consistent with all parameters documented for each method and each
 method explained and exemplified
 2, minimal with examples and explanations where absolutely needed.

 or somewhere in between. The current style if the curl wrapper is
 closer to 1 than 2.

 Anything official about this?

 Official? I don't know. But looking over std.algorithm, Param: is only
 used 3 times, for each of the find variants, in order to clarify what
 haystack and needle are and in one case make the
 limitations/requirements on haystack/needle explicit. Looking over the
 rest of phobos, I see a mix of styles, depending on the author, and some
 things (like datetime) seem to do both.

 [snip]

 CurlTimeCond is not calld Http.TimeCondition because it comes from a
 etc.c.curl. I could alias it in Http ofcourse.

 I think where it's defined is less of an issue than where the
 documentation is.

 [snip]

 HttpResult.text should have huge warning flags with regard to use and/or
 have its design re-thought. Is there some reason you're not at least
 caching the result and throwing errors on invalid uses of content, etc.
 Calling text multiple times should be okay. And calling content after
 text should be either logically valid, or throw.

 It is mostly based on storage concerns. Initially the data is stored
 as ubyte[] and can be returned by content(). When calling text() to
 get a string it will be decoded and returned. I now have a coule of
 options:

 1, to store the string in the result for future calls to text(). I can
 do that and keep the original data for future calls to content(). This
 will double the amount of memory needed.
 2, generated the string on each call to text(). This way no copy of
 the data is kept in HttpResult.
 3, generated and store the string and then throw away the original
 data. Again no copy is kept in HttpResult.

 I've chosen the 2. version since I guessed that once you have called
 text() it is more likely that you'll call text again and not content().

 Hmm... I take it that content and text are going to be large enough, and
 result long lived enough to warrant not having the array cached inside
 result. In that case you might want to rename text to toString or
 decode. 'text' feels like it should be cheaply reusable, while
 'toString'/'decode' doesn't. (this is a verb/noun thing) Also,
 'toString' tends to compose better with the rest of D while 'decode' is
 similar to the std.utf functions. Also, if memory re-use is a concern, a
 'toStringInPlace' or 'decodeInPlace' with the old behavior might be
 appropriate, although I might recommend setting content to the utf-8
 string and updating the encodingSchemeName as appropriate.

I think I'll go for the toString() version.

/Jonas

May 17 2011

Jacob Carlborg <doob me.com> writes:

On 2011-05-16 23:07, jdrewsen wrote:
 Hi,

 I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?

 /Jonas

I like the API, seems simple.

-- 
/Jacob Carlborg

May 17 2011

bls <bizprac orange.fr> writes:

Am 16.05.2011 23:07, schrieb jdrewsen:
 Hi,

 I've been working on a wrapper for the etc.c.curl module. It is now
 pretty stable and I would very much like some feedback on the API.

 http://freeze.steamwinter.com/D/web/phobos/etc_curl.html

 BTW I use template mixins which doesn't seem to get included in the
 generated docs. Is there any way I can make this work?

 /Jonas

You really think this is good for your mental health  ?

// PUT with data senders
string msg = "Hello world";
--------------------------------
http.onSend = (void[] data) {
   if (msg.empty) return 0;
   auto m = cast(void[])msg;
   typeof(size_t) len = m.length;
   data[0..len] = m[0..$];
   msg.length = 0;
   return len;
};
Remarkable stuff, indeed !
I am not even sure that this thingy compiles , but fuc da duc, at least 
it shows what you CAN do in D.  :)

May 23 2011

D Programming

C/C++ Programming

Other

digitalmars.D - Curl wrapper