digitalmars.D - Curl support RFC

reply Jonas Drewsen <jdrewsen nospam.com> writes:
Hi,

So I've spent some time trying to wrap libcurl for D. There are a lot of 
things you can do with libcurl that I did not know about, so I'm 
starting out small.

For now I've created all the declarations for the latest public curl C 
API. I have put them in the etc.c.curl module.

On top of that I've created a more D-like API, as seen below. It is 
located in the 'etc.curl' module. What you can see below currently works, 
but before proceeding further down this road I would like to get your 
comments on it.

//
// Simple HTTP GET with sane defaults
// provides the .content, .headers and .status
//
writeln( Http.get("http://www.google.com").content );

//
// GET with custom data receiver delegates
//
Http http = new Http("http://www.google.dk");
http.setReceiveHeaderCallback( (string key, string value) {
	writeln(key ~ ":" ~ value);
} );
http.setReceiveCallback( (string data) { /* drop */ } );
http.perform;

//
// POST with some timeouts
//
http.setUrl("http://www.testing.com/test.cgi");
http.setReceiveCallback( (string data) { writeln(data); } );
http.setConnectTimeout(1000);
http.setDataTimeout(1000);
http.setDnsTimeout(1000);
http.setPostData("The quick....");
http.perform;

//
// PUT with data sender delegate
//
string msg = "Hello world";
size_t len = msg.length; /* using chunked transfer if omitted */

http.setSendCallback( delegate size_t(char[] data) {
     if (msg.empty) return 0;
     auto l = msg.length;
     data[0..l] = msg[0..$];
     msg.length = 0;
     return l;
     },
     HttpMethod.put, len );
http.perform;

//
// HTTPS
//
writeln(Http.get("https://mail.google.com").content);

//
// FTP
//
writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
                 "./downloaded-file"));


// ... authentication, cookies, interface select, progress callback
// etc. is also implemented this way.


/Jonas
Mar 11 2011
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
I don't know much about this kind of stuff except that I use it for very simple
use cases occasionally.  One thing I'll definitely give your design credit for,
based on your examples, is making simple things simple.  I don't know how it
scales to more complex use cases (not saying it doesn't, just that I'm not
qualified to evaluate that), but I definitely would use this.  Nice work.

BTW, what is the license status of libcurl?  According to Wikipedia it's MIT
licensed.  Where does that leave us with regard to the binary attribution issue?

Mar 11 2011
next sibling parent Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:
dsimcha wrote:

 I don't know much about this kind of stuff except that I use it for very
 simple use cases occasionally.  One thing I'll definitely give your design
 credit for, based on your examples, is making simple things simple.  I
 don't know how it scales to more complex use cases (not saying it doesn't,
 just that I'm not qualified to evaluate that), but I definitely would use
 this.  Nice work.

 BTW, what is the license status of libcurl?  According to Wikipedia it's
 MIT licensed.  Where does that leave us with regard to the binary
 attribution issue?

Walter contacted the author; it's not a problem: http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=112832
Mar 11 2011
prev sibling parent Jonas Drewsen <jdrewsen nospam.com> writes:
Thank you.

Regarding scalability: In my experience the fastest network handling for 
multiple concurrent requests is done asynchronously using select or epoll. 
The current wrapper would probably use threading and messages to handle 
multiple concurrent requests, which is not as efficient.

Usually you only need this kind of scalability for server-side 
networking and not for client-side networking like libcurl provides, so I 
do not see this as a major issue for an initial version.

I do know how to support epoll/select-based curl, and with it better 
scalability. Fortunately that would just be an extension to the API 
I've shown. For now I will focus on getting the common things finished 
and rock solid.
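
The "threading and messages" approach mentioned above could look roughly like the sketch below, which uses std.concurrency from Phobos. It is only an illustration: fetch, worker and fetchAll are made-up names, and fetch is a stand-in that does no network I/O where a real version would call something like Http.get(url).content.

```d
import std.concurrency : receiveOnly, send, spawn, thisTid, Tid;

// Hypothetical stand-in for a blocking transfer. It does no network I/O
// so that the sketch stays self-contained.
string fetch(string url)
{
    return "response from " ~ url;
}

// Worker thread: run one blocking transfer and message the result back
// to the thread that started it.
void worker(Tid owner, string url)
{
    send(owner, url, fetch(url));
}

// Start one thread per URL, then collect exactly one message per request.
string[string] fetchAll(string[] urls)
{
    foreach (url; urls)
        spawn(&worker, thisTid, url);

    string[string] results;
    foreach (_; urls)
    {
        auto msg = receiveOnly!(string, string)();
        results[msg[0]] = msg[1];
    }
    return results;
}
```

One thread per request is what makes this less efficient than a select/epoll loop: the cost grows with the number of concurrent transfers, which is usually acceptable on the client side.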

/Jonas


Mar 12 2011
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 11 Mar 2011 17:20:38 +0200, Jonas Drewsen <jdrewsen nospam.com>  
wrote:

 writeln( Http.get("http://www.google.com").content );

Does this return a string? What if the page's encoding isn't UTF-8?

Data should probably be returned as void[], similar to std.file.read.

--
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Mar 11 2011
parent Jonas Drewsen <jdrewsen nospam.com> writes:
On 11/03/11 17.33, Vladimir Panteleev wrote:
 On Fri, 11 Mar 2011 17:20:38 +0200, Jonas Drewsen <jdrewsen nospam.com>
 wrote:

 writeln( Http.get("http://www.google.com").content );

 Does this return a string? What if the page's encoding isn't UTF-8?
 Data should probably be returned as void[], similar to std.file.read.

Currently it returns a string, but should probably return void[] as you
suggest. Maybe the interface should be something like this to support misc.
encodings (like the std.file.readText does):

class Http {
    struct Result(S) {
        S content;
        ...
    }
    static Result!S get(S = void[])(in string url);
}

Actually I just took a look at Andrei's std.stream2 suggestion and
Http/Ftp... Transports would be pretty neat to have as well for reading
formatted data. I'll follow the newly spawned "Stream proposal" thread on
this one :)

/Jonas
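
To make the Result!S idea a little more concrete, here is a minimal sketch of how a content type chosen by the caller might work. Everything in it is an assumption for illustration: fetchRaw is a stand-in that returns fixed bytes instead of doing a transfer, the status value is a placeholder, and get is written as a free function rather than a static member of Http.

```d
import std.string : representation;
import std.utf : validate;

// Result type parameterised on how the body is exposed to the caller.
struct Result(S)
{
    S content;
    int status;
}

// Hypothetical stand-in for the actual transfer: always returns the
// same five bytes, whatever the URL.
immutable(ubyte)[] fetchRaw(string url)
{
    return "hello".representation;
}

// Raw bytes by default; text only when the caller asks for a string.
Result!S get(S = void[])(string url)
{
    auto raw = fetchRaw(url);
    Result!S r;
    r.status = 200; // placeholder value, not taken from a real response
    static if (is(S == string))
    {
        auto text = cast(string) raw;
        validate(text); // throws UTFException if the bytes are not UTF-8
        r.content = text;
    }
    else
    {
        r.content = raw.dup; // ubyte[] converts implicitly to void[]
    }
    return r;
}
```

With this shape, get(url) hands back untyped data in the spirit of std.file.read, while get!string(url) only succeeds when the body really is valid UTF-8.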
Mar 12 2011
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
Is there support for other HTTP methods/verbs in the D wrapper, like delete?

--
/Jacob Carlborg
Mar 11 2011
parent Jonas Drewsen <jdrewsen nospam.com> writes:
On 11/03/11 19.31, Jacob Carlborg wrote:
 Is there support for other HTTP methods/verbs in the D wrapper, like delete?

Yes, all methods in libcurl are supported.

/Jonas
Mar 12 2011
prev sibling next sibling parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
I'll make some comments on the API. Do we have to choose Http/Ftp...? The URI
already contains this. I could see being able to specifically request one or
the other for performance, or so that www.google.com works.

And what about properties? They tend to be very nice instead of set methods.
Examples below.

Jonas Drewsen Wrote:

 //
 // Simple HTTP GET with sane defaults
 // provides the .content, .headers and .status
 //
 writeln( Http.get("http://www.google.com").content );
 
 //
 // GET with custom data receiver delegates
 //
 Http http = new Http("http://www.google.dk");
 http.setReceiveHeaderCallback( (string key, string value) {
 	writeln(key ~ ":" ~ value);
 } );
 http.setReceiveCallback( (string data) { /* drop */ } );
 http.perform;

http.onHeader = (string key, string value) {...};
http.onContent = (string data) { ... };
http.perform();
Mar 11 2011
parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
On 11/03/11 22.21, Jesse Phillips wrote:
 I'll make some comments on the API. Do we have to choose Http/Ftp...? The URI
already contains this, I could see being able to specifically request one or
the other for performance or so www.google.com works.

That is a good question. The problem with creating a grand unified Curl
class that does it all is that each protocol supports different things,
i.e. HTTP supports cookie handling and HTTP redirection, FTP supports
passive/active mode and dir listings, and so on. I think it would confuse
the user of the API if, e.g., he were allowed to set cookies on his FTP
request.

The protocols supported (Http, Ftp, ... classes) do have a base class,
Protocol, that implements common things like timeouts.

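
As a rough illustration of that split, the sketch below keeps the shared options in a Protocol base class and the protocol-specific ones in the subclasses. The class names come from the post; the members (setCookie, cookie, setPassive, passive, and the getters) are invented here and are not the actual etc.curl API.

```d
// Options common to every protocol live in the base class.
class Protocol
{
    private string url_;
    private uint connectTimeoutMs_;

    this(string url) { url_ = url; }

    void setUrl(string url) { url_ = url; }
    void setConnectTimeout(uint ms) { connectTimeoutMs_ = ms; }

    string url() { return url_; }
    uint connectTimeout() { return connectTimeoutMs_; }
}

// HTTP-only features, such as cookies, exist only on Http.
class Http : Protocol
{
    private string[string] cookies_;

    this(string url) { super(url); }

    void setCookie(string name, string value) { cookies_[name] = value; }

    string cookie(string name)
    {
        auto p = name in cookies_;
        return p ? *p : null;
    }
}

// FTP-only features, such as passive mode, exist only on Ftp.
class Ftp : Protocol
{
    private bool passive_ = true;

    this(string url) { super(url); }

    void setPassive(bool on) { passive_ = on; }
    bool passive() { return passive_; }
}
```

With this layout, trying to set a cookie on an Ftp object is a compile-time error rather than a silently ignored option.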
 And what about properties? They tend to be very nice instead of set methods.
examples below.

Actually I thought of this and went the usual C++ way of _not_ using public
properties but accessor methods instead. Are public properties accepted as
"the D way", and if so, what about the usual reasons why you should use
accessor methods (like encapsulation and tolerance to future changes to the
API)?

I do like the shorter onHeader/onContent much better though :)

/Jonas
Mar 12 2011
next sibling parent Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:
Jonas Drewsen wrote:


 Actually I thought of this and went the usual C++ way of _not_ using
 public properties but accessor methods instead. Are public properties
 accepted as "the D way", and if so, what about the usual reasons why you
 should use accessor methods (like encapsulation and tolerance to future
 changes to the API)?

 I do like the shorter onHeader/onContent much better though :)

 /Jonas

Properties *are* accessor methods, with some sugar. In fact you have already
used them. Try it:

http.setReceiveHeaderCallback = (string key, string value) {
    writeln(key ~ ":" ~ value);
};

Marking a function with @property just signals its intended use, in which
case it's nicer to drop the get/set prefixes. Supposedly using parentheses
with such declarations will be outlawed in the future, but I don't think
that's the case currently.
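
A small sketch of what dropping the set prefix could look like with @property accessors. Request, HeaderCallback and the fixed header that perform passes to the callback are all hypothetical and only there to keep the example self-contained.

```d
class Request
{
    alias HeaderCallback = void delegate(string key, string value);

    private HeaderCallback onHeader_;

    // Setter and getter share one name; @property marks the intended use.
    @property void onHeader(HeaderCallback cb) { onHeader_ = cb; }
    @property HeaderCallback onHeader() { return onHeader_; }

    // Stand-in for a real transfer: feeds one fixed header to the callback.
    void perform()
    {
        if (onHeader_ !is null)
            onHeader_("Content-Type", "text/html");
    }
}
```

The field stays private, so the encapsulation argument for accessor methods still holds; the caller simply writes r.onHeader = (string k, string v) { ... }; instead of calling a setter by name.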
Mar 12 2011
prev sibling parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
Jonas Drewsen Wrote:

 On 11/03/11 22.21, Jesse Phillips wrote:
 I'll make some comments on the API. Do we have to choose Http/Ftp...? The URI
already contains this, I could see being able to specifically request one or
the other for performance or so www.google.com works.

 That is a good question. The problem with creating a grand unified Curl
 class that does it all is that each protocol supports different things,
 i.e. HTTP supports cookie handling and HTTP redirection, FTP supports
 passive/active mode and dir listings, and so on. I think it would confuse
 the user of the API if, e.g., he were allowed to set cookies on his FTP
 request.

 The protocols supported (Http, Ftp, ... classes) do have a base class,
 Protocol, that implements common things like timeouts.

Ah. I guess I was just thinking that if you want to download some file, you
don't really care where you are getting it from; you just have the URL and
are ready to go.

 And what about properties? They tend to be very nice instead of set methods.
 examples below.

 Actually I thought of this and went the usual C++ way of _not_ using
 public properties but accessor methods instead. Are public properties
 accepted as "the D way", and if so, what about the usual reasons why you
 should use accessor methods (like encapsulation and tolerance to future
 changes to the API)?

 I do like the shorter onHeader/onContent much better though :)

D was originally very friendly with properties. Your code can at this moment
be written:

http.setReceiveHeaderCallback = (string key, string value) {
    writeln(key ~ ":" ~ value);
};

But this is going to be deprecated in favour of the @property attribute. You
are probably aware of properties in C#, so yes, D is fine with public fields
and functions that look like public fields.

Otherwise this looks really good and I do hope to see it in Phobos.
Mar 12 2011
next sibling parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
On 12/03/11 20.44, Jesse Phillips wrote:
 Ah. I guess I was just thinking that if you want to download some file, you
 don't really care where you are getting it from; you just have the URL and
 are ready to go.

There should definitely be a simple method based only on an url. I'll put that in.
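
One way such a URL-only entry point could decide which protocol class to use is to look at the scheme, roughly as sketched below. Scheme and schemeOf are hypothetical names, and treating a scheme-less address such as www.google.com as HTTP is an assumption made for this example.

```d
import std.algorithm : findSplit;

enum Scheme { http, https, ftp, unknown }

// Pick a protocol from the part of the URL before "://".
Scheme schemeOf(string url)
{
    auto parts = url.findSplit("://");
    if (parts[1].length == 0)
        return Scheme.http; // no "://" found: assume plain HTTP

    switch (parts[0])
    {
        case "http":  return Scheme.http;
        case "https": return Scheme.https;
        case "ftp":   return Scheme.ftp;
        default:      break;
    }
    return Scheme.unknown;
}
```

A convenience download function could switch on the result and forward to Http.get or Ftp.get, while the protocol-specific classes remain available for anything more advanced.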
 D was originally very friendly with properties. Your code can at this
 moment be written:

 http.setReceiveHeaderCallback = (string key, string value) {
     writeln(key ~ ":" ~ value);
 };

 But this is going to be deprecated in favour of the @property attribute.
 You are probably aware of properties in C#, so yes, D is fine with public
 fields and functions that look like public fields.

Just tried the property stuff out but it seems a bit inconsistent. Maybe
someone can enlighten me:

import std.stdio;

alias void delegate() deleg;

class T {
    private deleg tvalue;
    @property void prop(deleg dg) {
        tvalue = dg;
    }
    @property deleg prop() {
        return tvalue;
    }
}

void main(string[] args) {
    T t = new T;
    t.prop = { writeln("fda"); };

    // Seems a bit odd that assigning to a temporary (tvalue) suddenly
    // changes the behaviour.
    auto tvalue = t.prop;
    tvalue();     // Works as expected by printing fda
    t.prop();     // Just returns the delegate!

    // Shouldn't the @property attribute ensure that no () is needed
    // when using the property
    t.prop()(); // Works
}

/Jonas
 Otherwise this looks really good and I do hope to see it in Phobos.

Mar 12 2011
next sibling parent Jonas Drewsen <jdrewsen nospam.com> writes:
On 13/03/11 00.28, Jonathan M Davis wrote:
 @property doesn't currently enforce much of anything. Things are in a
 transitory state with regard to @property. Originally, there was no such
 thing as @property, and any function which had no parameters and returned
 a value could be used as a getter and any function which returned nothing
 and took a single argument could be used as a setter. It was decided to
 make it more restrictive, so @property was added. Eventually, you will
 _only_ be able to use such functions as property functions if they are
 marked with @property, and you will _have_ to call them with the property
 syntax and will _not_ be able to call non-property functions with the
 property syntax.

 However, at the moment, the compiler doesn't enforce that. It will
 eventually, but there are several bugs with regard to property functions
 (they mostly work, but you found one of the cases where they don't), and
 it probably wouldn't be a good idea to enforce it until more of those bugs
 have been fixed.

 - Jonathan M Davis

Okay... nice to hear that this is coming up. Thanks again!

/Jonas
Mar 12 2011
prev sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Jonas Drewsen Wrote:

 Just tried the property stuff out but it seems a bit inconsistent. Maybe 
 someone can enlighten me:
 
 import std.stdio;
 
 alias void delegate() deleg;
 
 class T {
    private deleg tvalue;
    @property void prop(deleg dg) {
      tvalue = dg;
    }
    @property deleg prop() {
      return tvalue;
    }
 }
 
 void main(string[] args) {
    T t = new T;
    t.prop = { writeln("fda"); };
 
    // Seems a bit odd that assigning to a temporary (tvalue) suddenly
    // changes the behaviour.
    auto tvalue = t.prop;
    tvalue();     // Works as expected by printing fda
    t.prop();     // Just returns the delegate!
 
    // Shouldn't the @property attribute ensure that no () is needed
    // when using the property
    t.prop()(); // Works
 }
 
 /Jonas

Ah, yes. One of the big reasons for introducing @property was that returning
delegates could be very confusing in terms of whether the delegate is called
or returned from the function. Since the old system has not yet been ripped
out, @property basically does nothing except under some conditions where it
will complain that you have added a (). So the situation should improve, but
I really don't know how or when things will change.
Mar 13 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
@property doesn't currently enforce much of anything. Things are in a
transitory state with regard to @property. Originally, there was no such
thing as @property, and any function which had no parameters and returned a
value could be used as a getter and any function which returned nothing and
took a single argument could be used as a setter. It was decided to make it
more restrictive, so @property was added. Eventually, you will _only_ be
able to use such functions as property functions if they are marked with
@property, and you will _have_ to call them with the property syntax and
will _not_ be able to call non-property functions with the property syntax.

However, at the moment, the compiler doesn't enforce that. It will
eventually, but there are several bugs with regard to property functions
(they mostly work, but you found one of the cases where they don't), and it
probably wouldn't be a good idea to enforce it until more of those bugs have
been fixed.

- Jonathan M Davis
Mar 12 2011
prev sibling next sibling parent reply Ary Manzana <ary esperanto.org.ar> writes:
I *love* it. All APIs should be like yours. One-liners for what you want right now. If it's a little more complex, some more lines. This is perfect. Congratulations!
Mar 11 2011
parent Jonas Drewsen <jdrewsen nospam.com> writes:
On 12/03/11 05.30, Ary Manzana wrote:
I *love* it. All APIs should be like yours. One-liners for what you want right now. If it's a little more complex, some more lines. This is perfect. Congratulations!

Thank you! Words like these keep up the motivation. /Jonas
Mar 12 2011
prev sibling next sibling parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
Hi,

   So I've been working a bit on the etc.curl module. Currently most of 
the HTTP functionality is done and some very simple Ftp.

I would very much like to know if this has a chance of getting in phobos 
if I finish it with the current design. If not then it will be for my 
own project only and doesn't need as much documentation or all the features.

https://github.com/jcd/phobos/tree/curl

I do know that the error handling is currently not good enough... WIP.

/Jonas


Mar 13 2011
next sibling parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
On 14/03/11 12.10, Johannes Pfau wrote:
I really like the API. A few comments: You use the internal curl progress meter. According to the documentation (it's a little hidden, look at CURLOPT_NOPROGRESS) the progress meter is likely to be removed in future curl versions. The download progress should be easy to reimplement, although you'd have to parse the Content-Length header. Upload shouldn't be too difficult either (one problem: what does curl pass as ultotal/dltotal when chunked encoding is used or the total size is not known?). Then we could also use different delegates for upload/download.

I did see the notice about the future of NOPROGRESS's removal but decided to wrap it anyway. Maybe I should just remove it in an initial version. As you say it is pretty simple to implement ourselves.
 The callback interface suits curl best and I actually like it, but how
 will it interact with streams? As an example: If someone wrote a
 stream/filter that decoded gzip for files it should be usable with
 the http streams as well. But files/ filestreams have a pull
 interface (no callbacks, stream.read() in a loop). So how could a gzip
 stream be written without too much code duplication supporting files and
 the http stuff?

If we take Andrei's stream proposal as the base of a new streaming design then the http would just be another Transport. Files have a pull interface that blocks until data is read. The same could be done for the http class. What I would really like is for the stream design to support non-blocking as mentioned in the stream proposal. Just have to figure out how the streaming API should behave in such cases I guess.
 Do you plan to add some kind of support for header parsing? I think
 something like what the .net webclient uses
 ( http://msdn.microsoft.com/en-us/library/system.net.webclient(v=VS.100).aspx )
 would be great. Especially the HeaderCollection supporting headers as
 strings and as data types (for both parsing and formatting), but
 without a class hierarchy for the headers, using templates instead.

It would be nice to be able to get/set headers by string and enums (http://msdn.microsoft.com/en-us/library/system.net.httprequestheader.aspx). But I cannot see that .net is using datatypes or templates for it. Could you give me a pointer please?
 I've written D parsers/formatters for almost all headers in
 rfc2616 (1 or 2 might be missing) and for a few additional commonly
 used headers (Content-Disposition, cookie headers). The parsers are
 written with ragel and are to be used with curl (continuations must be
 removed and the parsers always take 1 line of input, just as you get it
 from curl). Right now only the client side is implemented (no parsers
 for headers which can only be sent from client-->server ). However, I
 need to add some more documentation to the parsers, need to do
 some refactoring and I've got absolutely no time for that in the next 2
 weeks ('abitur' final exams). But if you could wait 2 weeks or if
 you wanted to do the refactoring yourself, I would be happy to
 contribute that code.

That sounds very interesting. I would very much like to see the code and see if it fits in.
Mar 14 2011
parent Jonas Drewsen <jdrewsen nospam.com> writes:
On 14/03/11 16.40, Johannes Pfau wrote:
 Jonas Drewsen wrote:
 Do you plan to add some kind of support for header parsing? I think
 something like what the .net webclient uses
 ( http://msdn.microsoft.com/en-us/library/system.net.webclient(v=VS.100).aspx )
 would be great. Especially the HeaderCollection supporting headers as
 strings and as data types (for both parsing and formatting), but
 without a class hierarchy for the headers, using templates instead.

It would be nice to be able to get/set headers by string and enums (http://msdn.microsoft.com/en-us/library/system.net.httprequestheader.aspx). But I cannot see that .net is using datatypes or templates for it. Could you give me a pointer please?

You're right, I didn't look closely enough at the .net documentation. I thought HttpRequestHeader is a class. What I meant for D was something like this:

struct ETagHeader
{
    // Data members
    bool Weak = false;
    string Value;

    // All header structs provide these
    static string Key = "ETag";

    static ETagHeader parse(string value)
    {
        // parser logic here
    }

    void format(T)(T writer) if (isOutputRange!(T, string))
    {
        if (Weak)
            writer.put("W/");
        assert(Value != "");
        writer.put(quote(Value));
    }
}

Then we can offer methods like these:

void setHeader(T)(T header) if (isHeader!T)
{
    headers[T.Key] = formatHeader(header);
}

T getHeader(T)() if (isHeader!T)
{
    if (!(T.Key in headers))
        throw new Exception("header not found");
    return T.parse(headers[T.Key]);
}

So user code wouldn't have to deal with header parsing / formatting:

auto etag = client.getHeader!ETagHeader();
assert(etag.Weak);
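A minimal compilable sketch of the typed-header idea might look like the following. It keeps the ETagHeader name from the post, but the parsing is deliberately simplified, and HeaderCollection, isHeader and the lower-case member names are assumptions made for this illustration rather than anything from the linked code.

```d
import std.algorithm.searching : startsWith;
import std.array : appender;
import std.exception : enforce;
import std.range.primitives : isOutputRange, put;
import std.string : strip;

// Hypothetical typed header; real ETag parsing has more cases than this.
struct ETagHeader
{
    bool weak = false;
    string value;

    enum key = "ETag";

    // Parse the value part of an `ETag:` line, e.g. `W/"abc"`.
    static ETagHeader parse(string raw)
    {
        ETagHeader h;
        raw = raw.strip();
        if (raw.startsWith("W/"))
        {
            h.weak = true;
            raw = raw[2 .. $];
        }
        enforce(raw.length >= 2 && raw[0] == '"' && raw[$ - 1] == '"',
                "malformed ETag value");
        h.value = raw[1 .. $ - 1];
        return h;
    }

    // Write the header value to any output range accepting strings.
    void format(W)(ref W writer) const
        if (isOutputRange!(W, string))
    {
        if (weak) put(writer, "W/");
        put(writer, "\"");
        put(writer, value);
        put(writer, "\"");
    }
}

// A type counts as a header here if it has a string `key` and a static `parse`.
enum isHeader(T) = is(typeof(T.key) : string)
                && is(typeof(T.parse("")) == T);

// Hypothetical container standing in for the client's header storage.
struct HeaderCollection
{
    private string[string] headers;

    void setHeader(T)(T header) if (isHeader!T)
    {
        auto buf = appender!string();
        header.format(buf);
        headers[T.key] = buf.data;
    }

    T getHeader(T)() if (isHeader!T)
    {
        auto p = T.key in headers;
        enforce(p !is null, "header not present: " ~ T.key);
        return T.parse(*p);
    }
}

void main()
{
    HeaderCollection client;
    client.setHeader(ETagHeader(true, "xyzzy"));
    auto etag = client.getHeader!ETagHeader();
    assert(etag.weak && etag.value == "xyzzy");
}
```

Using templates plus a constraint instead of a class hierarchy means a new header type only has to provide key, parse and format to be usable with the collection.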

Seems like a very nice addition. I will have a look at your github and probably wait until you have made it ready for consumption before adding it :)
 I've written D parsers/formatters for almost all headers in
 rfc2616 (1 or 2 might be missing) and for a few additional commonly
 used headers (Content-Disposition, cookie headers). The parsers are
 written with ragel and are to be used with curl (continuations must
 be removed and the parsers always take 1 line of input, just as you
 get it from curl). Right now only the client side is implemented (no
 parsers for headers which can only be sent from client-->server ).
 However, I need to add some more documentation to the parsers, need
 to do some refactoring and I've got absolutely no time for that in
 the next 2 weeks ('abitur' final exams). But if you could wait 2
 weeks or if you wanted to do the refactoring yourself, I would be
 happy to contribute that code.

That sounds very interesting. I would very much like to see the code and see if it fits in.

Ok, here it is, but it seriously needs to be refactored and documented: https://gist.github.com/869324

Mar 14 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
I thought that the "etc" package was for C bindings and would expect the "curl" module to be placed in std.curl or std.net.curl. -- /Jacob Carlborg
Mar 14 2011
prev sibling next sibling parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
On 25/03/11 10.54, Johannes Pfau wrote:
I looked at the code again and I got 2 more suggestions: 1.) Would it be useful to have a headersReceived callback which would be called when all headers have been received (when the data callback is called the first time)? I think of a situation where you don't know what data the server will return: a few KB html which you can easily keep in memory or a huge file which you'd have to save to disk. You can only know that if the headers have been received. It would also be possible to do that by just overwriting the headerCallback and looking out for the ContentLength/ContentType header, but I think it should also work with the default headerCallback.

I'm a little confused as to what a headersReceived(string[string] headers) would give you compared to the onReceiveHeader(const(char)[], const(char)[]) callback that exists today in the example. The headersReceived callback would probably look up the content-length header and set a flag about whether to save content to file or memory. The existing onReceiveHeader could do the same by setting the flag when it receives the content-length field. Or maybe I'm misunderstanding you?
 2.)
 As far as I can see you store the http headers in a case sensitive way.
 (res.headers[key] ~= value;). This means "Content-Length" vs
 "content-length" would produce two entries in the array and it makes
 it difficult to get the header from the associative array. It is maybe
 useful to keep the original casing, but probably not in the array key.

 BTW: According to RFC2616 the only headers which are allowed
 to be included multiple times in the response must consist of comma
 separated lists. So in theory we could keep a simple string[string]
 list and if we see a header twice we can just merge it with a ','.

 http://tools.ietf.org/html/rfc2616#section-4.2
 Relevant part from the RFC:
 ----------------------
     Multiple message-header fields with the same field-name MAY be
     present in a message if and only if the entire field-value for that
     header field is defined as a comma-separated list [i.e., #(values)].
     It MUST be possible to combine the multiple header fields into one
     "field-name: field-value" pair, without changing the semantics of the
     message, by appending each subsequent field-value to the first, each
     separated by a comma. The order in which header fields with the same
     field-name are received is therefore significant to the
     interpretation of the combined field value, and thus a proxy MUST NOT
     change the order of these field values when a message is forwarded.
 ----------------------

I will surely implement this combined value functionality. I also noted that header field names are case insensitive. This means that they could just be stored internally as lower cased and the documentation could specify lowercase for looking up by field name.
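As a rough sketch of what lower-cased keys plus comma-merging could look like: HeaderMap, add and get are names invented for this example and do not come from the etc.curl code.

```d
import std.string : strip;
import std.uni : toLower;

// Hypothetical helper, not part of the module discussed in the thread.
struct HeaderMap
{
    private string[string] fields;

    // Store a header. Repeated field names are joined with a comma in
    // arrival order, as RFC 2616 section 4.2 allows.
    void add(const(char)[] name, const(char)[] value)
    {
        auto key = name.strip.toLower.idup;
        auto val = value.strip.idup;
        if (auto existing = key in fields)
            *existing ~= "," ~ val;
        else
            fields[key] = val;
    }

    // Look up by field name in any casing; returns null if absent.
    string get(const(char)[] name)
    {
        auto key = name.strip.toLower.idup;
        auto p = key in fields;
        return p ? *p : null;
    }
}

void main()
{
    HeaderMap h;
    h.add("Content-Length", "42");
    h.add("Accept", "text/html");
    h.add("accept", "text/plain");

    assert(h.get("content-length") == "42");
    assert(h.get("ACCEPT") == "text/html,text/plain");
    assert(h.get("etag") is null);
}
```

Normalising inside both add and get means callers never have to remember the lower-case convention; documenting it would only matter if the raw associative array were exposed.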
 I'm also done with the first pass through the http parsers.
 Documentation is here:
 http://dl.dropbox.com/u/24218791/std.protocol.http/http/http.html

 Code here:
 https://gist.github.com/886612
 The http.d file is generated from the http.d.rl file.

This is a nice protocol parser. I would very much like it to be used with the curl API but without it being a dependency. This is already possible now using the onReceiveHeader callback and this would decouple the two. At least until std.protocol.http is in phobos as well - at that point convenience methods could be added :) /Jonas
Mar 27 2011
parent Jonas Drewsen <jdrewsen nospam.com> writes:
On 29/03/11 17.31, Johannes Pfau wrote:
 Jonas Drewsen wrote:
 This is a nice protocol parser. I would very much like it to be used
 with the curl API but without it being a dependency. This is already
 possible now using the onReceiveHeader callback and this would
 decouple the two. At least until std.protocol.http is in phobos as
 well - at that point convenience methods could be added :)

 /Jonas

Thanks, I think I'll propose the parser for the new experimental namespace when it's available.

I'm looking forward to that.
 About the headersReceived callback: You're totally right, it can be
 done with the onReceiveHeader callback right now. But I think in the
 common case the user wants the headers in an key/value array. So if the
 user doesn't want to use the onReceiveHeader api, a headersReceived
 callback would probably be convenient. But, as said it's not necessary.

I'll put it on my todo and reconsider when I get to it :)
 Reading the curl documentation showed another small trap:
 CURLOPT_HEADERFUNCTION
 ------------------------------------------------------------
 It's important to note that the callback will be invoked for the
 headers of all responses received after initiating a request and not
 just the final response. This includes all responses which occur during
 authentication negotiation. If you need to operate on only the headers
 from the final response, you will need to collect headers in the
 callback yourself and use HTTP status lines, for example, to delimit
 response boundaries.
 ------------------------------------------------------------

 I think if we store the headers into an array, we should only store the
 headers of the final response. Another question is should all headers
 or only final headers trigger the onReceiveHeader callback? Passing
 only the final headers would require extra work, passing all headers
 should at least be documented.

Yeah... I've discovered this myself as well. The current implementation does as libcurl does and passes all headers, not just those of the final subrequest.
 Thinking of this more, this also means the _receiveHeaderCallback is
 not 100% correct, as it expects all lines after the first line to be
 header or empty lines, but it's possible that we get multiple statuslines.
 It still works, the regex doesn't match anything and the code
 ignores that line. But this way, the stored statusline will always be
 the first statusline, which isn't optimal. We'd also need to detect if a
 line is a statusline to reset the headers array if it's used. Seems
 like we have to think about this some more.

My local version already takes care of this. It was the wrong place for parsing status lines and headers anyway. It is now moved to the Http class where it should have been all the time. I have implemented almost all of the features/changes suggested now. The last one I'm currently fighting is the support for "foreach" and async .byLine/.byChunk. I may have to make some changes in the current design to support this with the calling API that I would like to expose. I wonder who could take the step and open a std.experimental package for submissions? Thank you for the feedback!
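One way to keep only the final response's headers, in the spirit of what is discussed above, is to reset the collection whenever a new status line arrives. This is only a sketch: FinalHeaders and onHeaderLine are hypothetical names standing in for whatever receives the raw lines from CURLOPT_HEADERFUNCTION, and it is not the implementation in the Http class.

```d
import std.algorithm.searching : startsWith;
import std.string : indexOf, strip;

// Hypothetical collector fed with raw header lines, one per call.
struct FinalHeaders
{
    string statusLine;
    string[string] headers;

    void onHeaderLine(const(char)[] line)
    {
        line = line.strip;
        if (line.length == 0)
            return; // blank line ends one response's header block

        if (line.startsWith("HTTP/"))
        {
            // New status line: drop headers from any earlier response.
            statusLine = line.idup;
            headers = null;
            return;
        }

        auto colon = line.indexOf(':');
        if (colon < 0)
            return; // not a header line; ignore it
        headers[line[0 .. colon].strip.idup] = line[colon + 1 .. $].strip.idup;
    }
}

void main()
{
    FinalHeaders f;
    foreach (l; ["HTTP/1.1 401 Unauthorized\r\n",
                 "WWW-Authenticate: Basic\r\n",
                 "\r\n",
                 "HTTP/1.1 200 OK\r\n",
                 "Content-Type: text/html\r\n",
                 "\r\n"])
        f.onHeaderLine(l);

    assert(f.statusLine == "HTTP/1.1 200 OK");
    assert(f.headers.length == 1);
    assert(f.headers["Content-Type"] == "text/html");
}
```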
Mar 30 2011
prev sibling parent Jonas Drewsen <jdrewsen nospam.com> writes:
On 25/03/11 12.07, Johannes Pfau wrote:
 Johannes Pfau wrote:
I looked at the code again and I got 2 more suggestions:

1.) Would it be useful to have a headersReceived callback which would be called when all headers have been received (when the data callback is called the first time)? I think of a situation where you don't know what data the server will return: a few KB html which you can easily keep in memory or a huge file which you'd have to save to disk. You can only know that if the headers have been received. It would also be possible to do that by just overwriting the headerCallback and looking out for the ContentLength/ContentType header, but I think it should also work with the default headerCallback.

2.) As far as I can see you store the http headers in a case sensitive way. (res.headers[key] ~= value;). This means "Content-Length" vs "content-length" would produce two entries in the array and it makes it difficult to get the header from the associative array. It is maybe useful to keep the original casing, but probably not in the array key.

BTW: According to RFC2616 the only headers which are allowed to be included multiple times in the response must consist of comma separated lists. So in theory we could keep a simple string[string] list and if we see a header twice we can just merge it with a ','.

http://tools.ietf.org/html/rfc2616#section-4.2
Relevant part from the RFC:
----------------------
    Multiple message-header fields with the same field-name MAY be
    present in a message if and only if the entire field-value for that
    header field is defined as a comma-separated list [i.e., #(values)].
    It MUST be possible to combine the multiple header fields into one
    "field-name: field-value" pair, without changing the semantics of the
    message, by appending each subsequent field-value to the first, each
    separated by a comma. The order in which header fields with the same
    field-name are received is therefore significant to the
    interpretation of the combined field value, and thus a proxy MUST NOT
    change the order of these field values when a message is forwarded.
----------------------

I'm also done with the first pass through the http parsers.
Documentation is here:
http://dl.dropbox.com/u/24218791/std.protocol.http/http/http.html

Code here:
https://gist.github.com/886612
The http.d file is generated from the http.d.rl file.

I added some code to show how I think this could be used in the HTTP client: https://gist.github.com/886612#file_gistfile1.d Like in the .net webclient we'd need two of these collections: one for received headers and one for headers to be sent.

Thanks! It would be very nice to have in the std.protocol.http in phobos so that the curl stuff could use it. If that happened then std.protocol.{smtp,imap,....} could probably also be built on your framework and be added as support in the curl wrappers. /Jonas
Mar 27 2011
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/11/11 9:20 AM, Jonas Drewsen wrote:
 Hi,

 So I've spent some time trying to wrap libcurl for D. There is a lot of
 things that you can do with libcurl which I did not know so I'm starting
 out small.

 For now I've created all the declarations for the latest public curl C
 api. I have put that in the etc.c.curl module.

Great! Could you please create a pull request for that?
 On top of that I've created a more D like api as seen below. This is
 located in the 'etc.curl' module. What you can see below currently works
 but before proceeding further down this road I would like to get your
 comments on it.

 //
 // Simple HTTP GET with sane defaults
 // provides the .content, .headers and .status
 //
 writeln( Http.get("http://www.google.com").content );

Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".
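The raw-bytes-plus-text-view idea could be sketched roughly like this. Result and its text property are placeholder names, and the sketch simply assumes the body is UTF-8 rather than looking at any charset in the response headers.

```d
import std.utf : validate;

// Hypothetical result type; names are illustrative only.
struct Result
{
    immutable(ubyte)[] content; // raw body, fine for any payload

    // View the body as UTF-8 text; throws UTFException on invalid input.
    @property string text() const
    {
        auto s = cast(string) content;
        validate(s);
        return s;
    }
}

void main()
{
    auto r = Result(cast(immutable(ubyte)[]) "hello");
    assert(r.content.length == 5);
    assert(r.text == "hello");
}
```

Keeping content as bytes and validating only on request means binary downloads never pay for, or trip over, text decoding.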
 //
 // GET with custom data receiver delegates
 //
 Http http = new Http("http://www.google.dk");

You'll probably need to justify the existence of a class hierarchy and what overridable methods there are. In particular, since you seem to offer hooks via delegates, probably classes wouldn't be needed at all. (FWIW I would've done the same; I wouldn't want to inherit just to intercept the headers etc.)
 http.setReceiveHeaderCallback( (string key, string value) {
 writeln(key ~ ":" ~ value);
 } );
 http.setReceiveCallback( (string data) { /* drop */ } );
 http.perform;

As discussed, properties may be better here than setXxx and getXxx. The setReceiveCallback hook should take a ubyte[]. The setReceiveHeaderCallback should take a const(char)[]. That way you won't need to copy all headers, leaving safely that option to the client.
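To illustrate the const(char)[] point: the library can hand out slices of a buffer it reuses, and the client copies only what it wants to keep. In this sketch deliverHeaders is a made-up stand-in for the library side and is not a real function of the wrapper.

```d
import std.string : indexOf;

// Hypothetical stand-in for the library: one buffer is reused for every line.
void deliverHeaders(void delegate(const(char)[] key, const(char)[] value) cb)
{
    char[64] buf;
    foreach (line; ["Server: demo", "Content-Length: 512"])
    {
        buf[0 .. line.length] = line[];
        auto colon = line.indexOf(':');
        // Slices into buf: only valid until the next line overwrites it.
        cb(buf[0 .. colon], buf[colon + 2 .. line.length]);
    }
}

void main()
{
    string[] kept;
    deliverHeaders((const(char)[] key, const(char)[] value) {
        // Copy (idup) only the values that must outlive the callback.
        if (key == "Content-Length")
            kept ~= value.idup;
    });
    assert(kept == ["512"]);
}
```

With a string parameter the library would have to allocate an immutable copy of every header, whether or not the client cares about it.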
 //
 // POST with some timouts
 //
 http.setUrl("http://www.testing.com/test.cgi");
 http.setReceiveCallback( (string data) { writeln(data); } );
 http.setConnectTimeout(1000);
 http.setDataTimeout(1000);
 http.setDnsTimeout(1000);
 http.setPostData("The quick....");
 http.perform;

setPostData -> setTextPostData, and then changing everything to properties would make it something like textPostData. Or wait, there could be some overloading going on... Anyway, the basic idea is that generally get and post data could be raw bytes, and the user could elect to transfer strings instead.
 //
 // PUT with data sender delegate
 //
 string msg = "Hello world";
 size_t len = msg.length; /* using chunked transfer if omitted */

 http.setSendCallback( delegate size_t(char[] data) {
 if (msg.empty) return 0;
 auto l = msg.length;
 data[0..l] = msg[0..$];
 msg.length = 0;
 return l;
 },
 HttpMethod.put, len );
 http.perform;

The callback would take ubyte[].
 //
 // HTTPS
 //
 writeln(Http.get("https://mail.google.com").content);

 //
 // FTP
 //
 writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
 "./downloaded-file"));


 // ... authentication, cookies, interface select, progress callback
 // etc. is also implemented this way.


 /Jonas

This is all very encouraging. I think this API covers nicely a variety of needs. We need to make sure everything interacts well with threads, in particular that one can shut down a transfer (or the entire library) from a thread or callback and have the existing transfer(s) throw an exception immediately.

Regarding a range interface, it would be great if you allowed e.g.

 foreach (line; Http.get("https://mail.google.com").byLine()) {
     ...
 }

The data transfer should happen concurrently with the foreach code. The type of line is char[] or const(char)[]. Similarly, there would be a byChunk interface that transfers in ubyte[] chunks.

Also we need a head() method for the corresponding command.

Andrei
Mar 13 2011
next sibling parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
On 13/03/11 23.44, Andrei Alexandrescu wrote:
 On 3/11/11 9:20 AM, Jonas Drewsen wrote:
 Hi,

 So I've spent some time trying to wrap libcurl for D. There is a lot of
 things that you can do with libcurl which I did not know so I'm starting
 out small.

 For now I've created all the declarations for the latest public curl C
 api. I have put that in the etc.c.curl module.

Great! Could you please create a pull request for that?

Will do as soon as I've figured out how to create a pull request for a single file in a branch. Does anyone know how to do that on GitHub? Or should I just create a pull request including the etc.curl wrapper as well?
 On top of that I've created a more D like api as seen below. This is
 located in the 'etc.curl' module. What you can see below currently works
 but before proceeding further down this road I would like to get your
 comments on it.

 //
 // Simple HTTP GET with sane defaults
 // provides the .content, .headers and .status
 //
 writeln( Http.get("http://www.google.com").content );

Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".

I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited? I'll add a text property as well.
 //
 // GET with custom data receiver delegates
 //
 Http http = new Http("http://www.google.dk");

You'll probably need to justify the existence of a class hierarchy and what overridable methods there are. In particular, since you seem to offer hooks via delegates, probably classes wouldn't be needed at all. (FWIW I would've done the same; I wouldn't want to inherit just to intercept the headers etc.)
 http.setReceiveHeaderCallback( (string key, string value) {
 writeln(key ~ ":" ~ value);
 } );
 http.setReceiveCallback( (string data) { /* drop */ } );
 http.perform;

As discussed, properties may be better here than setXxx and getXxx. The setReceiveCallback hook should take a ubyte[]. The setReceiveHeaderCallback should take a const(char)[]. That way you won't need to copy all headers, leaving safely that option to the client.

I've already replaced the set/get methods with properties and renamed them. Hadn't thought of using const(char)[].. thanks for the hint.
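
For readers following along, here is a minimal sketch of what the property-based form could look like. HttpSketch, onReceiveHeader and simulateHeader are hypothetical names invented for illustration; they are not the actual etc.curl API, and no libcurl call is made.

```d
// Hypothetical stand-in for the wrapper; names are illustrative only.
struct HttpSketch
{
    private string _url;
    private void delegate(const(char)[] key, const(char)[] value) _onHeader;

    // Property pair replacing setUrl/getUrl.
    @property void url(string u) { _url = u; }
    @property string url() const { return _url; }

    // Write-only property replacing setReceiveHeaderCallback.
    @property void onReceiveHeader(void delegate(const(char)[], const(char)[]) cb)
    {
        _onHeader = cb;
    }

    // Simulates one header arriving; a real wrapper would do this
    // from inside the libcurl header callback.
    void simulateHeader(const(char)[] key, const(char)[] value)
    {
        if (_onHeader !is null)
            _onHeader(key, value);
    }
}

void main()
{
    HttpSketch http;
    http.url = "http://www.google.dk";

    string seen;
    http.onReceiveHeader = (const(char)[] key, const(char)[] value) {
        seen = (key ~ ":" ~ value).idup; // copy only because we keep it
    };
    http.simulateHeader("Content-Type", "text/html");

    assert(http.url == "http://www.google.dk");
    assert(seen == "Content-Type:text/html");
}
```

Because the callback receives const(char)[], the wrapper is free to hand out a slice of its own buffer, and the client copies only when it wants to keep the data.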
 //
 // POST with some timeouts
 //
 http.setUrl("http://www.testing.com/test.cgi");
 http.setReceiveCallback( (string data) { writeln(data); } );
 http.setConnectTimeout(1000);
 http.setDataTimeout(1000);
 http.setDnsTimeout(1000);
 http.setPostData("The quick....");
 http.perform;

setPostData -> setTextPostData, and then changing everything to properties would make it something like textPostData. Or wait, there could be some overloading going on... Anyway, the basic idea is that generally get and post data could be raw bytes, and the user could elect to transfer strings instead.

I'll make sure both text and byte[]/void[] versions will be available.
 //
 // PUT with data sender delegate
 //
 string msg = "Hello world";
 size_t len = msg.length; /* using chunked transfer if omitted */

 http.setSendCallback( delegate size_t(char[] data) {
 if (msg.empty) return 0;
 auto l = msg.length;
 data[0..l] = msg[0..$];
 msg.length = 0;
 return l;
 },
 HttpMethod.put, len );
 http.perform;

The callback would take ubyte[].

Already fixed.
 //
 // HTTPS
 //
 writeln(Http.get("https://mail.google.com").content);

 //
 // FTP
 //
 writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
 "./downloaded-file"));


 // ... authentication, cookies, interface select, progress callback
 // etc. is also implemented this way.


 /Jonas

This is all very encouraging. I think this API covers nicely a variety of needs. We need to make sure everything interacts well with threads, in particular that one can shut down a transfer (or the entire library) from a thread or callback and have the existing transfer(s) throw an exception immediately.

I'll have a look at it.
 Regarding a range interface, it would be great if you allowed e.g.

 foreach (line; Http.get("https://mail.google.com").byLine()) {
 ...
 }

 The data transfer should happen concurrently with the foreach code. The
 type of line is char[] or const(char)[]. Similarly, there would be a
 byChunk interface that transfers in ubyte[] chunks.

 Also we need a head() method for the corresponding command.

 Andrei

That would be neat. What do you mean about concurrent data transfers with foreach?

/Jonas
Mar 14 2011
next sibling parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
On 14/03/11 13.28, Steven Schveighoffer wrote:
 On Mon, 14 Mar 2011 07:20:26 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:

 On Mon, 14 Mar 2011 02:36:07 -0700, Jonathan M Davis wrote:

 On Monday 14 March 2011 02:16:12 Jonas Drewsen wrote:
 On 13/03/11 23.44, Andrei Alexandrescu wrote:
 On 3/11/11 9:20 AM, Jonas Drewsen wrote:
 Hi,

 So I've spent some time trying to wrap libcurl for D. There is a lot
 of things that you can do with libcurl which I did not know so I'm
 starting out small.

 For now I've created all the declarations for the latest public curl
 C api. I have put that in the etc.c.curl module.

Great! Could you please create a pull request for that?

Will do as soon as I've figured out howto create a pull request for a single file in a branch. Anyone knows how to do that on github? Or should I just create a pull request including the etc.curl wrapper as well?

You can't. A pull request is for an entire branch. It pulls _everything_ from that branch which differs from the one being merged with. git cares about commits, not files. And pulling from another repository pulls all of the commits which you don't have. So, if you want to do a pull request, you create a branch with exactly the commits that you wanted merged in on it. No more, no less.
 On top of that I've created a more D like api as seen below. This is
 located in the 'etc.curl' module. What you can see below currently
 works but before proceeding further down this road I would like to
 get your comments on it.

 //
 // Simple HTTP GET with sane defaults
 // provides the .content, .headers and .status
 //
 writeln( Http.get("http://www.google.com").content );

Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".

I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?

That's debatable. Some would argue one way, some another. Personally, I'd argue ubyte[]. I don't like void[] one bit. Others would agree with me, and yet others would disagree. I don't think that there's really a general agreement on whether void[] or ubyte[] is better when it comes to reading binary data like that.

I also think ubyte[] is best, because:

1. It can be used directly. (You can't get an element from a void[] array without casting it to something else first.)

2. There are no assumptions about the type of data contained in the array. (char[] arrays are assumed to be UTF-8 encoded.)

3. ubyte[] arrays are (AFAIK) not scanned by the GC. (void[] arrays may contain pointers and must therefore be scanned.)

This isn't exactly true. arrays *created* as void[] will be scanned. Arrays created as ubyte[] and then cast to void[] will not be scanned. However, it is far too easy while dealing with a void[] array to have it mysteriously flip its bit to scan-able.
 I think the rule of thumb should be: If the array contains raw data of
 unspecified type, but no pointers or references, use ubyte[].

 void[] is very useful for input parameters, however, since all arrays are
 implicitly castable to void[]:

 void writeData(void[] data) { ... }

 writeData("Hello World!");
 writeData([1, 2, 3, 4]);

I think (and this differs from my previous opinion) const(void)[] should be used for input parameters where any array type could be passed in. However, ubyte[] should be used for output parameters and for internal storage. void[] just has too many pitfalls to be used anywhere but where its implicit casting ability is useful. -Steve

const(ubyte)[] for input
void[] for output

That sounds reasonable. I guess that if everybody can agree on this then all of Phobos (e.g. std.file) should use the same types?

/Jonas
Mar 14 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/14/11 10:06 AM, Jonas Drewsen wrote:
 const(ubyte)[] for input
 void[] for output

 that sounds reasonable. I guess that if everybody can agree on this then
 the all of phobos (e.g. std.file) should use the same types?

Move the const from the first to the second line :o). I see no reason why user code can't mess with the buffer once read. Yes, I agree std.file et al should switch to ubyte[]. Andrei
Mar 14 2011
parent Jonas Drewsen <jdrewsen nospam.com> writes:
On 14/03/11 18.46, Andrei Alexandrescu wrote:
 On 3/14/11 10:06 AM, Jonas Drewsen wrote:
 const(ubyte)[] for input
 void[] for output

 that sounds reasonable. I guess that if everybody can agree on this then
 the all of phobos (e.g. std.file) should use the same types?

Move the const from the first to the second line :o). I see no reason why user code can't mess with the buffer once read.

You are right of course. bummer.
 Yes, I agree std.file et al should switch to ubyte[].

 Andrei

Then let's hope someone makes a patch for it. Maybe I'll make it when I'm done with the curl stuff if no one beats me to it.

/Jonas
Mar 14 2011
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/14/11 4:36 AM, Jonathan M Davis wrote:
 That's debatable. Some would argue one way, some another. Personally, I'd argue
 ubyte[]. I don't like void[] one bit. Others would agree with me, and yet others
 would disagree. I don't think that there's really a general agreement on whether
 void[] or ubyte[] is better when it comes to reading binary data like that.

void[]: "There is a typed array underneath, but I forgot its exact type". Evidence: all array types convert to void[] automatically.

ubyte[]: "We're dealing with an array of octets here." Evidence: ubyte[] has no special properties over T[].

All raw data reads should yield ubyte[], not void[]. This is because the user may or may not know that underneath really there's a different type, but the compiler and runtime have no such idea. So the burden of the assumption is on the user.

Raw data writes that take arrays could be allowed to accept void[] if implicit conversion from T[] is desirable.

Andrei
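
The read/write split described above can be sketched as follows. writeData and readData are hypothetical helpers written for this illustration (they do no real I/O); the const(void)[] parameter follows the suggestion made elsewhere in this thread for input parameters.

```d
// Hypothetical sink: const(void)[] lets callers pass any array type unchanged.
size_t writeData(const(void)[] data)
{
    return data.length; // the length of a void[] is counted in bytes
}

// Hypothetical source: raw reads hand back octets as ubyte[].
ubyte[] readData()
{
    ubyte[] buf = [0x48, 0x69]; // the bytes of "Hi"
    return buf;
}

void main()
{
    // Any T[] converts implicitly on the way in.
    assert(writeData("Hello World!") == 12);
    assert(writeData([1, 2, 3, 4]) == 4 * int.sizeof);

    // On the way out, elements are usable without a cast ...
    ubyte[] raw = readData();
    assert(raw[0] == 0x48);

    // ... and the caller opts in explicitly to a text view.
    auto text = cast(const(char)[]) raw;
    assert(text == "Hi");
}
```

The cast on the read side is the point where the user takes on "the burden of the assumption" about what the octets really are.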
Mar 14 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/14/11 4:16 AM, Jonas Drewsen wrote:
 On 13/03/11 23.44, Andrei Alexandrescu wrote:
 Sweet. As has been discussed, often the content is not text so you may
 want to have content return ubyte[] and add a new property such as
 "textContent" or "text".

I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?

Yah, as per the ensuing discussion.
 As discussed, properties may be better here than setXxx and getXxx. The
 setReceiveCallback hook should take a ubyte[]. The
 setReceiveHeaderCallback should take a const(char)[]. That way you won't
 need to copy all headers, leaving safely that option to the client.

I've already replaced the set/get methods with properties and renamed them. Hadn't thought of using const(char)[].. thanks for the hint.

A good general guideline: make sure that the user could easily and safely use a loop that reads a large http stream (with hooks and all) without allocating one item each pass through the loop.
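
A small sketch of that guideline, assuming a hypothetical deliverLines function that stands in for the wrapper's receive loop (no network involved): every line goes through one reused buffer, and only the client decides when a copy is worth an allocation.

```d
// Feeds each line to the callback through one reused buffer, so the loop
// itself allocates nothing. The slice is only valid during the call.
void deliverLines(string[] lines, scope void delegate(const(char)[] line) onLine)
{
    char[128] buf; // reused on every pass; lines are assumed to fit
    foreach (l; lines)
    {
        buf[0 .. l.length] = l[];
        onLine(buf[0 .. l.length]);
    }
}

void main()
{
    string[] kept;
    deliverLines(["Server: gws", "Content-Type: text/html"],
        (const(char)[] line) {
            // The client copies only the lines it wants to keep.
            if (line.length >= 7 && line[0 .. 7] == "Server:")
                kept ~= line.idup;
        });
    assert(kept == ["Server: gws"]);
}
```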
 Regarding a range interface, it would be great if you allowed e.g.

 foreach (line; Http.get("https://mail.google.com").byLine()) {
 ...
 }

 The data transfer should happen concurrently with the foreach code. The
 type of line is char[] or const(char)[]. Similarly, there would be a
 byChunk interface that transfers in ubyte[] chunks.

 Also we need a head() method for the corresponding command.

 Andrei

That would be neat. What do you mean about concurrent data transfers with foreach?

Assume the body of the loop does some time-consuming processing - like e.g. writing to another HTTP stream. Then your network reads should not wait for that processing. While the user code does something, you should already have the next transfer in flight. Example: a utility that efficiently uses GET from one http source and uses the data to POST it to an http target should be an efficient few-liner. (FTP versions and mixed ones too.) Andrei
Mar 14 2011
parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
On 14/03/11 18.55, Andrei Alexandrescu wrote:
 On 3/14/11 4:16 AM, Jonas Drewsen wrote:
 On 13/03/11 23.44, Andrei Alexandrescu wrote:
 Sweet. As has been discussed, often the content is not text so you may
 want to have content return ubyte[] and add a new property such as
 "textContent" or "text".

I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?

Yah, as per the ensuing discussion.
 As discussed, properties may be better here than setXxx and getXxx. The
 setReceiveCallback hook should take a ubyte[]. The
 setReceiveHeaderCallback should take a const(char)[]. That way you won't
 need to copy all headers, leaving safely that option to the client.

I've already replaced the set/get methods with properties and renamed them. Hadn't thought of using const(char)[].. thanks for the hint.

A good general guideline: make sure that the user could easily and safely use a loop that reads a large http stream (with hooks and all) without allocating one item each pass through the loop.

Makes sense. I'll keep that in mind.
 Regarding a range interface, it would be great if you allowed e.g.

 foreach (line; Http.get("https://mail.google.com").byLine()) {
 ...
 }

 The data transfer should happen concurrently with the foreach code. The
 type of line is char[] or const(char)[]. Similarly, there would be a
 byChunk interface that transfers in ubyte[] chunks.

 Also we need a head() method for the corresponding command.

 Andrei

That would be neat. What do you mean about concurrent data transfers with foreach?

Assume the body of the loop does some time-consuming processing - like e.g. writing to another HTTP stream. Then your network reads should not wait for that processing. While the user code does something, you should already have the next transfer in flight. Example: a utility that efficiently uses GET from one http source and uses the data to POST it to an http target should be an efficient few-liner. (FTP versions and mixed ones too.) Andrei

I get it. Any existing implementation that does this I can have a look at? /Jonas
Mar 14 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/14/11 4:11 PM, Jonas Drewsen wrote:
 On 14/03/11 18.55, Andrei Alexandrescu wrote:
 Assume the body of the loop does some time-consuming processing - like
 e.g. writing to another HTTP stream. Then your network reads should not
 wait for that processing. While the user code does something, you should
 already have the next transfer in flight.

 Example: a utility that efficiently uses GET from one http source and
 uses the data to POST it to an http target should be an efficient
 few-liner. (FTP versions and mixed ones too.)


 Andrei

I get it. Any existing implementation that does this I can have a look at?

Unfortunately not at the moment. I wanted to define such a thing for std.stdio called byLineAsync and byChunkAsync but never got to it. The basic idea is:

1. Define a new range type, e.g. AsyncHttpInputRange

2. Inside that range start a secondary thread that does the actual transfer and passes read buffers to the main thread by means of messages

3. See std.concurrency and the free chapter http://www.informit.com/articles/printerfriendly.aspx?p=1609144 for details

4. Control congestion (too many buffers in flight) with setMaxMailboxSize.

5. Make sure you have a little protocol that stops the secondary thread when the range is destroyed.

Andrei
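
Steps 2 and 4 above can be sketched with std.concurrency roughly as follows. This is only an illustration under stated assumptions: producer and drain are hypothetical names, the worker fabricates buffers instead of doing a real transfer, a bool message is used as an ad hoc end-of-stream marker, and the range wrapper (step 1) and the shutdown protocol (step 5) are left out.

```d
import std.concurrency;

// Secondary thread: stands in for the network transfer (no real HTTP here).
void producer(Tid owner, int chunks)
{
    foreach (i; 0 .. chunks)
    {
        auto buf = new ubyte[](2);
        buf[] = cast(ubyte) i;
        send(owner, buf.idup); // immutable data may cross threads
    }
    send(owner, true);         // end-of-stream marker
}

// Main-thread side: consumes buffers while the worker keeps producing.
size_t drain(int chunks)
{
    // At most 4 messages in flight; the sender blocks when the mailbox is full.
    setMaxMailboxSize(thisTid, 4, OnCrowding.block);
    spawn(&producer, thisTid, chunks);

    size_t total;
    bool done;
    while (!done)
    {
        receive(
            (immutable(ubyte)[] chunk) { total += chunk.length; },
            (bool finished) { done = true; }
        );
    }
    return total;
}

void main()
{
    assert(drain(3) == 6); // 3 chunks of 2 bytes each
}
```

A byLine or byChunk range would wrap the receive call in its popFront, so the body of a foreach runs while the worker already has the next buffers queued.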
Mar 14 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday 14 March 2011 02:16:12 Jonas Drewsen wrote:
 On 13/03/11 23.44, Andrei Alexandrescu wrote:
 On 3/11/11 9:20 AM, Jonas Drewsen wrote:
 Hi,
 
 So I've spent some time trying to wrap libcurl for D. There is a lot of
 things that you can do with libcurl which I did not know so I'm starting
 out small.
 
 For now I've created all the declarations for the latest public curl C
 api. I have put that in the etc.c.curl module.

Great! Could you please create a pull request for that?

Will do as soon as I've figured out howto create a pull request for a single file in a branch. Anyone knows how to do that on github? Or should I just create a pull request including the etc.curl wrapper as well?

You can't. A pull request is for an entire branch. It pulls _everything_ from that branch which differs from the one being merged with. git cares about commits, not files. And pulling from another repository pulls all of the commits which you don't have. So, if you want to do a pull request, you create a branch with exactly the commits that you wanted merged in on it. No more, no less.
 On top of that I've created a more D like api as seen below. This is
 located in the 'etc.curl' module. What you can see below currently works
 but before proceeding further down this road I would like to get your
 comments on it.
 
 //
 // Simple HTTP GET with sane defaults
 // provides the .content, .headers and .status
 //
 writeln( Http.get("http://www.google.com").content );

Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".

I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?

That's debatable. Some would argue one way, some another. Personally, I'd argue ubyte[]. I don't like void[] one bit. Others would agree with me, and yet others would disagree. I don't think that there's really a general agreement on whether void[] or ubyte[] is better when it comes to reading binary data like that. - Jonathan M Davis
Mar 14 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Mon, 14 Mar 2011 02:36:07 -0700, Jonathan M Davis wrote:

 On Monday 14 March 2011 02:16:12 Jonas Drewsen wrote:
 On 13/03/11 23.44, Andrei Alexandrescu wrote:
 On 3/11/11 9:20 AM, Jonas Drewsen wrote:
 Hi,
 
 So I've spent some time trying to wrap libcurl for D. There is a lot
 of things that you can do with libcurl which I did not know so I'm
 starting out small.
 
 For now I've created all the declarations for the latest public curl
 C api. I have put that in the etc.c.curl module.

Great! Could you please create a pull request for that?

Will do as soon as I've figured out howto create a pull request for a single file in a branch. Anyone knows how to do that on github? Or should I just create a pull request including the etc.curl wrapper as well?

You can't. A pull request is for an entire branch. It pulls _everything_ from that branch which differs from the one being merged with. git cares about commits, not files. And pulling from another repository pulls all of the commits which you don't have. So, if you want to do a pull request, you create a branch with exactly the commits that you wanted merged in on it. No more, no less.
 On top of that I've created a more D like api as seen below. This is
 located in the 'etc.curl' module. What you can see below currently
 works but before proceeding further down this road I would like to
 get your comments on it.
 
 //
 // Simple HTTP GET with sane defaults
 // provides the .content, .headers and .status
 //
 writeln( Http.get("http://www.google.com").content );

Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".

I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?

That's debatable. Some would argue one way, some another. Personally, I'd argue ubyte[]. I don't like void[] one bit. Others would agree with me, and yet others would disagree. I don't think that there's really a general agreement on whether void[] or ubyte[] is better when it comes to reading binary data like that.

I also think ubyte[] is best, because:

1. It can be used directly. (You can't get an element from a void[] array without casting it to something else first.)

2. There are no assumptions about the type of data contained in the array. (char[] arrays are assumed to be UTF-8 encoded.)

3. ubyte[] arrays are (AFAIK) not scanned by the GC. (void[] arrays may contain pointers and must therefore be scanned.)

I think the rule of thumb should be: If the array contains raw data of unspecified type, but no pointers or references, use ubyte[].

void[] is very useful for input parameters, however, since all arrays are implicitly castable to void[]:

  void writeData(void[] data) { ... }

  writeData("Hello World!");
  writeData([1, 2, 3, 4]);

-Lars
Mar 14 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 14 Mar 2011 07:20:26 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Mon, 14 Mar 2011 02:36:07 -0700, Jonathan M Davis wrote:

 On Monday 14 March 2011 02:16:12 Jonas Drewsen wrote:
 On 13/03/11 23.44, Andrei Alexandrescu wrote:
 On 3/11/11 9:20 AM, Jonas Drewsen wrote:
 Hi,

 So I've spent some time trying to wrap libcurl for D. There is a lot
 of things that you can do with libcurl which I did not know so I'm
 starting out small.

 For now I've created all the declarations for the latest public curl
 C api. I have put that in the etc.c.curl module.

Great! Could you please create a pull request for that?

Will do as soon as I've figured out howto create a pull request for a single file in a branch. Anyone knows how to do that on github? Or should I just create a pull request including the etc.curl wrapper as well?

You can't. A pull request is for an entire branch. It pulls _everything_ from that branch which differs from the one being merged with. git cares about commits, not files. And pulling from another repository pulls all of the commits which you don't have. So, if you want to do a pull request, you create a branch with exactly the commits that you wanted merged in on it. No more, no less.
 On top of that I've created a more D like api as seen below. This is
 located in the 'etc.curl' module. What you can see below currently
 works but before proceeding further down this road I would like to
 get your comments on it.

 //
 // Simple HTTP GET with sane defaults
 // provides the .content, .headers and .status
 //
 writeln( Http.get("http://www.google.com").content );

Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".

I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?

That's debatable. Some would argue one way, some another. Personally, I'd argue ubyte[]. I don't like void[] one bit. Others would agree with me, and yet others would disagree. I don't think that there's really a general agreement on whether void[] or ubyte[] is better when it comes to reading binary data like that.

I also think ubyte[] is best, because:

1. It can be used directly. (You can't get an element from a void[] array without casting it to something else first.)

2. There are no assumptions about the type of data contained in the array. (char[] arrays are assumed to be UTF-8 encoded.)

3. ubyte[] arrays are (AFAIK) not scanned by the GC. (void[] arrays may contain pointers and must therefore be scanned.)

This isn't exactly true. arrays *created* as void[] will be scanned. Arrays created as ubyte[] and then cast to void[] will not be scanned. However, it is far too easy while dealing with a void[] array to have it mysteriously flip its bit to scan-able.
 I think the rule of thumb should be:  If the array contains raw data of
 unspecified type, but no pointers or references, use ubyte[].

 void[] is very useful for input parameters, however, since all arrays are
 implicitly castable to void[]:

   void writeData(void[] data) { ... }

   writeData("Hello World!");
   writeData([1, 2, 3, 4]);

I think (and this differs from my previous opinion) const(void)[] should be used for input parameters where any array type could be passed in. However, ubyte[] should be used for output parameters and for internal storage. void[] just has too many pitfalls to be used anywhere but where its implicit casting ability is useful.

-Steve
Mar 14 2011
prev sibling parent reply Jonas Drewsen <jdrewsen nospam.com> writes:
On 13/03/11 23.44, Andrei Alexandrescu wrote:
 You'll probably need to justify the existence of a class hierarchy and
 what overridable methods there are. In particular, since you seem to
 offer hooks via delegates, probably classes wouldn't be needed at all.
 (FWIW I would've done the same; I wouldn't want to inherit just to
 intercept the headers etc.)

Missed this one in my last reply.

Ftp/Http etc. are all inheriting from a Protocol class. The Protocol class defines common settings (properties) for all protocols e.g. dnsTimeout, connectTimeout, networkInterface, url, port selection. I could make these into a mixin and thereby get rid of the inheritance of course.

I think that keeping the Protocol as an abstract base class would benefit e.g. the integration with streams. In that case we could simply create a CurlTransport that contains a reference to a Protocol derived object (Http, Ftp...).

Or would it be better to have specific HttpTransport, FtpTransport?

/Jonas
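
To make the mixin alternative concrete, here is a minimal sketch. ProtocolSettings, HttpSketch and FtpSketch are hypothetical names, the field set is a subset of the settings listed above, and the millisecond unit is an assumption.

```d
// Settings shared by every protocol, as listed in the post.
mixin template ProtocolSettings()
{
    string url;
    uint dnsTimeout;     // milliseconds (assumed unit)
    uint connectTimeout; // milliseconds (assumed unit)
}

struct HttpSketch
{
    mixin ProtocolSettings;
    string postData;     // HTTP-only setting
}

struct FtpSketch
{
    mixin ProtocolSettings;
}

void main()
{
    HttpSketch http;
    http.url = "http://www.testing.com/test.cgi";
    http.connectTimeout = 1000;

    FtpSketch ftp;
    ftp.url = "ftp://ftp.digitalmars.com/sieve.ds";

    assert(http.connectTimeout == 1000);
    assert(ftp.dnsTimeout == 0); // uint fields default to 0
}
```

The trade-off is the one raised in the post: the mixin shares the settings without inheritance, but HttpSketch and FtpSketch then have no common base type that something like a CurlTransport could hold a reference to.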
Mar 14 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/14/11 10:38 AM, Jonas Drewsen wrote:
 On 13/03/11 23.44, Andrei Alexandrescu wrote:
 You'll probably need to justify the existence of a class hierarchy and
 what overridable methods there are. In particular, since you seem to
 offer hooks via delegates, probably classes wouldn't be needed at all.
 (FWIW I would've done the same; I wouldn't want to inherit just to
 intercept the headers etc.)

Missed this one in my last reply. Ftp/Http etc. are all inheriting from a Protocol class. The Protocol class defines common settings ( properties) for all protocols e.g. dnsTimeout, connectTimeout, networkInterface, url, port selection. I could make these into a mixin and thereby get rid of the inheritance of course.

Use Occam's razor and the path of least resistance to get the most natural interface.
 I think that keeping the Protocol as an abstract base class would
 benefit e.g. the integration with streams. In that case we could simply
 create a CurlTransport that contains a reference to a Protocol derived
 objects (Http,Ftp...).

 Or would it be better to have specific HttpTransport, FtpTransport?

Count the commonalities and the differences and then make an executive decision. Andrei
Mar 14 2011
prev sibling next sibling parent Johannes Pfau <spam example.com> writes:

Jonas Drewsen wrote:
Hi,

   So I've been working a bit on the etc.curl module. Currently most of
the HTTP functionality is done and some very simple Ftp.

I would very much like to know if this has a chance of getting in
phobos if I finish it with the current design. If not then it will be
for my own project only and doesn't need as much documentation or all
the features.

https://github.com/jcd/phobos/tree/curl

I do know that the error handling is currently not good enough... WIP.

/Jonas


On 11/03/11 16.20, Jonas Drewsen wrote:
 Hi,

 So I've spent some time trying to wrap libcurl for D. There is a lot
 of things that you can do with libcurl which I did not know so I'm
 starting out small.

 For now I've created all the declarations for the latest public curl
 C api. I have put that in the etc.c.curl module.

 On top of that I've created a more D like api as seen below. This is
 located in the 'etc.curl' module. What you can see below currently
 works but before proceeding further down this road I would like to
 get your comments on it.

 //
 // Simple HTTP GET with sane defaults
 // provides the .content, .headers and .status
 //
 writeln( Http.get("http://www.google.com").content );

 //
 // GET with custom data receiver delegates
 //
 Http http = new Http("http://www.google.dk");
 http.setReceiveHeaderCallback( (string key, string value) {
 writeln(key ~ ":" ~ value);
 } );
 http.setReceiveCallback( (string data) { /* drop */ } );
 http.perform;

 //
 // POST with some timeouts
 //
 http.setUrl("http://www.testing.com/test.cgi");
 http.setReceiveCallback( (string data) { writeln(data); } );
 http.setConnectTimeout(1000);
 http.setDataTimeout(1000);
 http.setDnsTimeout(1000);
 http.setPostData("The quick....");
 http.perform;

 //
 // PUT with data sender delegate
 //
 string msg = "Hello world";
 size_t len = msg.length; /* using chunked transfer if omitted */

 http.setSendCallback( delegate size_t(char[] data) {
 if (msg.empty) return 0;
 auto l = msg.length;
 data[0..l] = msg[0..$];
 msg.length = 0;
 return l;
 },
 HttpMethod.put, len );
 http.perform;

 //
 // HTTPS
 //
 writeln(Http.get("https://mail.google.com").content);

 //
 // FTP
 //
 writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
 "./downloaded-file"));


 // ... authentication, cookies, interface select, progress callback
 // etc. is also implemented this way.


 /Jonas


I really like the API. A few comments:

You use the internal curl progress meter. According to the documentation (it's a little hidden, look at CURLOPT_NOPROGRESS) the progress meter is likely to be removed in future curl versions. The download progress should be easy to reimplement, although you'd have to parse the Content-Length header. Upload shouldn't be too difficult either (one problem: what does curl pass as ultotal/dltotal when chunked encoding is used or the total size is not known?). Then we could also use different delegates for upload/download.

The callback interface suits curl best and I actually like it, but how will it interact with streams? As an example: if someone wrote a stream/filter that decoded gzip for files, it should be usable with the http streams as well. But files/filestreams have a pull interface (no callbacks, stream.read() in a loop). So how could a gzip stream be written that supports both files and the http stuff without too much code duplication?

Do you plan to add some kind of support for header parsing? I think something like what the .NET WebClient uses ( http://msdn.microsoft.com/en-us/library/system.net.webclient(v=VS.100).aspx ) would be great. Especially the HeaderCollection supporting headers as strings and as data types (for both parsing and formatting), but without a class hierarchy for the headers, using templates instead.

I've written D parsers/formatters for almost all headers in rfc2616 (1 or 2 might be missing) and for a few additional commonly used headers (Content-Disposition, cookie headers). The parsers are written with ragel and are to be used with curl (continuations must be removed and the parsers always take 1 line of input, just as you get it from curl). Right now only the client side is implemented (no parsers for headers which can only be sent from client-->server).
However, I need to add some more documentation to the parsers, need to do some refactoring and I've got absolutely no time for that in the next 2 weeks ('abitur' final exams). But if you could wait 2 weeks or if you wanted to do the refactoring yourself, I would be happy to contribute that code. --=20 Johannes Pfau
Mar 14 2011
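
A rough sketch of the progress reimplementation discussed in the post above, assuming the header callback delivers a parsed key/value pair and the data callback delivers the raw chunk. `DownloadProgress` and its members are hypothetical names for illustration, not part of etc.curl. With chunked encoding there is no Content-Length, so the total is simply reported as unknown.

```d
import std.algorithm.comparison : min;
import std.conv : ConvException, to;
import std.string : strip;
import std.uni : toLower;

/// Hypothetical helper (not part of etc.curl): derives download progress
/// from the Content-Length header and the size of each received chunk.
struct DownloadProgress
{
    size_t total;     // 0 means "unknown", e.g. chunked transfer encoding
    size_t received;

    /// Feed every received header here.
    void onHeader(string key, string value)
    {
        if (key.strip.toLower == "content-length")
        {
            try
            {
                total = value.strip.to!size_t;
            }
            catch (ConvException)
            {
                total = 0; // malformed value: treat the size as unknown
            }
        }
    }

    /// Feed every received data chunk here.
    void onData(const(char)[] data)
    {
        received += data.length;
    }

    /// Fraction in [0, 1], or -1 when the total size is not known.
    double fraction() const
    {
        if (total == 0)
            return -1;
        return min(1.0, cast(double) received / total);
    }
}

unittest
{
    DownloadProgress p;
    assert(p.fraction == -1);            // nothing known yet
    p.onHeader("Content-Length", " 10");
    p.onData("hello");
    assert(p.fraction == 0.5);           // 5 of 10 bytes received
}
```

The two methods could be wired to the receive-header and receive-data delegates; a separate instance fed from the send delegate would cover uploads, which also gives the different upload/download delegates mentioned above.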
prev sibling next sibling parent Johannes Pfau <spam example.com> writes:

Jonas Drewsen wrote:
 Do you plan to add some kind of support for header parsing? I think
 something like what the .net webclient uses
 ( http://msdn.microsoft.com/en-us/library/system.net.webclient(v=VS.100).aspx )
 would be great. Especially the HeaderCollection supporting headers as
 strings and as data types (for both parsing and formatting), but
 without a class hierarchy for the headers, using templates instead.

It would be nice to be able to get/set headers by string and enums (http://msdn.microsoft.com/en-us/library/system.net.httprequestheader.aspx)
But I cannot see that .net is using datatypes or templates for it.
Could you give me a pointer please?

You're right, I didn't look closely enough at the .net documentation. I thought HttpRequestHeader was a class. What I meant for D was something like this:

struct ETagHeader
{
    //Data members
    bool Weak = false;
    string Value;

    //All header structs provide these
    static string Key = "ETag";
    static ETagHeader parse(string value)
    {
        //parser logic here
    }
    void format(T)(T writer) if (isOutputRange!(T, string))
    {
        if(Weak)
            writer.put("W/");
        assert(Value != "");
        writer.put(quote(Value));
    }
}

Then we can offer methods like these:

void setHeader(T)(T header) if(isHeader!T)
{
    headers[T.Key] = formatHeader(header);
}
T getHeader(T)() if(isHeader!T)
{
    if(T.Key !in headers)
        throw new Exception("header not found");
    return T.parse(headers[T.Key]);
}

So user code wouldn't have to deal with header parsing / formatting:

auto etag = client.getHeader!ETagHeader();
assert(etag.Weak);

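
For anyone who wants to compile the idea, here is a self-contained sketch of the same pattern. Everything in it is an assumption made for illustration: the `isHeader` trait, the `HeaderCollection` struct and the hand-written ETag parser are hypothetical stand-ins (the real parsers are generated with ragel), and the struct takes the output range by `ref` so that the caller's buffer is the one written to.

```d
import std.algorithm.searching : startsWith;
import std.array : appender;
import std.exception : enforce;
import std.range.primitives : isOutputRange, put;
import std.string : strip;

/// One plain struct per header, no class hierarchy.
struct ETagHeader
{
    bool weak = false;
    string value;

    enum key = "ETag";

    /// Minimal hand-written parser, for illustration only.
    static ETagHeader parse(string raw)
    {
        ETagHeader h;
        raw = raw.strip;
        if (raw.startsWith("W/"))
        {
            h.weak = true;
            raw = raw[2 .. $];
        }
        enforce(raw.length >= 2 && raw[0] == '"' && raw[$ - 1] == '"',
                "ETag value must be a quoted string");
        h.value = raw[1 .. $ - 1];
        return h;
    }

    /// Writes the wire format into any output range of strings.
    void format(W)(ref W writer) const
        if (isOutputRange!(W, string))
    {
        if (weak)
            put(writer, "W/");
        put(writer, `"` ~ value ~ `"`);
    }
}

/// Hypothetical trait: a header type has a string key and a static parse.
enum isHeader(T) = is(typeof(T.key) : string)
    && is(typeof(T.parse("")) == T);

/// Hypothetical collection storing raw strings, typed at the edges.
struct HeaderCollection
{
    string[string] raw;

    void setHeader(T)(T header) if (isHeader!T)
    {
        auto buf = appender!string();
        header.format(buf);
        raw[T.key] = buf.data;
    }

    T getHeader(T)() if (isHeader!T)
    {
        auto p = T.key in raw;
        enforce(p !is null, "header not present: " ~ T.key);
        return T.parse(*p);
    }
}

unittest
{
    static assert(isHeader!ETagHeader);
    static assert(!isHeader!int);

    HeaderCollection headers;
    headers.setHeader(ETagHeader(true, "xyzzy"));
    assert(headers.raw["ETag"] == `W/"xyzzy"`);

    auto etag = headers.getHeader!ETagHeader();
    assert(etag.weak && etag.value == "xyzzy");
}
```

Keeping the stored form as a plain string means headers without a dedicated struct can still be read and written by name.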
 I've written D parsers/formatters for almost all headers in
 rfc2616 (1 or 2 might be missing) and for a few additional commonly
 used headers (Content-Disposition, cookie headers). The parsers are
 written with ragel and are to be used with curl (continuations must
 be removed and the parsers always take 1 line of input, just as you
 get it from curl). Right now only the client side is implemented (no
 parsers for headers which can only be sent from client-->server ).
 However, I need to add some more documentation to the parsers, need
 to do some refactoring and I've got absolutely no time for that in
 the next 2 weeks ('abitur' final exams). But if you could wait 2
 weeks or if you wanted to do the refactoring yourself, I would be
 happy to contribute that code.

That sounds very interesting. I would very much like to see the code and see if it fits in.

Ok, here it is, but it seriously needs to be refactored and documented:
https://gist.github.com/869324

--
Johannes Pfau
Mar 14 2011
prev sibling next sibling parent Johannes Pfau <spam example.com> writes:

Jonas Drewsen wrote:
Hi,

   So I've been working a bit on the etc.curl module. Currently most of
the HTTP functionality is done and some very simple Ftp.

I would very much like to know if this has a chance of getting in
phobos if I finish it with the current design. If not then it will be
for my own project only and doesn't need as much documentation or all
the features.

https://github.com/jcd/phobos/tree/curl

I do know that the error handling is currently not good enough... WIP.

/Jonas



I looked at the code again and I have 2 more suggestions:

1.) Would it be useful to have a headersReceived callback which would be called when all headers have been received (when the data callback is called the first time)? I'm thinking of a situation where you don't know what data the server will return: a few KB of html which you can easily keep in memory, or a huge file which you'd have to save to disk. You can only know that once the headers have been received. It would also be possible to do that by just overriding the headerCallback and looking out for the Content-Length/Content-Type header, but I think it should also work with the default headerCallback.

2.) As far as I can see you store the http headers in a case sensitive way (res.headers[key] ~= value;). This means "Content-Length" vs "content-length" would produce two entries in the array and it makes it difficult to get the header from the associative array. It is maybe useful to keep the original casing, but probably not in the array key.

BTW: According to RFC2616 the only headers which are allowed to be included multiple times in the response must consist of comma separated lists. So in theory we could keep a simple string[string] list and if we see a header twice we can just merge it with a ','.
http://tools.ietf.org/html/rfc2616#section-4.2

Relevant part from the RFC:
----------------------
Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)]. It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma. The order in which header fields with the same field-name are received is therefore significant to the interpretation of the combined field value, and thus a proxy MUST NOT change the order of these field values when a message is forwarded.
----------------------

I'm also done with the first pass through the http parsers.
Documentation is here: http://dl.dropbox.com/u/24218791/std.protocol.http/http/http.html
Code here: https://gist.github.com/886612
The http.d file is generated from the http.d.rl file.

--
Johannes Pfau
Mar 25 2011
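
A small sketch of suggestion 2 from the post above: keys are lower-cased for storage, the first casing seen is kept on the side, and repeated headers are merged with a comma in arrival order as RFC 2616 section 4.2 allows. `HeaderMap` and its methods are hypothetical names, not the etc.curl implementation. One caveat worth keeping in mind: Set-Cookie is a well-known header that in practice appears several times without being a comma-separated list, so real code would probably want to special-case it.

```d
import std.string : strip;
import std.uni : toLower;

/// Hypothetical container (not the etc.curl implementation).
struct HeaderMap
{
    private string[string] values;    // lower-cased field-name -> merged value
    private string[string] original;  // lower-cased field-name -> first casing seen

    /// Adds a header; a repeated field-name is merged in arrival order.
    void add(string name, string value)
    {
        immutable key = name.strip.toLower;
        value = value.strip;
        if (auto p = key in values)
        {
            *p ~= ", " ~ value;
        }
        else
        {
            values[key] = value;
            original[key] = name.strip;
        }
    }

    /// Case-insensitive lookup; returns null when the header is absent.
    string get(string name)
    {
        auto p = name.strip.toLower in values;
        return p ? *p : null;
    }

    /// The casing used by the first occurrence, or null if absent.
    string originalName(string name)
    {
        auto p = name.strip.toLower in original;
        return p ? *p : null;
    }
}

unittest
{
    HeaderMap h;
    h.add("Content-Length", "42");
    h.add("Cache-Control", "no-cache");
    h.add("cache-control", "no-store");

    assert(h.get("content-length") == "42");
    assert(h.get("CACHE-CONTROL") == "no-cache, no-store");
    assert(h.originalName("cache-control") == "Cache-Control");
    assert(h.get("ETag") is null);
}
```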
prev sibling next sibling parent Johannes Pfau <spam example.com> writes:

I added some code to show how I think this could be used in the HTTP client:
https://gist.github.com/886612#file_gistfile1.d

Like in the .net webclient we'd need two of these collections: one for received headers and one for headers to be sent.

--
Johannes Pfau
Mar 25 2011
prev sibling parent Johannes Pfau <spam example.com> writes:

Jonas Drewsen wrote:
This is a nice protocol parser. I would very much like it to be used
with the curl API but without it being a dependency. This is already
possible now using the onReceiveHeader callback and this would
decouple the two. At least until std.protocol.http is in phobos as
well - at that point convenience methods could be added :)

/Jonas

Thanks, I think I'll propose the parser for the new experimental namespace when it's available.

About the headersReceived callback: You're totally right, it can be done with the onReceiveHeader callback right now. But I think in the common case the user wants the headers in a key/value array. So if the user doesn't want to use the onReceiveHeader api, a headersReceived callback would probably be convenient. But, as said, it's not necessary.

Reading the curl documentation showed another small trap:

CURLOPT_HEADERFUNCTION
------------------------------------------------------------
It's important to note that the callback will be invoked for the headers of all responses received after initiating a request and not just the final response. This includes all responses which occur during authentication negotiation. If you need to operate on only the headers from the final response, you will need to collect headers in the callback yourself and use HTTP status lines, for example, to delimit response boundaries.
------------------------------------------------------------

I think if we store the headers into an array, we should only store the headers of the final response. Another question is: should all headers or only final headers trigger the onReceiveHeader callback? Passing only the final headers would require extra work; passing all headers should at least be documented.

Thinking about this more, this also means the _receiveHeaderCallback is not 100% correct, as it expects all lines after the first line to be header or empty lines, but it's possible that we get multiple statuslines. It still works: the regex doesn't match anything and the code ignores that line. But this way, the stored statusline will always be the first statusline, which isn't optimal. We'd also need to detect if a line is a statusline to reset the headers array if it's used. Seems like we have to think about this some more.

--
Johannes Pfau
Mar 29 2011
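
One possible shape for the status-line handling described in the post above, as a sketch only: every line that starts with "HTTP/" is treated as the beginning of a new response and clears whatever was collected before, so after the transfer only the final response's status line and headers remain. `FinalResponseHeaders` is a hypothetical name, and the "starts with HTTP/" test is a simplifying assumption rather than a full status-line parser.

```d
import std.algorithm.searching : startsWith;
import std.string : indexOf, strip;

/// Hypothetical collector (not the etc.curl implementation).
struct FinalResponseHeaders
{
    string statusLine;
    string[string] headers;

    /// Call once per raw line delivered by the header callback.
    void onLine(string line)
    {
        line = line.strip;
        if (line.startsWith("HTTP/"))
        {
            // A new response begins: drop headers of earlier responses.
            statusLine = line;
            headers = null;
        }
        else
        {
            immutable colon = line.indexOf(':');
            if (colon > 0)
                headers[line[0 .. colon].strip] = line[colon + 1 .. $].strip;
            // Blank lines and anything without a colon are ignored.
        }
    }
}

unittest
{
    FinalResponseHeaders c;
    foreach (line; ["HTTP/1.1 401 Unauthorized\r\n",
                    "WWW-Authenticate: Basic realm=\"x\"\r\n",
                    "\r\n",
                    "HTTP/1.1 200 OK\r\n",
                    "Content-Type: text/html\r\n",
                    "\r\n"])
        c.onLine(line);

    assert(c.statusLine == "HTTP/1.1 200 OK");
    assert(c.headers.length == 1);
    assert(c.headers["Content-Type"] == "text/html");
}
```

Whether the user-facing onReceiveHeader callback should still see the headers of the intermediate responses is left open here, as in the post.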