digitalmars.D - Second Round CURL Wrapper Review
- dsimcha <dsimcha yahoo.com> Dec 02 2011
- mta`chrono <chrono mta-international.net> Dec 02 2011
- Jonas Drewsen <jdrewsen nospam.com> Dec 03 2011
- Kapps <Kapps NotValidEmail.com> Dec 03 2011
- "Vladimir Panteleev" <vladimir thecybershadow.net> Dec 02 2011
- Jonas Drewsen <jdrewsen nospam.com> Dec 03 2011
- Jonas Drewsen <jdrewsen nospam.com> Dec 03 2011
- Jonas Drewsen <jdrewsen nospam.com> Dec 04 2011
- Jonas Drewsen <jdrewsen nospam.com> Dec 04 2011
- Jonas Drewsen <jdrewsen nospam.com> Dec 05 2011
- Somedude <lovelydear mailmetrash.com> Dec 12 2011
- David Nadlinger <see klickverbot.at> Dec 12 2011
- Jonas Drewsen <jdrewsen nospam.com> Dec 03 2011
- Jonas Drewsen <jdrewsen nospam.com> Dec 03 2011
- "Vladimir Panteleev" <vladimir thecybershadow.net> Dec 03 2011
- "Vladimir Panteleev" <vladimir thecybershadow.net> Dec 03 2011
- "Vladimir Panteleev" <vladimir thecybershadow.net> Dec 04 2011
- Jacob Carlborg <doob me.com> Dec 04 2011
- "Marco Leise" <Marco.Leise gmx.de> Dec 04 2011
- "Vladimir Panteleev" <vladimir thecybershadow.net> Dec 04 2011
- "Marco Leise" <Marco.Leise gmx.de> Dec 05 2011
- "Vladimir Panteleev" <vladimir thecybershadow.net> Dec 05 2011
- "Vladimir Panteleev" <vladimir thecybershadow.net> Dec 05 2011
- "Vladimir Panteleev" <vladimir thecybershadow.net> Dec 05 2011
- "dsimcha" <dsimcha yahoo.com> Dec 11 2011
- dsimcha <dsimcha yahoo.com> Dec 11 2011
- Somedude <lovelydear mailmetrash.com> Dec 12 2011
- Jacob Carlborg <doob me.com> Dec 12 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 11 2011
- Somedude <lovelydear mailmetrash.com> Dec 12 2011
- bearophile <bearophileHUGS lycos.com> Dec 12 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 12 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 12 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 12 2011
- "Jakob Ovrum" <jakobovrum gmail.com> Dec 12 2011
- "dsimcha" <dsimcha yahoo.com> Dec 12 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 13 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 13 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 14 2011
- Justin C Calvarese <jccalvarese gmail.com> Dec 17 2011
- Justin C Calvarese <jccalvarese gmail.com> Dec 17 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 14 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 15 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 16 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 17 2011
- Justin C Calvarese <technocrat7+d gmail.com> Dec 17 2011
- "Jakob Ovrum" <jakobovrum gmail.com> Dec 17 2011
- "Jakob Ovrum" <jakobovrum gmail.com> Dec 17 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 17 2011
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Dec 17 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 30 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 17 2011
- "Jakob Ovrum" <jakobovrum gmail.com> Dec 17 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 17 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 17 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 17 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 17 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 18 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 18 2011
- Jonathan M Davis <jmdavisProg gmx.com> Dec 18 2011
- Somedude <lovelydear mailmetrash.com> Dec 19 2011
- Somedude <lovelydear mailmetrash.com> Dec 19 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 30 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 30 2011
- "jdrewsen" <jdrewsen nospam.com> Dec 30 2011
I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab ) Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html For those of you on Windows, a libcurl binary built by DMC is available at http://gool.googlecode.com/files/libcurl_7.21.7.zip. Review starts now and ends on December 16, followed by one week of voting.
Dec 02 2011
2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
Low Level Part --> deimos High Level Part --> std.curl
Dec 02 2011
On 03/12/11 06.12, mta`chrono wrote:2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
Low Level Part --> deimos High Level Part --> std.curl
I would really hate if phobos would be depending on deimos. As I see it either both levels go in or none at all. /Jonas
Dec 03 2011
On 02/12/2011 11:12 PM, mta`chrono wrote:2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
Low Level Part --> deimos High Level Part --> std.curl
Deimos is meant to be something you grab parts of as necessary; as such, it would not (IMO) make too much sense to force something in Phobos to be dependent on it, as now the user has to go and get something else just to use Phobos.
Dec 03 2011
On Sat, 03 Dec 2011 06:26:10 +0200, dsimcha <dsimcha yahoo.com> wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows?
Yes, please! Finding/building a library compatible with OPTLINK is a needless hassle.Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
I think the docs for the asynchronous functions need a bit more elaboration. For example, I had these questions after reading the documentation: Does the working thread block or buffer when data is available but the main thread isn't reading it? The examples for byLine and byLineAsync are almost identical. Is there any difference in functionality? Is it possible to check if data is available without blocking? If not, what's the point of byLineAsync (why not use byLine)? -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Dec 02 2011
On 03/12/11 08.56, Vladimir Panteleev wrote:On Sat, 03 Dec 2011 06:26:10 +0200, dsimcha <dsimcha yahoo.com> wrote:
I think the docs for the asynchronous functions need a bit more elaboration. For example, I had these questions after reading the documentation: Does the working thread block or buffer when data is available but the main thread isn't reading it?
When the all buffers have been filled yes. I'll state that explicitely in the docs.The examples for byLine and byLineAsync are almost identical. Is there any difference in functionality?
As mentioned the async version performs the request in another thread leaving the main thread available for you to do something else. I'll clarify in the docs that it is when you actually call empty/front on the returned range you will get data from the other thread and may be blocking.Is it possible to check if data is available without blocking? If not, what's the point of byLineAsync (why not use byLine)?
No. The .empty property needs to wait in order to have the proper range behavior. But is does seem like a good idea to add a method for this. Just need a nice name for it since I guess this will be needed by future async ranges as well. Maybe range.wait(duration) returning true if data is present after waiting? Going down this road makes me think that selecting on async ranges would be a very nice feature. But that is definitely not something that will get included in the version getting reviewed now. /Jonas
Dec 03 2011
Den 03-12-2011 13:10, Vladimir Panteleev skrev:On Sat, 03 Dec 2011 13:53:16 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:As mentioned the async version performs the request in another thread leaving the main thread available for you to do something else. I'll clarify in the docs that it is when you actually call empty/front on the returned range you will get data from the other thread and may be blocking.
I can't think of a realistic use case for the current asynchronous API. Basically, all you can do is start a request in the background, but you're neither notified of the request nor can you poll it to check its status? So the only thing you CAN do is ask for a request that you will need in the future, and when that future moment comes, block if necessary to get the result?
The standard example is downloading some content and saving it at the same time. While your main thread saves a chunk to disk (or uploads to another server) the async thread is busy buffering incoming chunks of data. This means that you effectively only wait for the slowest of the two IO operations. If you did it synchronously would worst case have to wait for all everything to be downloaded and the wait for everything to be saved or uploaded. foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin")) { // While writing to file in this thrad // new chunks are downloaded // in the background thread file.write(chunk); } I hope this makes sense. /Jonas
Dec 03 2011
Den 03-12-2011 21:58, Vladimir Panteleev skrev:On Sat, 03 Dec 2011 21:17:25 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:Den 03-12-2011 13:10, Vladimir Panteleev skrev:On Sat, 03 Dec 2011 13:53:16 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:As mentioned the async version performs the request in another thread leaving the main thread available for you to do something else. I'll clarify in the docs that it is when you actually call empty/front on the returned range you will get data from the other thread and may be blocking.
I can't think of a realistic use case for the current asynchronous API. Basically, all you can do is start a request in the background, but you're neither notified of the request nor can you poll it to check its status? So the only thing you CAN do is ask for a request that you will need in the future, and when that future moment comes, block if necessary to get the result?
The standard example is downloading some content and saving it at the same time. While your main thread saves a chunk to disk (or uploads to another server) the async thread is busy buffering incoming chunks of data. This means that you effectively only wait for the slowest of the two IO operations. If you did it synchronously would worst case have to wait for all everything to be downloaded and the wait for everything to be saved or uploaded. foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin")) { // While writing to file in this thrad // new chunks are downloaded // in the background thread file.write(chunk); } I hope this makes sense.
Well, this makes sense from a theoretical / high-level perspective, but OS write buffers greatly reduce the practicality of this. In common use cases the speed difference between disk and wire will differ by orders of magnitude as well.
Read/write buffers does indeed help a lot and there have been quite some discussions on this topic earlier in this newsgroups regarding tradeoffs etc. Please have a look at these threads (i dont have any links at hand unfortunately). /Jonas
Dec 04 2011
Den 04-12-2011 12:05, Vladimir Panteleev skrev:On Sun, 04 Dec 2011 12:59:15 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:Den 03-12-2011 21:58, Vladimir Panteleev skrev:On Sat, 03 Dec 2011 21:17:25 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:The standard example is downloading some content and saving it at the same time. While your main thread saves a chunk to disk (or uploads to another server) the async thread is busy buffering incoming chunks of data. This means that you effectively only wait for the slowest of the two IO operations. If you did it synchronously would worst case have to wait for all everything to be downloaded and the wait for everything to be saved or uploaded. foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin")) { // While writing to file in this thrad // new chunks are downloaded // in the background thread file.write(chunk); } I hope this makes sense.
Well, this makes sense from a theoretical / high-level perspective, but OS write buffers greatly reduce the practicality of this. In common use cases the speed difference between disk and wire will differ by orders of magnitude as well.
Read/write buffers does indeed help a lot and there have been quite some discussions on this topic earlier in this newsgroups regarding tradeoffs etc. Please have a look at these threads (i dont have any links at hand unfortunately).
If you're referring to the discussion which was in the context of copying files, then I have read it. However, it does not apply to typical use cases of curl. The question here is if this example makes any practical sense, and by the looks of everything it does not. Or do you disagree?
The same applies here because all it comes down to in the end is the sizes of buffers. The async ranges simply allows you to fill a specified number buffers in another thread async. Most OSes also have socket buffers that serves the same purpose but async ranges allows you to specify the buffer size without being privileged. The standard max read buffer size for a tcp connection on the ubuntu I have is set to 112640 which is doubled by the kernel to 225280 bytes. A developer may very well like buffers larger that this without needing to set privileged kernel variables. /Jonas
Dec 04 2011
Den 05-12-2011 10:26, Vladimir Panteleev skrev:On Sun, 04 Dec 2011 22:52:13 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:The same applies here because all it comes down to in the end is the sizes of buffers. The async ranges simply allows you to fill a specified number buffers in another thread async. Most OSes also have socket buffers that serves the same purpose but async ranges allows you to specify the buffer size without being privileged. The standard max read buffer size for a tcp connection on the ubuntu I have is set to 112640 which is doubled by the kernel to 225280 bytes. A developer may very well like buffers larger that this without needing to set privileged kernel variables.
I still don't see how this could ever be useful in practice without the ability to poll for whether data is available. Is it possible to write two short programs that use the synchronous and asynchronous APIs in a way that makes a difference?
It is - but as stated it all depends on buffer sizes and IO speed. It is for the same reason that java recommends using BufferedReader around sockets instead of reading directly from the socket stream. Another example that might show another async range advantage and is clearer: auto f1 = byChunkAsync("www.foo.com/file1.txt", 2^^19); auto f2 = byChunkAsync("www.foo.com/file2.txt", 2^^19); auto f3 = byChunkAsync("www.foo.com/file3.txt", 2^^19); // While this iteration goes is done the file2 and file3 are downloaded // in the background threads foreach (l; f1) { writeln(to!string(l)); } // f2 and f3 are probably downloaded now and no waiting is necessary foreach (l; f2) { writeln(to!string(l)); } foreach (l; f3) { writeln(to!string(l)); } Anyway - I _have_ said that I will add a data availability poll functionality so I guess your initial concern is also covered. /Jonas
Dec 05 2011
Le 05/12/2011 22:36, Jonas Drewsen a écrit :Anyway - I _have_ said that I will add a data availability poll functionality so I guess your initial concern is also covered. /Jonas
Just an idea: would it be possible/useful to use the signals/slots mechanism for this kind of synch ?
Dec 12 2011
On 12/12/11 1:18 PM, jdrewsen wrote:On Monday, 12 December 2011 at 10:26:52 UTC, Somedude wrote:Just an idea: would it be possible/useful to use the signals/slots mechanism for this kind of synch ?
This would be most useful if there were some kind of main loop. This is needed because the request is done in another thread and the response should be handled in the main thread. Without an main loop in the main thread or a destinct call to e.g. a poll() method the signal/slot s would not work.
I don't know if you already have a solution in the works, but maybe the future interface I did for Thrift is similar to what you are looking for: http://klickverbot.at/code/gsoc/thrift/docs/thrift.util.future.html David
Dec 12 2011
On 03/12/11 05.26, dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab ) Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html For those of you on Windows, a libcurl binary built by DMC is available at http://gool.googlecode.com/files/libcurl_7.21.7.zip. Review starts now and ends on December 16, followed by one week of voting.
Thank you for stepping up again even though I know you have plenty of other things to do. /Jonas
Dec 03 2011
On 03/12/11 05.26, dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows?
If there are no licensing issues I think it should. At least until we have a proper package management tool in place that could download the library automatically when needed on windows. /Jonas
Dec 03 2011
On Sat, 03 Dec 2011 13:53:16 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:As mentioned the async version performs the request in another thread leaving the main thread available for you to do something else. I'll clarify in the docs that it is when you actually call empty/front on the returned range you will get data from the other thread and may be blocking.
I can't think of a realistic use case for the current asynchronous API. Basically, all you can do is start a request in the background, but you're neither notified of the request nor can you poll it to check its status? So the only thing you CAN do is ask for a request that you will need in the future, and when that future moment comes, block if necessary to get the result? Adding a method to query the request's status would certainly be useful. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Dec 03 2011
On Sat, 03 Dec 2011 21:17:25 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:Den 03-12-2011 13:10, Vladimir Panteleev skrev:On Sat, 03 Dec 2011 13:53:16 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:As mentioned the async version performs the request in another thread leaving the main thread available for you to do something else. I'll clarify in the docs that it is when you actually call empty/front on the returned range you will get data from the other thread and may be blocking.
I can't think of a realistic use case for the current asynchronous API. Basically, all you can do is start a request in the background, but you're neither notified of the request nor can you poll it to check its status? So the only thing you CAN do is ask for a request that you will need in the future, and when that future moment comes, block if necessary to get the result?
The standard example is downloading some content and saving it at the same time. While your main thread saves a chunk to disk (or uploads to another server) the async thread is busy buffering incoming chunks of data. This means that you effectively only wait for the slowest of the two IO operations. If you did it synchronously would worst case have to wait for all everything to be downloaded and the wait for everything to be saved or uploaded. foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin")) { // While writing to file in this thrad // new chunks are downloaded // in the background thread file.write(chunk); } I hope this makes sense.
Well, this makes sense from a theoretical / high-level perspective, but OS write buffers greatly reduce the practicality of this. In common use cases the speed difference between disk and wire will differ by orders of magnitude as well. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Dec 03 2011
On Sun, 04 Dec 2011 12:59:15 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:Den 03-12-2011 21:58, Vladimir Panteleev skrev:On Sat, 03 Dec 2011 21:17:25 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:The standard example is downloading some content and saving it at the same time. While your main thread saves a chunk to disk (or uploads to another server) the async thread is busy buffering incoming chunks of data. This means that you effectively only wait for the slowest of the two IO operations. If you did it synchronously would worst case have to wait for all everything to be downloaded and the wait for everything to be saved or uploaded. foreach(chunk; byChunkAsync("www.abc.com/hugefile.bin")) { // While writing to file in this thrad // new chunks are downloaded // in the background thread file.write(chunk); } I hope this makes sense.
Well, this makes sense from a theoretical / high-level perspective, but OS write buffers greatly reduce the practicality of this. In common use cases the speed difference between disk and wire will differ by orders of magnitude as well.
Read/write buffers does indeed help a lot and there have been quite some discussions on this topic earlier in this newsgroups regarding tradeoffs etc. Please have a look at these threads (i dont have any links at hand unfortunately).
If you're referring to the discussion which was in the context of copying files, then I have read it. However, it does not apply to typical use cases of curl. The question here is if this example makes any practical sense, and by the looks of everything it does not. Or do you disagree? -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Dec 04 2011
On 2011-12-03 05:26, dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
I vote for std.net.curl. -- /Jacob Carlborg
Dec 04 2011
Am 04.12.2011, 14:02 Uhr, schrieb Jacob Carlborg <doob me.com>:On 2011-12-03 05:26, dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
I vote for std.net.curl.
Hey you std.net.curl people must be cheating. You were 50% of the etc.curl people in the past, now you are at 90% of them. Or did some change of mind happen in the community recently?
Dec 04 2011
On Sun, 04 Dec 2011 16:33:52 +0200, Marco Leise <Marco.Leise gmx.de> wrote:Am 04.12.2011, 14:02 Uhr, schrieb Jacob Carlborg <doob me.com>:On 2011-12-03 05:26, dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
I vote for std.net.curl.
Hey you std.net.curl people must be cheating. You were 50% of the etc.curl people in the past, now you are at 90% of them. Or did some change of mind happen in the community recently?
That's internet polls for you. I think we should keep voting as NG replies, the same thing we do when voting on Phobos contributions. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Dec 04 2011
Am 04.12.2011, 15:36 Uhr, schrieb Vladimir Panteleev <vladimir thecybershadow.net>:On Sun, 04 Dec 2011 16:33:52 +0200, Marco Leise <Marco.Leise gmx.de> wrote:Am 04.12.2011, 14:02 Uhr, schrieb Jacob Carlborg <doob me.com>:On 2011-12-03 05:26, dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
I vote for std.net.curl.
Hey you std.net.curl people must be cheating. You were 50% of the etc.curl people in the past, now you are at 90% of them. Or did some change of mind happen in the community recently?
That's internet polls for you. I think we should keep voting as NG replies, the same thing we do when voting on Phobos contributions.
Polling on the NG has two problems: 1. Polls generate a lot of noise here (single word posts). 2. Counting the results must be done manually. Although voting is not anonymous, you can cheat here as well, by using two names and email addresses. I'll have future polls allow only one vote per IP address in any case.
Dec 05 2011
On Mon, 05 Dec 2011 10:12:00 +0200, Marco Leise <Marco.Leise gmx.de> wrote:Polling on the NG has two problems: 1. Polls generate a lot of noise here (single word posts). 2. Counting the results must be done manually.
These are rather minor problems.Although voting is not anonymous, you can cheat here as well, by using two names and email addresses.
One fake vote is unlikely to swerve the result in either direction, and abundant abuse is obvious. Web form polls are considerably easier to cheat, even with IP limitations. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Dec 05 2011
On Sun, 04 Dec 2011 22:52:13 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:The same applies here because all it comes down to in the end is the sizes of buffers. The async ranges simply allows you to fill a specified number buffers in another thread async. Most OSes also have socket buffers that serves the same purpose but async ranges allows you to specify the buffer size without being privileged. The standard max read buffer size for a tcp connection on the ubuntu I have is set to 112640 which is doubled by the kernel to 225280 bytes. A developer may very well like buffers larger that this without needing to set privileged kernel variables.
I still don't see how this could ever be useful in practice without the ability to poll for whether data is available. Is it possible to write two short programs that use the synchronous and asynchronous APIs in a way that makes a difference? -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Dec 05 2011
On Mon, 05 Dec 2011 23:36:27 +0200, Jonas Drewsen <jdrewsen nospam.com> wrote:It is - but as stated it all depends on buffer sizes and IO speed. It is for the same reason that java recommends using BufferedReader around sockets instead of reading directly from the socket stream. Another example that might show another async range advantage and is clearer: auto f1 = byChunkAsync("www.foo.com/file1.txt", 2^^19); auto f2 = byChunkAsync("www.foo.com/file2.txt", 2^^19); auto f3 = byChunkAsync("www.foo.com/file3.txt", 2^^19); // While this iteration goes is done the file2 and file3 are downloaded // in the background threads foreach (l; f1) { writeln(to!string(l)); } // f2 and f3 are probably downloaded now and no waiting is necessary foreach (l; f2) { writeln(to!string(l)); } foreach (l; f3) { writeln(to!string(l)); }
Thanks. Perhaps include this example in the documentation? -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Dec 05 2011
Here's my review. Remember, review ends on December 16. Overall, this library has massively improved due to the rounds of review it's been put through. I only found a few minor nitpicks. However, a recurring pattern is minor grammar mistakes in the documentation. Please proofread all documentation again. Docs: "The high level API is build" -> "The high level API is built" "LibCurl is licensed under a MIT/X derivate license" -> "LibCurl is licensed under an MIT/X derivative license" AutoConnect: "Connection type used when the url should be used to auto detect protocol." -> "auto detect THE protocol" Why is there a link to curl_easy_set_opt in the byLineAsync and byChunkAsync docs? In onSend: "The length of the void[] specifies the maximum number of bytes that can be send." -> "can be SENT" What is the use case for exposing struct Curl? I prefer if this were unexposed because we'll obviously be unable to provide a replacement if/when the backend to this library is rewritten in pure D. Actually, that leads to another question: Should this module really be named etc.curl/std.curl/std.net.curl, or should the fact that it currently uses Curl as a backend be relegated to an implementation detail? Code: pragma(lib) basically doesn't work on Linux because the object format doesn't support it. Don't rely on it. Should the protocol detection be case-insensitive, i.e. "ftp://" == "FTP://"?
Dec 11 2011
On 12/11/2011 7:53 PM, dsimcha wrote:Should the protocol detection be case-insensitive, i.e. "ftp://" == "FTP://"?
Oh, one more thing: Factor the protocol detection out into a function. You have the same expression cut and pasted everywhere: if(url.startsWith("ftp://") || url.startsWith("ftps://") ...
Dec 11 2011
On 12/12/11 6:25 AM, jdrewsen wrote:On Monday, 12 December 2011 at 01:05:25 UTC, Jonathan M Davis wrote:On Sunday, December 11, 2011 19:55:28 dsimcha wrote:On 12/11/2011 7:53 PM, dsimcha wrote:Should the protocol detection be case-insensitive, i.e. > "ftp://" == "FTP://"?
Oh, one more thing: Factor the protocol detection out into a function. You have the same expression cut and pasted everywhere: if(url.startsWith("ftp://") || url.startsWith("ftps://") ...
And startsWith will take multiple arguments, so the code definitely shouldn't have x.startsWith(y) || x.startsWith(z) || ... It's more verbose and less efficient that way. - Jonathan M Davis
I hadn't noticed that multiple arguments is accepted. Thanks. /Jonas
You can also pass a predicate for case insensitive comparison. Andrei
Dec 12 2011
Le 12/12/2011 01:53, dsimcha a écrit :Actually, that leads to another question: Should this module really be named etc.curl/std.curl/std.net.curl, or should the fact that it currently uses Curl as a backend be relegated to an implementation detail?
I think it's good to remind that this is a wrapper and not a native D implementation. At least if I was a user, I would like to know upfront. So I would keep curl in the name.
Dec 12 2011
On 2011-12-12 01:53, dsimcha wrote:Here's my review. Remember, review ends on December 16. What is the use case for exposing struct Curl? I prefer if this were unexposed because we'll obviously be unable to provide a replacement if/when the backend to this library is rewritten in pure D. Actually, that leads to another question: Should this module really be named etc.curl/std.curl/std.net.curl, or should the fact that it currently uses Curl as a backend be relegated to an implementation detail?
That's a good question. When I think about it, I think it should be considered an implementation detail.Code: pragma(lib) basically doesn't work on Linux because the object format doesn't support it. Don't rely on it. Should the protocol detection be case-insensitive, i.e. "ftp://" == "FTP://"?
I think so. -- /Jacob Carlborg
Dec 12 2011
On Sunday, December 11, 2011 19:55:28 dsimcha wrote:On 12/11/2011 7:53 PM, dsimcha wrote:Should the protocol detection be case-insensitive, i.e. "ftp://" == "FTP://"?
Oh, one more thing: Factor the protocol detection out into a function. You have the same expression cut and pasted everywhere: if(url.startsWith("ftp://") || url.startsWith("ftps://") ...
And startsWith will take multiple arguments, so the code definitely shouldn't have x.startsWith(y) || x.startsWith(z) || ... It's more verbose and less efficient that way. - Jonathan M Davis
Dec 11 2011
Le 03/12/2011 05:26, dsimcha a écrit :I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows?
Yes. This is a must.
Dec 12 2011
Somedude:1. Should libcurl be bundled with DMD on Windows?
Yes. This is a must.
I agree it's better to have this battery included. Bye, bearophile
Dec 12 2011
On Monday, 12 December 2011 at 10:26:52 UTC, Somedude wrote:Le 05/12/2011 22:36, Jonas Drewsen a écrit :Anyway - I _have_ said that I will add a data availability poll functionality so I guess your initial concern is also covered. /Jonas
Just an idea: would it be possible/useful to use the signals/slots mechanism for this kind of synch ?
This would be most useful if there were some kind of main loop. This is needed because the request is done in another thread and the response should be handled in the main thread. Without an main loop in the main thread or a destinct call to e.g. a poll() method the signal/slot s would not work. As a side note I would very much like D to enter a main loop with a task scheduler as part of the runtime. That way co-routines and event handling would be much nicer to use out of the box. For example: // The main function is running a fiber in the main loop. // the main loop terminates when no more fibers are active void main(string[] args) { fiber( { auto content= download("google.com"); write(content); } ); fiber( { sleep(2.0); writeln("Hello in 2 secs"); } ); writeln("first line written"); } The download() yields until data is downloaded and the sleep() yield until the time has passed. You wouldn't notice this when not using fibers because you would only have the main() fiber which terminates the main loop (and the program) when it is done. Optimally the runtime should support replacing the standard mainloop scheduler with your own. /Jonas
Dec 12 2011
On Monday, 12 December 2011 at 00:53:14 UTC, dsimcha wrote:Here's my review. Remember, review ends on December 16. Overall, this library has massively improved due to the rounds of review it's been put through. I only found a few minor nitpicks. However, a recurring pattern is minor grammar mistakes in the documentation. Please proofread all documentation again. Docs: "The high level API is build" -> "The high level API is built" "LibCurl is licensed under a MIT/X derivate license" -> "LibCurl is licensed under an MIT/X derivative license" AutoConnect: "Connection type used when the url should be used to auto detect protocol." -> "auto detect THE protocol"
okWhy is there a link to curl_easy_set_opt in the byLineAsync and byChunkAsync docs?
will fixIn onSend: "The length of the void[] specifies the maximum number of bytes that can be send." -> "can be SENT"
okWhat is the use case for exposing struct Curl? I prefer if this were unexposed because we'll obviously be unable to provide a replacement if/when the backend to this library is rewritten in pure D.
Initially it was not exposed but there were a couple of requests to expose it. Thats why.Actually, that leads to another question: Should this module really be named etc.curl/std.curl/std.net.curl, or should the fact that it currently uses Curl as a backend be relegated to an implementation detail?
I think it should have curl in the name. I do not believe that a native D network library should have the same API as this module. It is limited by the design of libcurl and should be improved by a native D net library.Code: pragma(lib) basically doesn't work on Linux because the object format doesn't support it. Don't rely on it.
okShould the protocol detection be case-insensitive, i.e. "ftp://" == "FTP://"?
yes Thanks for the feedback /Jonas
Dec 12 2011
On Monday, 12 December 2011 at 01:05:25 UTC, Jonathan M Davis wrote:On Sunday, December 11, 2011 19:55:28 dsimcha wrote:On 12/11/2011 7:53 PM, dsimcha wrote:Should the protocol detection be case-insensitive, i.e. "ftp://" == "FTP://"?
Oh, one more thing: Factor the protocol detection out into a function. You have the same expression cut and pasted everywhere: if(url.startsWith("ftp://") || url.startsWith("ftps://") ...
And startsWith will take multiple arguments, so the code definitely shouldn't have x.startsWith(y) || x.startsWith(z) || ... It's more verbose and less efficient that way. - Jonathan M Davis
I hadn't noticed that multiple arguments is accepted. Thanks. /Jonas
Dec 12 2011
On Monday, 12 December 2011 at 12:24:26 UTC, jdrewsen wrote:I think it should have curl in the name. I do not believe that a native D network library should have the same API as this module. It is limited by the design of libcurl and should be improved by a native D net library.
I agree with this, it's pointless to limit ourselves to an interface that was designed with the limitations of curl in mind. Using a generic name for this module gains us absolutely nothing.
Dec 12 2011
On Tuesday, 13 December 2011 at 00:47:26 UTC, David Nadlinger wrote:I don't know if you already have a solution in the works, but maybe the future interface I did for Thrift is similar to what you are looking for: http://klickverbot.at/code/gsoc/thrift/docs/thrift.util.future.html David
Doesn't std.parallelism's task parallelism API work for this? (Roughly speaking a task in std.parallelism == a future in your Thrift API.) If not, what can I do to fix it so that it can? Looking briefly at your API, one thing I notice is the ability to cancel a future. This would be trivial to implement in std.parallelism for tasks that haven't yet started executing, but difficult if not impossible for tasks that are already executing. Does your Thrift API allow cancelling futures that are already executing? If so, how is that accomplished? The TFutureAggregatorRange could be handled by a parallel foreach loop if I understand correctly, though it would look a little different.
Dec 12 2011
On Monday, 12 December 2011 at 14:42:17 UTC, Andrei Alexandrescu wrote:On 12/12/11 6:25 AM, jdrewsen wrote:On Monday, 12 December 2011 at 01:05:25 UTC, Jonathan M Davis wrote:On Sunday, December 11, 2011 19:55:28 dsimcha wrote:On 12/11/2011 7:53 PM, dsimcha wrote:Should the protocol detection be case-insensitive, i.e. > "ftp://" == "FTP://"?
Oh, one more thing: Factor the protocol detection out into a function. You have the same expression cut and pasted everywhere: if(url.startsWith("ftp://") || url.startsWith("ftps://") ...
And startsWith will take multiple arguments, so the code definitely shouldn't have x.startsWith(y) || x.startsWith(z) || ... It's more verbose and less efficient that way. - Jonathan M Davis
I hadn't noticed that multiple arguments is accepted. Thanks. /Jonas
You can also pass a predicate for case insensitive comparison. Andrei
Perfect
Dec 13 2011
On Tuesday, 13 December 2011 at 00:47:26 UTC, David Nadlinger wrote:On 12/12/11 1:18 PM, jdrewsen wrote:On Monday, 12 December 2011 at 10:26:52 UTC, Somedude wrote:Just an idea: would it be possible/useful to use the signals/slots mechanism for this kind of synch ?
This would be most useful if there were some kind of main loop. This is needed because the request is done in another thread and the response should be handled in the main thread. Without an main loop in the main thread or a destinct call to e.g. a poll() method the signal/slot s would not work.
I don't know if you already have a solution in the works, but maybe the future interface I did for Thrift is similar to what you are looking for: http://klickverbot.at/code/gsoc/thrift/docs/thrift.util.future.html David
I don't have any plans for this in the curl stuff. But it is something like your future implementation and a builtin mainloop I think would make async programming much easier in D. /Jonas
Dec 13 2011
On Friday, December 02, 2011 23:26:10 dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows?
I'd argue that it should be a separate download but that it should definitely be provided.2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
If we're going to follow a policy that wrappers around 3rd party libraries go in etc, then it should be etc.curl. Otherwise, I think that it should be std.net.curl. I'd argue that AutoConnection should be AutoConnect, because it's shorter and just as informative. I'd argue that the acronyms should be in all caps if camelcasing would require that the first letter of the acronym be capitalized and all lower case if camelcasing would require that the first letter of the acronym be lower case. We're not completely consistent in how we using acronyms in names in Phobos, but I believe that we primarily follow that rule when putting them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp. I'd rename T to C or Char in the template parameters, since it's a character type rather than a generic type. Not a huge deal, but I think that it makes the code quicker to understand. Why are some parameters const(char)[] and others string? Why not make them all const(char)[] or even actually templatize them on character type (one per parameter which is a string)? Shorten the URL in your examples. Its long length increases the risk of making the examples too wide. It's a made up URL, so there's no need for it to be that long. Something like foo.bar.com is probably more than enough. Why does byLine use '\x0a' instead of '\n'? '\n' is clear, whereas most people will have to look up what '\x0a' is, so I really think that it should be '\n'. Also, if byLine is going to return an internal range type instead of having an externally defined one, its documentation needs to explain what Char is used for. As it stands, it looks like a pointless template parameter. To do that, the return section should probably say something like "A range of Char[]..." All of this goes for byLineAsync as well. In general, the functions should do a better job of documenting their parameters. Many of them don't document their parameters at all. For instance, while it's fairly easy to guess what keepTerminator and terminator are used for, there is no explanation about them whatsoever in either byLine or byLineAsync's documentation. If any of the range return types of these functions have non-standard functions on them, they need to be properly documented. e.g. wait(Duration) is mentioned in passing in byLineAsync's documentation, but no explanation for its purpose is given. The documentation needs to explain everything the programmer needs to know about the return types. For most ranges, it's enough to say that the return type is a range of X, but if the return type has anything other than the standard range functions, then just saying that the function returns a range of X is not enough. The extra functions must be listed and explained. I'd warn you that the fact that the documentation for Protocol is showing up in the documentation is likely a bug, since Protocol is private - probably the bug that all templates are currently public even if they're marked private. So, either Protocol needs to be public (or maybe protected), or it's going to disappear from the documentation eventually. Another option would be to do something similar to std.container.TotalContainer and declare a struct which is intended only for documentation (probably in a version(StdDdoc) block), and put all of the protocol documenation on _it_ instead of the mixin. Also, Protocol needs a better explanation as to what it's actually for. "Mixin template for all supported curl protocols" doesn't really tell me anything other than the fact that it relates to the supported protocols somehow. Looking at the source, it seems clear enough, but not from the documentation - especially when the documentation seems to imply that it's specific to the Http struct. onReceiveHeader has const(char)[] parameters but mentions char[] in its documentation. The meaning is clear enough, but it would be more correct to say const(char)[] or to just mention the variable names explicitly. It would be nice if Smtp.mailTo was variadic - i.e. mailTo(string[] recipients...). All of the current use cases for it would be exactly the same with the addition of being able to do something like mailTo(str1, str2, str3). The exception type's constructors should look like this: nothrow this(string msg, string file = __FILE__, size_t line = __LINE__, Throwable next = null) { super(msg, file, line, next); } As it stands, the line number will be completely wrong. You can just copy std.string.StringException and rename it. And please use that exact signature. Don't make it take a const(char)[]. Exception requires a string, and with the constructor taking const(char)[], it's going to needlessly idup every string that gets passed into it. The code which calls it can idup if it has a char[] instead of an immutable(int)[]. Having it take a const(char)[] adds no value and is less efficient. The same goes for any other function which is always going to idup its const(char)[] argument. Just make it take a string. Why do CURLoption and CURLcode have CURL in uppercase? That doesn't match the rest of the module. I'd argue that they should be CurlOption and CurlCode. To make matters even weirder, your examples appear to use CurlOption but the declarations use CURLoption. They should be consistent. - Jonathan M Davis
Dec 14 2011
== Quote from jdrewsen (jdrewsen nospam.com)'s articleOn Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis wrote:
I'd argue that the acronyms should be in all caps if camelcasing would require that the first letter of the acronym be capitalized and all lower case if camelcasing would require that the first letter of the acronym be lower case. We're not completely consistent in how we using acronyms in names in Phobos, but I believe that we primarily follow that rule when putting them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.
then I'll do that. Actually I think I will make a pull request to extend the dlang.org/dstyle.html doc with and example that we can point to when someone asks about styling. I've spend too much time restyling because there is no such style doc already and people are complaining about style in reviews anyway.
(I'm sorry if this comes across as a repost, but I guess I don't know how to post using the mailing list.) I absolutely agree that the "D Style" page should be revised and extended to reflect the current standards that new Phobos modules are expected to adhere to. I found an open pull request that indicated a past effort in this regard: https://github.com/D-Programming-Language/d-programming-language.org/pull/16 (I thought that another document was also floating around that was a proposed new "Style Guide for Phobos", but I don't recall who authored it and I don't remember where it was located.) jcc7
Dec 17 2011
== Quote from jdrewsen (jdrewsen nospam.com)'s articleOn Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis wrote:
I'd argue that the acronyms should be in all caps if camelcasing would require that the first letter of the acronym be capitalized and all lower case if camelcasing would require that the first letter of the acronym be lower case. We're not completely consistent in how we using acronyms in names in Phobos, but I believe that we primarily follow that rule when putting them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.
then I'll do that. Actually I think I will make a pull request to extend the dlang.org/dstyle.html doc with and example that we can point to when someone asks about styling. I've spend too much time restyling because there is no such style doc already and people are complaining about style in reviews anyway.
(I'm sorry if this comes across as a repost, but I guess I don't know how to post using the mailing list.) I absolutely agree that the "D Style" page should be revised and extended to reflect the current standards that new Phobos modules are expected to adhere to. I found an open pull request that indicated a past effort in this regard: https://github.com/D-Programming-Language/d-programming-language.org/pull/16 (I thought that another document was also floating around that was a proposed new "Style Guide for Phobos", but I don't recall who authored it and I don't remember where it was located.) jcc7
Dec 17 2011
On Monday, December 12, 2011 01:53:13 dsimcha wrote:Actually, that leads to another question: Should this module really be named etc.curl/std.curl/std.net.curl, or should the fact that it currently uses Curl as a backend be relegated to an implementation detail?
It's too specific to curl for curl to be an implementation detail. I expect that if we were designing a general API or one of our own which didn't use curl at all, I wouldn't expect the API to be the same. - Jonathan M Davis
Dec 14 2011
Line# 235 is identical to line# 239. Shouldn't line# 235 be creating an Http
object, not an Ftp object? That mistake definitely makes it look like download
hasn't been properly tested.
You should create template similar to
template isCurlConn(Conn)
{
auto isCurlConn = is(Conn : Http) || is(Conn : Ftp) || is(Conn :
AutoConnection);
}
It would reduce code duplication.
The functions which have a template parameter T = char really need to have
that parameter properly explained in the documentation. From the
documentation, I would have thought that it was any character type, but it's
char or ubyte. There is no hint whatsoever in the documentation that ubyte
would work. All of the parameters need to be made clear. That goes for all
functions.
Is there a reason that the functions which have a template argument T which
can be either a char or a ubyte can't work with immutable char? Glancing at
_basicHttp, I don't see any reason why T couldn't be immutable char. Yes, it
would require casting to immutable(char)[], but you're already casting to
char[], and the data being returned appears to be unique such that it could be
safely cast to immutable. That being the case, I'd encourage you to not only
make it work with immutable char but to make immutable char the default
instead of char.
Once you've fixed the exception types like I requested in my first review, you
can and should use enforceEx!CurlException(cond, msg) instead of enforce(cond,
new CurlException(msg)).
it may not matter, since we're dealing with strings here, but I'd argue that
!empty should be used rather than length > 0 (e.g. line# 475). In stuff other
than strings it definitely can be more efficient, and even with strings, it may
be, since I think that checking whether the length is 0 (as empty does) is
slightly more efficient than checking whether it's greater than 0.
Please make sure that opening braces on on their own line. That's the way that
the rest of Phobos does it and it's one of the few formatting rules that we've
been trying to use consistently throughout Phobos. For the most part, you get
it right, but not everywhere - nested functions in particular seem to have
braces on the same line for some reason.
It's not a big deal, but you should use auto more. For instance, lines #626,
#638, and #640 don't use auto when they could.
I have no idea why you keep putting parens around uses of the in operator, but
it's not necessary and makes the code harder to read IMHO. It's certainly not
required that you change that, but I'd appreciate it if you did.
I see that regexes are used in the module. Please make sure that they still
work correctly with the new std.regex. They probably do, but it's not 100%
backwards compatible.
byLine's template constraint needs to verify that the types of Terminator and
Char are valid. As it is, I could try and pass it something like an Http
struct if I wanted to. In general, _all_ templates need to verify that their
arguments are of the appropriate type for _all_ of their arguments.
It's odd that popFront (line #780) is not safe when empty and front are. Does
findSplit or findSplitAfter prevent it?
With a name like Char on byLine, I'd expect it to take any char type, but not
only do you not verify the type (as previously mentioned), but you instantiate
_basicHttp with it, which works with char and ubyte, not any char. If Char is
going to take char and ubyte specifically, then Char is a bad name for it.
I'd suggest changing line# 823 to
auto result = get(url, isFtpUrl(url) ? FTP() : Http());
It would save several lines of code.
Line# 883 should be using +=.
This would be easier if I could comment on the code directly, but I don't seem
to be able to with the link that dsimcha provided (probably because it's blob;
there's probably another link which would allow direct commenting, but oh
well). I'll comment more on the implementation later, but bed beckons...
- Jonathan M Davis
Dec 15 2011
Please make sure that you remove trailing whitespace from the file. A lot of
the lines have trailing whitespace. Also, make sure that you don't have any
tabs in the file. There are a few places where you used tabs.
Line# 916 claims that the code there won't work and that it needs to be fixed,
so please fix it before it goes into Phobos (assuming that it's voted in).
On line# 1051, just do
auto arr = new Char[](bufferSize);
There's no reason to create it and then assign to its length property.
The braces in the block follow line# 1131 need to be fixed. The indentation is
wrong on some of them, and not all of them are on their own line.
You should see if the cast(void[])'s can be removed from byLineAsync. They
might still be necessary, but null was given a type with the last release, so
it might work without the cast now. The same goes with byChunkAsync.
Same with line# 1215 as 1051. There's no point in wasting a line of code by
declaring an array and the assigning to its length variable. The code is
probably slightly less efficient that way too. Just allocate a new array of the
correct length.
On line#1297, the braces need to be fixed. Same with line# 1308 and line# 1334.
Just check over your braces in general to make sure that they're consistent
and are on their own line unless you're dealing with something like a one line
lambda.
And as I mentioned before, all of these enforces which take a new
CurlException, should be changed to use enforceEx after you've fixed
CurlException's constructor.
The documentation on Protocol's isValid is wrong. It claims that isValid is
true if the instance is stoppend and invalid. Wouldn't that mean that the
object _wasn't_ valid?
There are quite a few places where you're concatenating several strings and
format would make the code much cleaner (e.g. Protocol's netInterface
function). Unless you're avoiding format in an attempt to allow functions to
be pure (since unfortunately, format can't currently be pure), you really
should be using format.
I'd argue that it's better to use empty than check whether a string is equal
to "". e.g. domain.empty instead of domain != "" on line# 1538.
There bodies of Http's constructor, static opCall, and dup functions are all
very similar - especially the constructor and static opCall. You should look
at refactoring those so that they don't have to duplicate so much code. Maybe
it can't be reasonably done easily, but it would certainly be better if that
code duplication could be reduced.
Change setTimeCondition to take a SysTime. DateTime is not intended for
timestamps. It's intended for calendar-based time, not the system time. Also,
SysTime makes it easier to convert to time_t. In general, the fact that you're
looking to deal with a time_t is a sign that you should be using SysTime. This
is especially true, since DateTime is going to give the wrong time_t in
general, since it has no time zone. As it stands, the timestamp is assumed to
be in UTC. That's just begging for bugs. Users won't expect that. Definitely
make it a SysTime. And once that's done, line# 2016 would disappear, and line#
2017 would become
p.curl.set(CurlOption.timevalue, timestamp.toUnixTime());
On line# 2247, why do you use format with no format string? There should be no
~ or to!string in a call to format. It's just creating unnecessary memory
allocations. It should be something more like
format("%s%s(%s.", code, reason, majorVersion, minorVersion);
Though I find that '(' to be rather odd, since it has no matching closing
paren, so one may need to be added. Regardless, that line shouldn't be
concatenating strings or use conv.to. format takes care of all of that and
should do it with fewer memory allocations.
Pointers really should have their * next to the type not floating in space like
on line #2274. As it is, the code looks like a multiplication. Unlike in
C/C++, the * is clearly associated with the type. e.g.
int* a, b;
creates an int* and an int in C, but it creates int* and int* in D. Separating
the * from the type has a tendancy to make the code harder to understand.
Ftp's consturctor and static opCall have a code duplication problem similar to
that of Http.
Make encoding take a string. Any time that you are _always_ going to idup a
parameter which is a character array, make it a string, _not_ const. That way,
if it's a string, no idup is necessary, and if it's char[], then it can be
iduped when it's passed in, and iduping only occurs when it's actually
necessary.
In Smtp's constructor, yo ushould probably create a variable for the toLowered
url. e.g.
auto lowered = url.toLower();
That way, you avoid having to lower it twice if the else branch of the if-else
is taken.
Is p.curl.perform a property? If so, line# 2455 does nothing. If not, then it
needs parens. The curl module should be able to be built with -property. I
believe that Phobos as a whole is currently being built with tha flag. If not,
it will be soon, and this module will need to do that if it's merged into
Phobos.
If you can, please use a static foreach with EnumMembers!CurlOption in Curl's
dup. And if you can't, because you're not clearing all of them, then put all
of the ones that you _are_ clearing in an array (or TypeTuple if you want to
avoid the allocation for the array) and iterate over them with a foreach. e.g.
foreach(option; TypeTuple!(CurlOption.file, CurlOptions.writefunction))
copy.clear(option);
You should be able to cut down on the number of lines by doing that
(_especially_ if you're actually clearing all of them and can use
EnumMembers!CurlOption).
Line#2639 is another case where format should be used. In general, if you're
doing more than one ~, consider using format unless you're trying to make a
function pure.
I'd suggest renaming Message to something like CurlMessage. The odds of a name
clash with Message are likely high. And while they won't be able to use
Message, since it's private, it will still affect whether the full path for
Message must be given (e.g. my.module.Message instead of Message). CurlMessage
is far less likely to clash.
Rename DATA to Data. DATA does not follow Phobos' naming conventions. Granted,
it's private, but it's completely off. Type names should be pascal cased.
Why isn't empty a property on #3056? Sure, it's not a range, but it would be
more consistent with everything else if it were a property.
In general, you should use when converting between TickDuration and Duration,
but in some cases, no conversion should be necessary. For instance, line# 985
shouldn't need to do any converting. Duration's opOpAssign should be able to
handle a TickDuration.
Overall, the design looks fairly solid, and the code looks pretty good, but
there are definitely a number of minor issues which should be addressed before
this code makes it into Phobos. The biggest involve the functions' parameters
(such as using SysTime, not DateTime, and using string in some places where
you're using const(char)[] in order to avoid unnecessary idups). And those
definitely need to be sorted out sooner rather than later, since they affect
the
API and will break code if they're changed later.
- Jonathan M Davis
Dec 16 2011
On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis wrote:On Friday, December 02, 2011 23:26:10 dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows?
I'd argue that it should be a separate download but that it should definitely be provided.2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
If we're going to follow a policy that wrappers around 3rd party libraries go in etc, then it should be etc.curl. Otherwise, I think that it should be std.net.curl. I'd argue that AutoConnection should be AutoConnect, because it's shorter and just as informative.
AutoConnect sounds like a command to connection automatically which might be confusing since that is not what it does. Therefore I went with AutoConnection which I still believe is better.I'd argue that the acronyms should be in all caps if camelcasing would require that the first letter of the acronym be capitalized and all lower case if camelcasing would require that the first letter of the acronym be lower case. We're not completely consistent in how we using acronyms in names in Phobos, but I believe that we primarily follow that rule when putting them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.
I like Http better as personal taste. But if HTTP is preferred then I'll do that. Actually I think I will make a pull request to extend the dlang.org/dstyle.html doc with and example that we can point to when someone asks about styling. I've spend too much time restyling because there is no such style doc already and people are complaining about style in reviews anyway.I'd rename T to C or Char in the template parameters, since it's a character type rather than a generic type. Not a huge deal, but I think that it makes the code quicker to understand.
I've named it T because it can also be ubyte. In that case C for Char is confusing.Why are some parameters const(char)[] and others string? Why not make them all const(char)[] or even actually templatize them on character type (one per parameter which is a string)?
The parameters that are const(char)[] will accept both string and char[] which is what I want because libcurl make an internal copy of it anyway. If I typed it as string then you would have to idup a char[] for the parameter without any reason. The parameters that are string is internally passed to functions (in other std modules) that only accept strings. I could accept const(char)[] and idup it in order to have consistant parameter types but that gives an unnecessary overhead.Shorten the URL in your examples. Its long length increases the risk of making the examples too wide. It's a made up URL, so there's no need for it to be that long. Something like foo.bar.com is probably more than enough.
Actually the long urls point to a google appspot app that fits the example. I've done it this way so that people can just copy/paste the example and have something working. Not too many people have a test HTTP POST server setup for a quick test AFAIK.Why does byLine use '\x0a' instead of '\n'? '\n' is clear, whereas most people will have to look up what '\x0a' is, so I really think that it should be '\n'.
I looked at how it was done elsewhere in phobos and did like that: http://dlang.org/phobos/std_stdio.html#byLine But I agree with you that \n is better and will change it.Also, if byLine is going to return an internal range type instead of having an externally defined one, its documentation needs to explain what Char is used for. As it stands, it looks like a pointless template parameter. To do that, the return section should probably say something like "A range of Char[]..." All of this goes for byLineAsync as well.
Ok.In general, the functions should do a better job of documenting their parameters. Many of them don't document their parameters at all. For instance, while it's fairly easy to guess what keepTerminator and terminator are used for, there is no explanation about them whatsoever in either byLine or byLineAsync's documentation.
Again I looked at phobos to see what was current the style: http://dlang.org/phobos/std_stdio.html#byLine I'll include some more detail though.If any of the range return types of these functions have non-standard functions on them, they need to be properly documented. e.g. wait(Duration) is mentioned in passing in byLineAsync's documentation, but no explanation for its purpose is given. The documentation needs to explain everything the programmer needs to know about the return types. For most ranges, it's enough to say that the return type is a range of X, but if the return type has anything other than the standard range functions, then just saying that the function returns a range of X is not enough. The extra functions must be listed and explained.
This is what is currently documented: "If no data is available and the main thread access the range it will block until data becomes available. An exception to this is the $(D wait(Duration)) method on the range. This method will wait at maximum for the specified duration and return true if data is available." I guess that explains it?I'd warn you that the fact that the documentation for Protocol is showing up in the documentation is likely a bug, since Protocol is private - probably the bug that all templates are currently public even if they're marked private. So, either Protocol needs to be public (or maybe protected), or it's going to disappear from the documentation eventually. Another option would be to do something similar to std.container.TotalContainer and declare a struct which is intended only for documentation (probably in a version(StdDdoc) block), and put all of the protocol documenation on _it_ instead of the mixin.
Ok. didn't know about the private template bug. I really think it is a bug in ddoc that it should be able to insert mixed in docs ie. the Protocol documentation should be mixed into the Http documentation. Anyway - I don't see the TotalContainer docs showing in the generated html? Another workaround is to manually copy the Protocol docs into the HTTP/FTP/SMTP structs inside a version(StdDdoc).Also, Protocol needs a better explanation as to what it's actually for. "Mixin template for all supported curl protocols" doesn't really tell me anything other than the fact that it relates to the supported protocols somehow. Looking at the source, it seems clear enough, but not from the documentation - especially when the documentation seems to imply that it's specific to the Http struct.
I'll remove the HTTP hint and improve the doc.onReceiveHeader has const(char)[] parameters but mentions char[] in its documentation. The meaning is clear enough, but it would be more correct to say const(char)[] or to just mention the variable names explicitly.
ok.It would be nice if Smtp.mailTo was variadic - i.e. mailTo(string[] recipients...). All of the current use cases for it would be exactly the same with the addition of being able to do something like mailTo(str1, str2, str3).
okThe exception type's constructors should look like this: nothrow this(string msg, string file = __FILE__, size_t line = __LINE__, Throwable next = null) { super(msg, file, line, next); } As it stands, the line number will be completely wrong. You can just copy std.string.StringException and rename it. And please use that exact signature. Don't make it take a const(char)[]. Exception requires a string, and with the constructor taking const(char)[], it's going to needlessly idup every string that gets passed into it. The code which calls it can idup if it has a char[] instead of an immutable(int)[]. Having it take a const(char)[] adds no value and is less efficient. The same goes for any other function which is always going to idup its const(char)[] argument. Just make it take a string.
okWhy do CURLoption and CURLcode have CURL in uppercase? That doesn't match the rest of the module. I'd argue that they should be CurlOption and CurlCode. To make matters even weirder, your examples appear to use CurlOption but the declarations use CURLoption. They should be consistent.
ok Thanks for the comments. -Jonas
Dec 17 2011
--f46d04083875582ee704b44b4685 Content-Type: text/plain; charset=ISO-8859-1 On Sat, Dec 17, 2011 at 5:56 AM, jdrewsen <jdrewsen nospam.com> wrote:On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis wrote:...
I'd argue that the acronyms should be in all caps if camelcasing wouldrequire that the first letter of the acronym be capitalized and all lower case if camelcasing would require that the first letter of the acronym be lower case. We're not completely consistent in how we using acronyms in names in Phobos, but I believe that we primarily follow that rule when putting them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.
I like Http better as personal taste. But if HTTP is preferred then I'll do that. Actually I think I will make a pull request to extend the dlang.org/dstyle.html doc with and example that we can point to when someone asks about styling. I've spend too much time restyling because there is no such style doc already and people are complaining about style in reviews anyway.
I absolutely agree that the "D Style" page should be revised and extended to reflect the current standards that new Phobos modules are expected to adhere to. I found an open pull request that indicates a past effort in this regard: https://github.com/D-Programming-Language/d-programming-language.org/pull/16 (I thought that another document was also floating around that was a proposed new "Style Guide for Phobos", but I don't recall who authored it and I don't remember where it was located.) jcc7 --f46d04083875582ee704b44b4685 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On Sat, Dec 17, 2011 at 5:56 AM, jdrewsen <span dir=3D"ltr"><<a href=3D"= mailto:jdrewsen nospam.com">jdrewsen nospam.com</a>></span> wrote:<br><d= iv class=3D"gmail_quote"><blockquote class=3D"gmail_quote" style=3D"margin:= 0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div class=3D"im">On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M= Davis wrote:<br> </div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-l= eft:1px #ccc solid;padding-left:1ex"><div class=3D"im"><blockquote class=3D= "gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding= -left:1ex"> <br></blockquote></div></blockquote></blockquote><div>...=A0</div><blockquo= te class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc so= lid;padding-left:1ex"><div class=3D"im"> <br> <blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p= x #ccc solid;padding-left:1ex"> I'd argue that the acronyms should be in all caps if camelcasing would = require that the first letter of the acronym be capitalized and all lower c= ase if camelcasing would require that the first letter of the acronym be lo= wer case. We're not completely consistent in how we using acronyms in n= ames in Phobos, but I believe that we primarily follow that rule when putti= ng them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP = rather than Http, Ftp, and Smtp.<br> </blockquote> <br></div> I like Http better as personal taste. But if HTTP is preferred then I'l= l do that. Actually I think I will make a pull request to extend the <a hre= f=3D"http://dlang.org/dstyle.html" target=3D"_blank">dlang.org/dstyle.html<= /a> doc with and example that we can point to when someone asks about styli= ng. I've spend too much time restyling because there is no such style d= oc already and people are complaining about style in reviews anyway.</block= quote> <div><br></div><div>I absolutely agree that the "D Style" page sh= ould be revised and extended to reflect the current standards that new Phob= os modules are expected to adhere to.=A0</div><div><br></div><div>I found a= n open pull request that indicates a past effort in this regard:</div> <div><a href=3D"https://github.com/D-Programming-Language/d-programming-lan= guage.org/pull/16">https://github.com/D-Programming-Language/d-programming-= language.org/pull/16</a> </div><div><br></div><div>(I thought that another document was also floatin= g around that was a proposed new "Style Guide for Phobos", but I = don't recall who authored it and I don't remember where it was loca= ted.)</div> <div><br></div><div>jcc7</div></div> --f46d04083875582ee704b44b4685--
Dec 17 2011
The docs for onReceiveHeader, http://freeze.steamwinter.com/D/web/phobos/etc_curl.html#onReceiveHeader explicitly says that the const string parameters are not valid after the function returns. This is all well and good; but a minor improvement would be to change the signature to: property void onReceiveHeader(void delegate(in char[] key, in char[] value) callback); to add `scope` to the parameters. This documents the fact that the constant strings shouldn't be escaped as-is in both code and documentation. I suspect many other instances of `const(char)[]` could be changed to `in char[]` to better document how they use the parameters (I also personally think it looks better as it is more concise).
Dec 17 2011
On Saturday, 17 December 2011 at 11:56:04 UTC, jdrewsen wrote:AutoConnect sounds like a command to connection automatically which might be confusing since that is not what it does. Therefore I went with AutoConnection which I still believe is better.
How about something completely different then, like URLInfer? Also, some instances in the docs of "url" in lowercase should probably be changed to uppercase.
Dec 17 2011
On Thursday, 15 December 2011 at 08:51:29 UTC, Jonathan M Davis wrote:Line# 235 is identical to line# 239. Shouldn't line# 235 be creating an Http object, not an Ftp object? That mistake definitely makes it look like download hasn't been properly tested.
It has been tested as you can see by the unittest just below the function. It just did not fail because libcurl requires the same setup for this download situation for both ftp and http. Will fix it of course.You should create template similar to template isCurlConn(Conn) { auto isCurlConn = is(Conn : Http) || is(Conn : Ftp) || is(Conn : AutoConnection); } It would reduce code duplication.
okThe functions which have a template parameter T = char really need to have that parameter properly explained in the documentation. From the documentation, I would have thought that it was any character type, but it's char or ubyte. There is no hint whatsoever in the documentation that ubyte would work. All of the parameters need to be made clear. That goes for all functions.
okIs there a reason that the functions which have a template argument T which can be either a char or a ubyte can't work with immutable char? Glancing at _basicHttp, I don't see any reason why T couldn't be immutable char. Yes, it would require casting to immutable(char)[], but you're already casting to char[], and the data being returned appears to be unique such that it could be safely cast to immutable. That being the case, I'd encourage you to not only make it work with immutable char but to make immutable char the default instead of char.
Is is indeed unique and can be cast to immutable. I'll add that option. Why do you think immutable char as the default is better than char? I know that the return type in that case would be string and not char[] - but why is that better?Once you've fixed the exception types like I requested in my first review, you can and should use enforceEx!CurlException(cond, msg) instead of enforce(cond, new CurlException(msg)).
okit may not matter, since we're dealing with strings here, but I'd argue that !empty should be used rather than length > 0 (e.g. line# 475). In stuff other than strings it definitely can be more efficient, and even with strings, it may be, since I think that checking whether the length is 0 (as empty does) is slightly more efficient than checking whether it's greater than 0.
okPlease make sure that opening braces on on their own line. That's the way that the rest of Phobos does it and it's one of the few formatting rules that we've been trying to use consistently throughout Phobos. For the most part, you get it right, but not everywhere - nested functions in particular seem to have braces on the same line for some reason.
When I look through the code it seems ok with regards to braces on own line. Maybe by nested functions you mean delegates? The delegates I use do indeed have braces on same line which is ok afaik.It's not a big deal, but you should use auto more. For instance, lines #626, #638, and #640 don't use auto when they could.
I'm a bit in doubt. On one hand it is great to make everything auto in order to make it easy to change type and to remove redundancy. One the other hand it is very convenient to be able to see the type when you read the code. I know that a clever editor would be able to figure out the type for me. But I also read code in normal editors for diffs etc. or on github. Anyway - I've changed as much as possible to auto now.I have no idea why you keep putting parens around uses of the in operator, but it's not necessary and makes the code harder to read IMHO. It's certainly not required that you change that, but I'd appreciate it if you did.
Bad habit. Will remove.I see that regexes are used in the module. Please make sure that they still work correctly with the new std.regex. They probably do, but it's not 100% backwards compatible.
okbyLine's template constraint needs to verify that the types of Terminator and Char are valid. As it is, I could try and pass it something like an Http struct if I wanted to. In general, _all_ templates need to verify that their arguments are of the appropriate type for _all_ of their arguments.
okIt's odd that popFront (line #780) is not safe when empty and front are. Does findSplit or findSplitAfter prevent it?
exactlyWith a name like Char on byLine, I'd expect it to take any char type, but not only do you not verify the type (as previously mentioned), but you instantiate _basicHttp with it, which works with char and ubyte, not any char. If Char is going to take char and ubyte specifically, then Char is a bad name for it.
I'll verify the type. For byLine it will only accept char types.I'd suggest changing line# 823 to auto result = get(url, isFtpUrl(url) ? FTP() : Http());
Not possible. get() is a template function where the template parameter is connection type ie. one of FTP or HTTP. isFtpUrl(url) is evaluated at runtime and not compile time.It would save several lines of code. Line# 883 should be using +=.
okThis would be easier if I could comment on the code directly, but I don't seem to be able to with the link that dsimcha provided (probably because it's blob; there's probably another link which would allow direct commenting, but oh well). I'll comment more on the implementation later, but bed beckons... - Jonathan M Davis
Thanks for this second round of comments /Jonas
Dec 17 2011
On 12/2/11 10:26 PM, dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows?
Yes.2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
std.net.curlCode: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
I have a few comments. They are not critical for library's admission (apologies I didn't make the deadline) and could be integrated before the commit. 1. Tables must be centered (this is a note to self) 2. Replace d-p-l.org in examples with dlang.org 3. "Compared to using libcurl directly this module provides a simpler API for performing common tasks." -> "Compared to using libcurl directly this module allows simpler client code for common uses, requires no unsafe operations, and integrates better with the rest of the language." 4. From the high level ops it seems there's no way to issue a PUT/POST and then direct the result to a file, or read the result with a foreach. 5. Also there seems to be no way to issue a PUT/POST from a range. 6. s/Examples:/Example:/ 7. s/HTTP http = HTTP("www.d-p-l.org");/auto http = HTTP("www.d-p-l.org");/ 8. "receives a header or a data buffer" -> "receives a header and a data buffer, respectively" 9. "In this simple example, the headers are writting to stdout and the data is ignored." How can we make it to stop the request? A sentence about that would be great. 10. "Finally the HTTP request is started by calling perform()." -> "Finally the HTTP request is effected by calling perform(), which is synchronous." 11. Use InferredProtocol instead of AutoConnect(ion). The type is exactly what it claims. If "inferred" is too long, "AutoProtocol" would be fine too. But it's not the connection that is automated. The word must not be a verb, i.e. InferProtocol/DeduceProtocol would not be good. 12. The example for InferredProtocol nee AutoConnection does not show any interesting example, e.g. ftp. 13. I think download() and upload() are good convenience functions but are very limited. Such operations can last arbitrarily long and it's impossible to stop them. Therefore, they will foster poor program design (e.g. program hangs until you kill it etc). We should keep these for convenience in scripts, and we already have byXxx() for downloading, but we don't have anything for uploading. We should offer output ranges for uploading sync/async. Maybe in the next minor release. 14. Parameter doc for put(), post(), del() etc. is messed up. 15. I don't think "connection" is the right term. For example we have T[] get(Conn = AutoConnection, T = char)(const(char)[] url, Conn conn = Conn()); The connection is established only during the call to get(), so conn is not really a connection - more like a sort of a bundle of connection parameters. But then it does clean up the connection once established, so I'm unsure... 16. For someone who'd not versed in HTTP options, the example string content = options("d-programming-language.appspot.com/testUrl2", "something"); isn't terribly informative. 17. In byLine: "A range of lines is returned when the request is complete." This suggests the last byte has been read as the client gets to the first, which is not the case. "Data will be read on demand as the foreach loop makes progress." Similar for byChunk. 18. In byXxxAsync wait(Duration) should be cross-referenced to its definition. ========================= Overall: I think this is a valuable addition to Phobos, but I have the feeling we don't have a good scenario for interrupting connections. For example, if someone wants to offer a "Cancel" button, the current API does not give them a robust option to do so: all functions that transfer data are potentially blocking, and there's no thread-shared way to interrupt a connection in a thread from another thread. Such functionality may be a pure addition later, so I think we can commit this as is. Please use std.net.curl. Thanks, Andrei
Dec 17 2011
On Saturday, 17 December 2011 at 22:25:07 UTC, Andrei Alexandrescu wrote:On 12/2/11 10:26 PM, dsimcha wrote:I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows?
Yes.2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab )
std.net.curl
okCode: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
I have a few comments. They are not critical for library's admission (apologies I didn't make the deadline) and could be integrated before the commit. 1. Tables must be centered (this is a note to self) 2. Replace d-p-l.org in examples with dlang.org
ok3. "Compared to using libcurl directly this module provides a simpler API for performing common tasks." -> "Compared to using libcurl directly this module allows simpler client code for common uses, requires no unsafe operations, and integrates better with the rest of the language."
ok4. From the high level ops it seems there's no way to issue a PUT/POST and then direct the result to a file, or read the result with a foreach.
Use byLine/Chunk with a connection initialized with post/put data: auto http = HTTP(); http.postData = "hello world"; foreach (line; byLine("dlang.org/there", KeepTerminator.no, '\n', http)) writeln(line); Same deal for directing to file using download()5. Also there seems to be no way to issue a PUT/POST from a range.
Correct. It will be a later extension.6. s/Examples:/Example:/
ok7. s/HTTP http = HTTP("www.d-p-l.org");/auto http = HTTP("www.d-p-l.org");/
ok8. "receives a header or a data buffer" -> "receives a header and a data buffer, respectively"
ok9. "In this simple example, the headers are writting to stdout and the data is ignored." How can we make it to stop the request? A sentence about that would be great.
ok10. "Finally the HTTP request is started by calling perform()." -> "Finally the HTTP request is effected by calling perform(), which is synchronous."
ok11. Use InferredProtocol instead of AutoConnect(ion). The type is exactly what it claims. If "inferred" is too long, "AutoProtocol" would be fine too. But it's not the connection that is automated. The word must not be a verb, i.e. InferProtocol/DeduceProtocol would not be good.
ok12. The example for InferredProtocol nee AutoConnection does not show any interesting example, e.g. ftp.
improved13. I think download() and upload() are good convenience functions but are very limited. Such operations can last arbitrarily long and it's impossible to stop them. Therefore, they will foster poor program design (e.g. program hangs until you kill it etc). We should keep these for convenience in scripts, and we already have byXxx() for downloading, but we don't have anything for uploading. We should offer output ranges for uploading sync/async. Maybe in the next minor release.
Using the same method as for post it is possible to set timeouts. auto http = HTTP(); http.dataTimeout = dur!"minutes"(2); foreach (line; byLine("dlang.org/there", KeepTerminator.no, '\n', http)) writeln(line); Regarding uploading ranges this will be handled by the same extension as for POST/PUT output ranges.14. Parameter doc for put(), post(), del() etc. is messed up.
ok15. I don't think "connection" is the right term. For example we have T[] get(Conn = AutoConnection, T = char)(const(char)[] url, Conn conn = Conn()); The connection is established only during the call to get(), so conn is not really a connection - more like a sort of a bundle of connection parameters. But then it does clean up the connection once established, so I'm unsure...
But you can provide an existing connection to use has that is already connected to the server. Hence the name connection.16. For someone who'd not versed in HTTP options, the example string content = options("d-programming-language.appspot.com/testUrl2", "something"); isn't terribly informative.
ok - improved17. In byLine: "A range of lines is returned when the request is complete." This suggests the last byte has been read as the client gets to the first, which is not the case. "Data will be read on demand as the foreach loop makes progress." Similar for byChunk.
This will change when multi API is supported as well.18. In byXxxAsync wait(Duration) should be cross-referenced to its definition.
ok========================= Overall: I think this is a valuable addition to Phobos, but I have the feeling we don't have a good scenario for interrupting connections. For example, if someone wants to offer a "Cancel" button, the current API does not give them a robust option to do so: all functions that transfer data are potentially blocking, and there's no thread-shared way to interrupt a connection in a thread from another thread. Such functionality may be a pure addition later, so I think we can commit this as is. Please use std.net.curl.
They can abort from the callback handlers - but otherwise it is a limitation of the libcurl easy API. To actually do this the multi extension would have to be added. Thank you for the comments and sorry for the late reply. /Jonas
Dec 30 2011
On Saturday, 17 December 2011 at 20:38:46 UTC, Jakob Ovrum wrote:On Saturday, 17 December 2011 at 11:56:04 UTC, jdrewsen wrote:AutoConnect sounds like a command to connection automatically which might be confusing since that is not what it does. Therefore I went with AutoConnection which I still believe is better.
How about something completely different then, like URLInfer? Also, some instances in the docs of "url" in lowercase should probably be changed to uppercase.
Will change case. Don't know if the URLInfer name is better though. /Jonas
Dec 17 2011
On Saturday, 17 December 2011 at 22:25:10 UTC, jdrewsen wrote:On Saturday, 17 December 2011 at 20:38:46 UTC, Jakob Ovrum wrote:On Saturday, 17 December 2011 at 11:56:04 UTC, jdrewsen wrote:AutoConnect sounds like a command to connection automatically which might be confusing since that is not what it does. Therefore I went with AutoConnection which I still believe is better.
How about something completely different then, like URLInfer? Also, some instances in the docs of "url" in lowercase should probably be changed to uppercase.
Will change case. Don't know if the URLInfer name is better though. /Jonas
Andrei is right, a noun is better. I'm with InferredProtocol.
Dec 17 2011
On Saturday, December 17, 2011 12:56:02 jdrewsen wrote:On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M DavisI'd argue that the acronyms should be in all caps if camelcasing would require that the first letter of the acronym be capitalized and all lower case if camelcasing would require that the first letter of the acronym be lower case. We're not completely consistent in how we using acronyms in names in Phobos, but I believe that we primarily follow that rule when putting them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.
I like Http better as personal taste. But if HTTP is preferred then I'll do that. Actually I think I will make a pull request to extend the dlang.org/dstyle.html doc with and example that we can point to when someone asks about styling. I've spend too much time restyling because there is no such style doc already and people are complaining about style in reviews anyway.
I actually have such a pull request, but it's been languishing because there are some items in it that need to be discussed. I keep forgetting to bring them up in the Phobos news group so that they can be decided and we can move on with that. https://github.com/D-Programming-Language/d-programming-language.org/pull/16I'd rename T to C or Char in the template parameters, since it's a character type rather than a generic type. Not a huge deal, but I think that it makes the code quicker to understand.
I've named it T because it can also be ubyte. In that case C for Char is confusing.
Yeah. I got that after reading the source, but the needs to be clearer in the documentation. Normally, when something is templated the way that that is, it's going to be templated on character type. And something which can be either ubyte or char but nothing else is rather abnormal. It needs to be clearer. And it's possible that at some point, it really should be changed to be any character type and ubyte rather than just char or ubyte (depending on how that affect efficiency - basically if it's more efficient to make it work with dchar from the get-go rather then get a string and convert it, then it's more desirable to make it work with wchar and dchar, but if you're just going to have to duplicate it or convert it anyway, then you might as well just restrict it to char and ubyte).Why are some parameters const(char)[] and others string? Why not make them all const(char)[] or even actually templatize them on character type (one per parameter which is a string)?
The parameters that are const(char)[] will accept both string and char[] which is what I want because libcurl make an internal copy of it anyway. If I typed it as string then you would have to idup a char[] for the parameter without any reason. The parameters that are string is internally passed to functions (in other std modules) that only accept strings. I could accept const(char)[] and idup it in order to have consistant parameter types but that gives an unnecessary overhead.
Okay. In general, functions should either take string or take a range of dchar. Taking a range of dchar is the most flexible, so it's generally the best (though this often results in specializations within the template for narrow strings - std.algorithm does that a lot). If you need a string spefically (e.g. all of the std.string functions specifically operate on strings), then taking an array that's templated on character type is generally better, because it's more flexible. If you need a string internally, then do _not_ use const(char)[]. or const(C) []. When you do that, you're forced to idup the string (or to!string is forced to idup it). It's far better to either take string (in which case the caller will idup if it's necessary) or to not use const. That way, the string is copied only when it actually needs to be copied. That does sometimes result in functions which take string for some parameters and const(char)[] for others, which looks odd to the programmer calling it, but that's life I guess. It's less of an issue when ranges are used, however, because then they're all ranges and it's just how theyre treated internally, I guess. I don't know how well you've done with this in general without looking over the code again. I know that you've generally taken const(char)[] instead of string, and sometimes you've iduped that. And those parameters needs to be fixed to take string and use to!string so that the iduping isn't always necessary. Also, since in most cases, you're ultimately passing the strings to the C curl API, the benefit of operating on ranges of dchar instead of strings is debatable, so it's not necessarily a problem that you're not taking ranges of dchar. You do, however, need to make sure that you use string rather than const(char)[] in places where you're iduping so that we can avoid such unnecessary allocations.Shorten the URL in your examples. Its long length increases the risk of making the examples too wide. It's a made up URL, so there's no need for it to be that long. Something like foo.bar.com is probably more than enough.
Actually the long urls point to a google appspot app that fits the example. I've done it this way so that people can just copy/paste the example and have something working. Not too many people have a test HTTP POST server setup for a quick test AFAIK.
I thought that it was fake. Well, if you can, use a shorter URL. Like dlang instead of d-programming-language now that we havethat domain.Why does byLine use '\x0a' instead of '\n'? '\n' is clear, whereas most people will have to look up what '\x0a' is, so I really think that it should be '\n'.
I looked at how it was done elsewhere in phobos and did like that: http://dlang.org/phobos/std_stdio.html#byLine But I agree with you that \n is better and will change it.
std.stdio should be changed then IMHO.In general, the functions should do a better job of documenting their parameters. Many of them don't document their parameters at all. For instance, while it's fairly easy to guess what keepTerminator and terminator are used for, there is no explanation about them whatsoever in either byLine or byLineAsync's documentation.
Again I looked at phobos to see what was current the style: http://dlang.org/phobos/std_stdio.html#byLine I'll include some more detail though.
Phobos' documentation can definitely be improved in some cases, and I think that this is one of them.If any of the range return types of these functions have non-standard functions on them, they need to be properly documented. e.g. wait(Duration) is mentioned in passing in byLineAsync's documentation, but no explanation for its purpose is given. The documentation needs to explain everything the programmer needs to know about the return types. For most ranges, it's enough to say that the return type is a range of X, but if the return type has anything other than the standard range functions, then just saying that the function returns a range of X is not enough. The extra functions must be listed and explained.
This is what is currently documented: "If no data is available and the main thread access the range it will block until data becomes available. An exception to this is the $(D wait(Duration)) method on the range. This method will wait at maximum for the specified duration and return true if data is available." I guess that explains it?
I really think that you need to state specifically that the returned range has an the additional wait(Duration) function on it and explain directly what it's for and how it's used. Also, put it in the example. The way it states it by saying "An exception to this is..." makes the fact that wait(Duration) is mentioned at all seem like an afterthought, and while it does give some explanation for it, I really don't think that it's clear enough how it's used and what it does. It's enough to start to get the idea, but not enough to actually use it. An example would probably be the biggest help in that regard, but I do think that the function needs to be explicitly documented with a description of what it does rather than seemingly mentioning it as an afterthought - especially since it also raises the question of what other non- standard functions the range has that aren't mentioned.I'd warn you that the fact that the documentation for Protocol is showing up in the documentation is likely a bug, since Protocol is private - probably the bug that all templates are currently public even if they're marked private. So, either Protocol needs to be public (or maybe protected), or it's going to disappear from the documentation eventually. Another option would be to do something similar to std.container.TotalContainer and declare a struct which is intended only for documentation (probably in a version(StdDdoc) block), and put all of the protocol documenation on _it_ instead of the mixin.
Ok. didn't know about the private template bug. I really think it is a bug in ddoc that it should be able to insert mixed in docs ie. the Protocol documentation should be mixed into the Http documentation. Anyway - I don't see the TotalContainer docs showing in the generated html? Another workaround is to manually copy the Protocol docs into the HTTP/FTP/SMTP structs inside a version(StdDdoc).
We have at least 2 bugs here. One, that templates are always public, and two, that the documentation isn't mixed in. As for TotalContainer, you appear to be right. I'm in such a habit of reading the code rather than the docs, that I missed that. It's using /* rather than /**, so it doesn't end up in the ddoc. The best would be if the functions were on their approriate types with StdDoc, but at present, that means duplicating all of that documentation, which would kind of suck. So, it's up to you. From the perspective of those reading the documentation, it would definitely be better if it were on the appropriate types though, so I would be inclined to argue that they should go on the types and that we can remove them when the mixin bug is fixed, but I don't know what others would think about that. At minimum, it needs to be more clearly explained that Protocol is a template mixin that is being mixed into Http, Ftp, and Smtp, and those have all of Protocol's functions. - Jonathan M Davis
Dec 17 2011
On Saturday, December 17, 2011 23:10:00 jdrewsen wrote:On Thursday, 15 December 2011 at 08:51:29 UTC, Jonathan M Davis wrote:Line# 235 is identical to line# 239. Shouldn't line# 235 be creating an Http object, not an Ftp object? That mistake definitely makes it look like download hasn't been properly tested.
It has been tested as you can see by the unittest just below the function. It just did not fail because libcurl requires the same setup for this download situation for both ftp and http. Will fix it of course.
Well, when there's code like that that is clearly wrong, it makes it look like it's either not being tested by the unit tests or that the unit tests haven't been run. I guess that this is just a case where the unit tests are unable to catch the problem.Is there a reason that the functions which have a template argument T which can be either a char or a ubyte can't work with immutable char? Glancing at _basicHttp, I don't see any reason why T couldn't be immutable char. Yes, it would require casting to immutable(char)[], but you're already casting to char[], and the data being returned appears to be unique such that it could be safely cast to immutable. That being the case, I'd encourage you to not only make it work with immutable char but to make immutable char the default instead of char.
Is is indeed unique and can be cast to immutable. I'll add that option. Why do you think immutable char as the default is better than char? I know that the return type in that case would be string and not char[] - but why is that better?
strings can't be sliced with impunity, because their elements are immutable. With char[] and const(char)[], you have to worry about the elements changing, so you're often forced to duplicate the string rather than simply slice it. As such, string is far preferrable to char[] or const(char)[] in the general case. It's even better when there's a choice between them, but if you have to choose, you should almost always choose string (there are exceptions though - e.g. buffers like in std.stdio.ByLine where it reuses its char[] buffer rather than allocating a new one).Please make sure that opening braces on on their own line. That's the way that the rest of Phobos does it and it's one of the few formatting rules that we've been trying to use consistently throughout Phobos. For the most part, you get it right, but not everywhere - nested functions in particular seem to have braces on the same line for some reason.
When I look through the code it seems ok with regards to braces on own line. Maybe by nested functions you mean delegates? The delegates I use do indeed have braces on same line which is ok afaik.
In general, just make sure that braces are on their own line unless you're dealing with a one line lambda or something similar. You do have nested functions or delegates which don't do that.It's not a big deal, but you should use auto more. For instance, lines #626, #638, and #640 don't use auto when they could.
I'm a bit in doubt. On one hand it is great to make everything auto in order to make it easy to change type and to remove redundancy. One the other hand it is very convenient to be able to see the type when you read the code. I know that a clever editor would be able to figure out the type for me. But I also read code in normal editors for diffs etc. or on github. Anyway - I've changed as much as possible to auto now.
Just in general, it's best practice to use auto unless you can't. There _are_ exceptions to that rule, but that's almost always the way that it should be.I'd suggest changing line# 823 to auto result = get(url, isFtpUrl(url) ? FTP() : Http());
Not possible. get() is a template function where the template parameter is connection type ie. one of FTP or HTTP. isFtpUrl(url) is evaluated at runtime and not compile time.
Yeah. I missed that. Stuff like that makes me wish that we had some kind of static ternary operator, but it would probably complicate the language too much. In some cases though, you might be able to use alias to fix the problem (static if the alias and then use the alias in the function call so that the entire function call isn't duplicated). - Jonathan M Davis
Dec 17 2011
On Saturday, December 17, 2011 23:51:08 Jakob Ovrum wrote:On Saturday, 17 December 2011 at 22:25:10 UTC, jdrewsen wrote:On Saturday, 17 December 2011 at 20:38:46 UTC, Jakob Ovrum wrote:On Saturday, 17 December 2011 at 11:56:04 UTC, jdrewsen wrote:AutoConnect sounds like a command to connection automatically which might be confusing since that is not what it does. Therefore I went with AutoConnection which I still believe is better.
How about something completely different then, like URLInfer? Also, some instances in the docs of "url" in lowercase should probably be changed to uppercase.
Will change case. Don't know if the URLInfer name is better though. /Jonas
Andrei is right, a noun is better. I'm with InferredProtocol.
You inferred protocol is better. Still annoyingly long, but it's a clearer name. - JonathnM m Davis
Dec 17 2011
On Saturday, December 17, 2011 09:21:42 Justin C Calvarese wrote:On Sat, Dec 17, 2011 at 5:56 AM, jdrewsen <jdrewsen nospam.com> wrote:On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis wrote:...
I'd argue that the acronyms should be in all caps if camelcasing wouldrequire that the first letter of the acronym be capitalized and all lower case if camelcasing would require that the first letter of the acronym be lower case. We're not completely consistent in how we using acronyms in names in Phobos, but I believe that we primarily follow that rule when putting them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.
I like Http better as personal taste. But if HTTP is preferred then I'll do that. Actually I think I will make a pull request to extend the dlang.org/dstyle.html doc with and example that we can point to when someone asks about styling. I've spend too much time restyling because there is no such style doc already and people are complaining about style in reviews anyway.
I absolutely agree that the "D Style" page should be revised and extended to reflect the current standards that new Phobos modules are expected to adhere to. I found an open pull request that indicates a past effort in this regard: https://github.com/D-Programming-Language/d-programming-language.org/pull/16 (I thought that another document was also floating around that was a proposed new "Style Guide for Phobos", but I don't recall who authored it and I don't remember where it was located.)
Yeah. I need to get back to that. Portions of it were completely agreed upon and others need further discussion, and I keep forgetting to get those discussed among the Phobos devs and get those questions settled. - Jonathan m Davis
Dec 17 2011
On Sunday, 18 December 2011 at 01:30:10 UTC, Jonathan M Davis wrote:On Saturday, December 17, 2011 09:21:42 Justin C Calvarese wrote:On Sat, Dec 17, 2011 at 5:56 AM, jdrewsen <jdrewsen nospam.com> wrote:On Thursday, 15 December 2011 at 07:46:56 UTC, Jonathan M Davis wrote:...
I'd argue that the acronyms should be in all caps if camelcasing wouldrequire that the first letter of the acronym be capitalized and all lower case if camelcasing would require that the first letter of the acronym be lower case. We're not completely consistent in how we using acronyms in names in Phobos, but I believe that we primarily follow that rule when putting them in symbol names. So, for instance, it would be HTTP, FTP, and SMTP rather than Http, Ftp, and Smtp.
I like Http better as personal taste. But if HTTP is preferred then I'll do that. Actually I think I will make a pull request to extend the dlang.org/dstyle.html doc with and example that we can point to when someone asks about styling. I've spend too much time restyling because there is no such style doc already and people are complaining about style in reviews anyway.
I absolutely agree that the "D Style" page should be revised and extended to reflect the current standards that new Phobos modules are expected to adhere to. I found an open pull request that indicates a past effort in this regard: https://github.com/D-Programming-Language/d-programming-language.org/pull/16 (I thought that another document was also floating around that was a proposed new "Style Guide for Phobos", but I don't recall who authored it and I don't remember where it was located.)
Yeah. I need to get back to that. Portions of it were completely agreed upon and others need further discussion, and I keep forgetting to get those discussed among the Phobos devs and get those questions settled. - Jonathan m Davis
Nice to see that something is in progress. Please also add a note about acronyms in names then. /Jonas
Dec 18 2011
On Sunday, 18 December 2011 at 01:27:45 UTC, Jonathan M Davis wrote:On Saturday, December 17, 2011 23:10:00 jdrewsen wrote:On Thursday, 15 December 2011 at 08:51:29 UTC, Jonathan M Davis wrote:Line# 235 is identical to line# 239. Shouldn't line# 235 be creating an Http object, not an Ftp object? That mistake definitely makes it look like download hasn't been properly tested.
It has been tested as you can see by the unittest just below the function. It just did not fail because libcurl requires the same setup for this download situation for both ftp and http. Will fix it of course.
Well, when there's code like that that is clearly wrong, it makes it look like it's either not being tested by the unit tests or that the unit tests haven't been run. I guess that this is just a case where the unit tests are unable to catch the problem.Is there a reason that the functions which have a template argument T which can be either a char or a ubyte can't work with immutable char? Glancing at _basicHttp, I don't see any reason why T couldn't be immutable char. Yes, it would require casting to immutable(char)[], but you're already casting to char[], and the data being returned appears to be unique such that it could be safely cast to immutable. That being the case, I'd encourage you to not only make it work with immutable char but to make immutable char the default instead of char.
Is is indeed unique and can be cast to immutable. I'll add that option. Why do you think immutable char as the default is better than char? I know that the return type in that case would be string and not char[] - but why is that better?
strings can't be sliced with impunity, because their elements are immutable. With char[] and const(char)[], you have to worry about the elements changing, so you're often forced to duplicate the string rather than simply slice it. As such, string is far preferrable to char[] or const(char)[] in the general case. It's even better when there's a choice between them, but if you have to choose, you should almost always choose string (there are exceptions though - e.g. buffers like in std.stdio.ByLine where it reuses its char[] buffer rather than allocating a new one).Please make sure that opening braces on on their own line. That's the way that the rest of Phobos does it and it's one of the few formatting rules that we've been trying to use consistently throughout Phobos. For the most part, you get it right, but not everywhere - nested functions in particular seem to have braces on the same line for some reason.
When I look through the code it seems ok with regards to braces on own line. Maybe by nested functions you mean delegates? The delegates I use do indeed have braces on same line which is ok afaik.
In general, just make sure that braces are on their own line unless you're dealing with a one line lambda or something similar. You do have nested functions or delegates which don't do that.It's not a big deal, but you should use auto more. For instance, lines #626, #638, and #640 don't use auto when they could.
I'm a bit in doubt. On one hand it is great to make everything auto in order to make it easy to change type and to remove redundancy. One the other hand it is very convenient to be able to see the type when you read the code. I know that a clever editor would be able to figure out the type for me. But I also read code in normal editors for diffs etc. or on github. Anyway - I've changed as much as possible to auto now.
Just in general, it's best practice to use auto unless you can't. There _are_ exceptions to that rule, but that's almost always the way that it should be.I'd suggest changing line# 823 to auto result = get(url, isFtpUrl(url) ? FTP() : Http());
Not possible. get() is a template function where the template parameter is connection type ie. one of FTP or HTTP. isFtpUrl(url) is evaluated at runtime and not compile time.
Yeah. I missed that. Stuff like that makes me wish that we had some kind of static ternary operator, but it would probably complicate the language too much. In some cases though, you might be able to use alias to fix the problem (static if the alias and then use the alias in the function call so that the entire function call isn't duplicated).
A static ternary operator wouldn't work in this case since isFtpUrl(url) cannot be evaluated at compile time. /Jonas
Dec 18 2011
On Sunday, December 18, 2011 21:49:16 jdrewsen wrote:A static ternary operator wouldn't work in this case since isFtpUrl(url) cannot be evaluated at compile time.
Ah. That would be true. Still, it's the sort of thing which begs for a ternary operator but which just can't use it due to technical limitations in the language. It may be completely impractical to try and overcome those technical limitations, but it's still frustrating sometimes. Still, at least we _have_ static if, unlike C++. - Jonathan M Davis
Dec 18 2011
Le 03/12/2011 05:26, dsimcha a écrit :I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab ) Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html For those of you on Windows, a libcurl binary built by DMC is available at http://gool.googlecode.com/files/libcurl_7.21.7.zip. Review starts now and ends on December 16, followed by one week of voting.
A bit late, but following Andrei's important remark that some operations are blocking, I also noticed that curl-multi API is not covered. http://curl.haxx.se/libcurl/c/libcurl-multi.html Is there a reason why ? Would it break the current API to add it later ?
Dec 19 2011
Le 19/12/2011 19:05, Somedude a écrit :Le 03/12/2011 05:26, dsimcha a écrit :I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab ) Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html For those of you on Windows, a libcurl binary built by DMC is available at http://gool.googlecode.com/files/libcurl_7.21.7.zip. Review starts now and ends on December 16, followed by one week of voting.
A bit late, but following Andrei's important remark that some operations are blocking, I also noticed that curl-multi API is not covered. http://curl.haxx.se/libcurl/c/libcurl-multi.html Is there a reason why ? Would it break the current API to add it later ?
It looks like curl_multi_perform's behavior is more interesting than curl_easy_perform. * From the tutorial: http://curl.haxx.se/libcurl/c/libcurl-tutorial.html "curl_multi_perform(3) is asynchronous. It will only execute as little as possible and then return back control to your program. It is designed to never block. If it returns CURLM_CALL_MULTI_PERFORM you better call it again soon, as that is a signal that it still has local data to send or remote data to receive. "
Dec 19 2011
On Monday, 19 December 2011 at 18:14:29 UTC, Somedude wrote:Le 19/12/2011 19:05, Somedude a écrit :Le 03/12/2011 05:26, dsimcha a écrit :I volunteered ages ago to manage the review for the second round of Jonas Drewsen's CURL wrapper. After the first round it was decided that, after a large number of minor issues were fixed, a second round would be necessary. Significant open issues: 1. Should libcurl be bundled with DMD on Windows? 2. etc.curl, std.curl, or std.net.curl? (We had a vote a while back but it was buried deep in a thread and a lot of people may have missed it: http://www.easypolls.net/poll.html?p=4ebd3219011eb0e4518d35ab ) Code: https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d Docs: http://freeze.steamwinter.com/D/web/phobos/etc_curl.html For those of you on Windows, a libcurl binary built by DMC is available at http://gool.googlecode.com/files/libcurl_7.21.7.zip. Review starts now and ends on December 16, followed by one week of voting.
A bit late, but following Andrei's important remark that some operations are blocking, I also noticed that curl-multi API is not covered. http://curl.haxx.se/libcurl/c/libcurl-multi.html Is there a reason why ? Would it break the current API to add it later ?
It looks like curl_multi_perform's behavior is more interesting than curl_easy_perform. * From the tutorial: http://curl.haxx.se/libcurl/c/libcurl-tutorial.html "curl_multi_perform(3) is asynchronous. It will only execute as little as possible and then return back control to your program. It is designed to never block. If it returns CURLM_CALL_MULTI_PERFORM you better call it again soon, as that is a signal that it still has local data to send or remote data to receive. "
I am aware of the multi_perform API. The multi API actually uses easy handles and therefore it would be natural to extend the wrapper with multi support later on. It will allow for read() and write() methods on the HTTP/FTP structs as well. For now I want to limit the scope and stick to the easy handles. /Jonas
Dec 30 2011
On Friday, 16 December 2011 at 09:42:50 UTC, Jonathan M Davis wrote:Please make sure that you remove trailing whitespace from the file. A lot of the lines have trailing whitespace. Also, make sure that you don't have any tabs in the file. There are a few places where you used tabs.
okLine# 916 claims that the code there won't work and that it needs to be fixed, so please fix it before it goes into Phobos (assuming that it's voted in).
okOn line# 1051, just do auto arr = new Char[](bufferSize); There's no reason to create it and then assign to its length property.
okThe braces in the block follow line# 1131 need to be fixed. The indentation is wrong on some of them, and not all of them are on their own line.
okYou should see if the cast(void[])'s can be removed from byLineAsync. They might still be necessary, but null was given a type with the last release, so it might work without the cast now. The same goes with byChunkAsync.
Still doesn't work.Same with line# 1215 as 1051. There's no point in wasting a line of code by declaring an array and the assigning to its length variable. The code is probably slightly less efficient that way too. Just allocate a new array of the correct length.
okOn line#1297, the braces need to be fixed. Same with line# 1308 and line# 1334. Just check over your braces in general to make sure that they're consistent and are on their own line unless you're dealing with something like a one line lambda.
okAnd as I mentioned before, all of these enforces which take a new CurlException, should be changed to use enforceEx after you've fixed CurlException's constructor.
okThe documentation on Protocol's isValid is wrong. It claims that isValid is true if the instance is stoppend and invalid. Wouldn't that mean that the object _wasn't_ valid?
okThere are quite a few places where you're concatenating several strings and format would make the code much cleaner (e.g. Protocol's netInterface function). Unless you're avoiding format in an attempt to allow functions to be pure (since unfortunately, format can't currently be pure), you really should be using format.
okI'd argue that it's better to use empty than check whether a string is equal to "". e.g. domain.empty instead of domain != "" on line# 1538.
okThere bodies of Http's constructor, static opCall, and dup functions are all very similar - especially the constructor and static opCall. You should look at refactoring those so that they don't have to duplicate so much code. Maybe it can't be reasonably done easily, but it would certainly be better if that code duplication could be reduced.
okChange setTimeCondition to take a SysTime. DateTime is not intended for timestamps. It's intended for calendar-based time, not the system time. Also, SysTime makes it easier to convert to time_t. In general, the fact that you're looking to deal with a time_t is a sign that you should be using SysTime. This is especially true, since DateTime is going to give the wrong time_t in general, since it has no time zone. As it stands, the timestamp is assumed to be in UTC. That's just begging for bugs. Users won't expect that. Definitely make it a SysTime. And once that's done, line# 2016 would disappear, and line# 2017 would become p.curl.set(CurlOption.timevalue, timestamp.toUnixTime());
okOn line# 2247, why do you use format with no format string? There should be no ~ or to!string in a call to format. It's just creating unnecessary memory allocations. It should be something more like format("%s%s(%s.", code, reason, majorVersion, minorVersion); Though I find that '(' to be rather odd, since it has no matching closing paren, so one may need to be added. Regardless, that line shouldn't be concatenating strings or use conv.to. format takes care of all of that and should do it with fewer memory allocations.
ok - ')' was missingPointers really should have their * next to the type not floating in space like on line #2274. As it is, the code looks like a multiplication. Unlike in C/C++, the * is clearly associated with the type. e.g. int* a, b; creates an int* and an int in C, but it creates int* and int* in D. Separating the * from the type has a tendancy to make the code harder to understand.
If that's the D style I'll do that.Ftp's consturctor and static opCall have a code duplication problem similar to that of Http.
okMake encoding take a string. Any time that you are _always_ going to idup a parameter which is a character array, make it a string, _not_ const. That way, if it's a string, no idup is necessary, and if it's char[], then it can be iduped when it's passed in, and iduping only occurs when it's actually necessary.
okIn Smtp's constructor, yo ushould probably create a variable for the toLowered url. e.g.
auto lowered = url.toLower(); That way, you avoid having to lower it twice if the else branch of the if-else is taken.
okIs p.curl.perform a property? If so, line# 2455 does nothing. If not, then it needs parens. The curl module should be able to be built with -property. I believe that Phobos as a whole is currently being built with tha flag. If not, it will be soon, and this module will need to do that if it's merged into Phobos.
okIf you can, please use a static foreach with EnumMembers!CurlOption in Curl's dup. And if you can't, because you're not clearing all of them, then put all of the ones that you _are_ clearing in an array (or TypeTuple if you want to avoid the allocation for the array) and iterate over them with a foreach. e.g. foreach(option; TypeTuple!(CurlOption.file, CurlOptions.writefunction)) copy.clear(option); You should be able to cut down on the number of lines by doing that (_especially_ if you're actually clearing all of them and can use EnumMembers!CurlOption).
Nice. I actually found out that you can do: with (CurlOption) foreach(option; TypeTuple!(file, writefunction)) copy.clear(option); Which is pretty neat when you have to list as many options as in the curl wrapper.Line#2639 is another case where format should be used. In general, if you're doing more than one ~, consider using format unless you're trying to make a function pure.
okI'd suggest renaming Message to something like CurlMessage. The odds of a name clash with Message are likely high. And while they won't be able to use Message, since it's private, it will still affect whether the full path for Message must be given (e.g. my.module.Message instead of Message). CurlMessage is far less likely to clash.
okRename DATA to Data. DATA does not follow Phobos' naming conventions. Granted, it's private, but it's completely off. Type names should be pascal cased.
okWhy isn't empty a property on #3056? Sure, it's not a range, but it would be more consistent with everything else if it were a property.
okIn general, you should use when converting between TickDuration and Duration, but in some cases, no conversion should be necessary. For instance, line# 985 shouldn't need to do any converting. Duration's opOpAssign should be able to handle a TickDuration.
okOverall, the design looks fairly solid, and the code looks pretty good, but there are definitely a number of minor issues which should be addressed before this code makes it into Phobos. The biggest involve the functions' parameters (such as using SysTime, not DateTime, and using string in some places where you're using const(char)[] in order to avoid unnecessary idups). And those definitely need to be sorted out sooner rather than later, since they affect the API and will break code if they're changed later. - Jonathan M Davis
Thanks for the comments /Jonas
Dec 30 2011
On Saturday, 17 December 2011 at 20:01:04 UTC, Jakob Ovrum wrote:The docs for onReceiveHeader, http://freeze.steamwinter.com/D/web/phobos/etc_curl.html#onReceiveHeader explicitly says that the const string parameters are not valid after the function returns. This is all well and good; but a minor improvement would be to change the signature to: property void onReceiveHeader(void delegate(in char[] key, in char[] value) callback); to add `scope` to the parameters. This documents the fact that the constant strings shouldn't be escaped as-is in both code and documentation. I suspect many other instances of `const(char)[]` could be changed to `in char[]` to better document how they use the parameters (I also personally think it looks better as it is more concise).
I agree. Actually I thought that 'in' was just an alias of 'const' but now I see that it is actually an alias of 'const scope'.
Dec 30 2011









Jonas Drewsen <jdrewsen nospam.com> 