
digitalmars.D - uploading with curl

reply "Gleb" <s4mmael gmail.com> writes:
Hello guys,

I'm trying to use the curl library for my file transfer needs
under Windows 7. I've spent the whole day on it, and most of the
functionality I have tried works like a charm. But I have
a few issues with the "upload" function.

First of all, if I try to use something like:
    auto client = FTP("192.168.110.58");
or:
    upload!FTP("file.zip", "192.168.110.58");
the curl wrapper does not recognize that we want the FTP
protocol and uses HTTP instead, returning something like:
    <?xml version="1.0" encoding="iso-8859-1"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
     <head>
      <title>400 - Bad Request</title>
     </head>
     <body>
      <h1>400 - Bad Request</h1>
     </body>
    </html>
Not a big deal, I'll use "ftp://xx.xx.xx.xx" format everywhere
below.

Here is the code I'm trying to use to upload a local file to the
FTP host with authentication:
    auto client = FTP();
    client.setAuthentication("login", "pass");
    upload!FTP("file.zip", "ftp://192.168.110.58/file.zip", client);
This passes authentication but does not upload the file.

Then I decided to take a look at the code of the
std.net.curl.upload function and use the low-level API the same
way to find a solution. Here is what I got:
    auto f = new std.stream.BufferedFile("file.zip", FileMode.In);
    scope (exit) f.close();
    auto client = FTP("ftp://192.168.110.58");
    client.verbose(true);
    client.setAuthentication("login", "pass");
    client.onSend = (void[] data)
    {
       return f.read(cast(ubyte[])data);
    };
    client.contentLength = cast(size_t)f.size;
    client.perform();
It's basically the same as the "upload" function. This
authenticates correctly, gets a directory listing, and then
nothing happens:

    > QUIT
    < 221 Goodbye.

And that looks correct to me: why should it upload any file!?

So I decided to replace the last line of the code with the
following:
    client.addCommand("epsv");
    client.addCommand("stor file.zip");
    client.perform();
Here are the results:
    > epsv
    < 229 Entering Extended Passive Mode (|||12761|)
    > stor file.zip
    * FTP response timeout

    * Timeout was reached
    > QUIT
    * server response timeout

    std.net.curl.CurlTimeoutException std\net\curl.d(3333): 6B2BD8C6 on handle 166CB60
This way the file was created on the server, but it's empty. It
looks like the client.onSend delegate is never executed.
Unfortunately, I didn't manage to find out why. It's somewhere
in the object "private RefCounted!Impl p" (curl.d:1956), but I
couldn't find where that is defined.

So, what am I doing wrong? Does std.net.curl.upload work
correctly for you? How do I upload a file to an FTP host with
authentication? What is "RefCounted!Impl p" and where do I find
its "p.curl.perform()" method?

P.S.: The firewall is not the cause: other FTP clients and the
std.net.curl.download function work fine, and rules for the
program have been created (just in case).

Thank you in advance, any advice is really appreciated.
Apr 06 2012
parent reply "Jonas Drewsen" <jdrewsen nospam.com> writes:
Answers below:

On Friday, 6 April 2012 at 13:24:03 UTC, Gleb wrote:
 Hello guys,

 I'm trying to use curl library to satisfy my file transfer needs
 under Windows 7. I've spent all the day and the most of
 functionality I have already tried works like a charm. But I 
 have
 a few issues with "upload" function.

 First of all, if I try to use something like:
    auto client = FTP("192.168.110.58");
 or:
    upload!FTP("file.zip", "192.168.110.58");
 curl wrapper does not understand we are trying to use
 ftp-protocol and uses http instead, returning something like:
    <?xml version="1.0" encoding="iso-8859-1"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 
 Transitional//EN"

 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"
 lang="en">
     <head>
      <title>400 - Bad Request</title>
     </head>
     <body>
      <h1>400 - Bad Request</h1>
     </body>
    </html>
 Not a big deal, I'll use "ftp://xx.xx.xx.xx" format everywhere
 below.
This is not how it should be. Github url with fixes below...
 Here is the code I'm trying to use to upload a local file to the
 ftp-host with an authentication:
    auto client = FTP();
    client.setAuthentication("login", "pass");
    upload!FTP("file.zip", "ftp://192.168.110.58/file.zip",
 client);
 This will pass the authentication but won't upload the file.

 Then I decided to take a look to the code of std.net.curl.upload
 function and use a low-level API the same way to find the
 solution. Here is what I got:
    auto f = new std.stream.BufferedFile("file.zip", 
 FileMode.In);
    scope (exit) f.close();
    auto client = FTP("ftp://192.168.110.58");
    client.verbose(true);
    client.setAuthentication("login", "pass");
    client.onSend = (void[] data)
    {
       return f.read(cast(ubyte[])data);
    };
    client.contentLength = cast(size_t)f.size;
    client.perform();
 It's basically the same as "upload" function. This authenticates
 correctly, gets directory listing and then nothing happens:

    > QUIT
    < 221 Goodbye.

 And it looks correct for me, why should it upload any file!?
I can reproduce this and included the fix for it.
 So I decided to replace the last line of the code with the
 following:
    client.addCommand("epsv");
    client.addCommand("stor file.zip");
    client.perform();
 Here are the results:
    > epsv
    < 229 Entering Extended Passive Mode (|||12761|)
    > stor file.zip
    * FTP response timeout

    * Timeout was reached
    > QUIT
    * server response timeout

    std.net.curl.CurlTimeoutException std\net\curl.d(3333):
 6B2BD8C6 on handle 166CB60
 This way the file was created on the server, but it's empty. It
 looks like client.onSend statements are never executed.
 Unfortunately, I didn't managed to find out why, it's somewhere
 in object "private RefCounted!Impl p" (curl.d:1956), but I 
 didn't
 find where it is.

 So, what am I doing wrong? Does std.net.curl.upload work for you
 correctly? How do I upload a file to the ftp-host with
 authentication? What is "RefCounted!Impl p" and where do I find
 it's "p.curl.perform()" method?
RefCounted!Impl p is a reference-counted instance of the struct Impl, which is defined just above that declaration in the source file. p.curl is an instance of the Curl struct, which is defined later in the same file. That struct has a perform() method.
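
To illustrate the pattern described above, here is a minimal sketch of a struct that keeps its state in a RefCounted payload. Impl and Client below are hypothetical stand-ins, not the real definitions from curl.d (the real Impl holds a Curl handle and more state); only std.typecons.RefCounted itself is taken from Phobos.

```d
import std.typecons : RefCounted;

// Hypothetical stand-in for the Impl struct in curl.d.
struct Impl
{
    int performCalls;
    void perform() { ++performCalls; }
}

// Hypothetical stand-in for a client type such as FTP.
struct Client
{
    private RefCounted!Impl p;

    // Calls are forwarded to the shared, reference-counted payload.
    void perform() { p.perform(); }
    int performCalls() { return p.performCalls; }
}

void main()
{
    auto a = Client();
    a.perform();      // first access allocates the payload
    auto b = a;       // copying shares the same payload
    b.perform();
    assert(a.performCalls == 2);
    assert(b.performCalls == 2);
}
```

Because copies share one payload, a client can be passed by value (for example to upload) while every copy still refers to the same underlying state.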
 P.S.: Firewall is not the case, other ftp clients and
 std.net.curl.download function work fine, rules for the program
 are created (just in case).

 Thank you in advance, any advice is really appreciated.
You have identified a couple of bugs. A corrected version of curl.d is located at https://github.com/jcd/phobos/blob/curlfixes/std/net/curl.d

I've created a pull request to get it upstream.

/Jonas
Apr 07 2012
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/7/12 3:14 PM, Jonas Drewsen wrote:
 I've created a pull request to get it upstream.
Merged: https://github.com/D-Programming-Language/phobos/pull/528

Andrei
Apr 07 2012
prev sibling parent reply "Gleb" <s4mmael gmail.com> writes:
Works perfectly! Thanks a lot!
Apr 09 2012
parent reply "Gleb" <s4mmael gmail.com> writes:
Gentlemen,

While working with the curl library I ran into a few more issues
that I would like to discuss, if you don't mind.

Here is the code sample I use:

     auto f = new std.stream.BufferedFile("file.zip", FileMode.In);
     scope (exit) f.close();
     auto client = FTP();
     client.verbose(true);
     client.url = "ftp://192.168.110.58/file.zip";
     client.setAuthentication("user", "pass");
     client.contentLength = cast(size_t)f.size;
     client.handle.set(CurlOption.upload, 1L);
     client.onSend = (void[] data)
     {
         return f.read(cast(ubyte[])data);
     };
     client.perform();

1. TIMEOUTS

The code sample will work great if file.zip is small.
If the file is big enough, we will get the following result:

    * Connecting to 192.168.110.58 (192.168.110.58) port 1468
    > TYPE I
    < 200 Type set to I
    > STOR file.zip
    < 150 Opening BINARY mode data connection for file.zip
    * Operation timed out after 120105 milliseconds with 687751168 bytes received

    * Timeout was reached
    std.net.curl.CurlTimeoutException std\net\curl.d(3348): Timeout was reached on handle 16BCB60

So we were disconnected from the server because of the timeout.
To avoid this behavior I use the following:

    client.dataTimeout(dur!"weeks"(10));

I'm not sure we really need this kind of timeout, so I use a
really big value. Anyway, I believe the default value of 120
seconds is not enough for practical use of a file transfer
protocol. Maybe it's suitable for HTTP only?

By the way, for some reason I can't use a value of more than 10
weeks. In that case we can't even connect to the server:


    *   Trying 192.168.110.58...
    * connected

    * server response timeout

    * Timeout was reached
    std.net.curl.CurlTimeoutException std\net\curl.d(3348): Timeout was reached on handle 164CB60

Of course, an even bigger value is not necessary, but I'm not
sure this behavior is correct.


2. PROGRESSBAR

I've added the following to the above example:
     client.onProgress = delegate int(size_t dlTotal, size_t dlNow,
                                      size_t ulTotal, size_t ulNow)
     {
         return 0;
     };

When compiled and run, it prints the following:

     > STOR file.zip
     < 150 Opening BINARY mode data connection for file.zip
     * We are completely uploaded and fine
     * Remembering we are in dir ""
     < 226 Transfer complete

     > QUIT
     std.net.curl.CurlException std\net\curl.d(3365): Progress callback called on cleaned up Curl instance

It looks like onProgress was called after the file was uploaded
and the data connection was closed. What am I doing wrong? Should
I return some other value when ulNow == ulTotal? In the code
examples in the documentation onProgress does not return any
value, but it must.

Maybe it would be better to show a simple progress bar by
default when curl.verbose is true? At least something like this:

     static const bStr = replicate("\b", 20);
     static float perc;
     client.onProgress = delegate int(size_t dlTotal, size_t dlNow,
                                      size_t ulTotal, size_t ulNow)
     {
         perc = ulTotal ? cast(real)ulNow / cast(real)ulTotal * 100 : 0;
         writef("%s* %.2f%c uploaded", bStr, perc, '%');
         return 0;
     };


Thank you in advance for your opinion on these issues.
Apr 17 2012
parent reply "Jonas Drewsen" <jdrewsen nospam.com> writes:
On Tuesday, 17 April 2012 at 07:07:06 UTC, Gleb wrote:
 Gentlemen,

 While working with curl library, I would like to discuss a few 
 more issues I faced with, if you don't mind.

 Here is the code sample I use:

     auto f = new std.stream.BufferedFile("file.zip", 
 FileMode.In);
     scope (exit) f.close();
     auto client = FTP();
     client.verbose(true);
     client.url = "ftp://192.168.110.58/file.zip";
     client.setAuthentication("user", "pass");
     client.contentLength = cast(size_t)f.size;
     client.handle.set(CurlOption.upload, 1L);
     client.onSend = (void[] data)
     {
         return f.read(cast(ubyte[])data);
     };
     client.perform();

 1. TIMEOUTS

 The code sample will work great if file.zip is small.
 If the file is big enough, we will get the following result:

    * Connecting to 192.168.110.58 (192.168.110.58) port 1468
    > TYPE I
    < 200 Type set to I
    > STOR file.zip
    < 150 Opening BINARY mode data connection for file.zip
    * Operation timed out after 120105 milliseconds with 
 687751168 bytes received

    * Timeout was reached 
 std.net.curl.CurlTimeoutException std\net\curl.d(3348): Timeout 
 was reached on handle 16BCB60

 So we were disconnected from server because of the timeout. To 
 avoid this behavior I use the following:

    client.dataTimeout(dur!"weeks"(10));

 I'm not sure if we really need this kind of timeout so I use 
 the really big value. Anyway, I believe the default value of 
 120 seconds is not enough for practical usage of file transfer 
 protocol. Maybe it's suitable for HTTP only?

 By the way, for some reason I can't use the value more then 10 
 weeks. In this case we can't even connect to server:


    *   Trying 192.168.110.58...
    * connected

    * server response timeout

    * Timeout was reached
    std.net.curl.CurlTimeoutException std\net\curl.d(3348): 
 Timeout was reached on handle 164CB60

 Of course, even bigger value is not necessary, but I'm not sure 
 this behavior is correct.
This is one of the many reasons I believe we should do our own network library. Curl only supports timeouts for connecting and for the entire transfer. What you really want is better control with sane defaults: DNS lookup timeout, connect timeout, read/write activity timeout and timeout for the entire operation.

Default "entire operation" timeout should be infinity and the rest just some qualified guesses.

Anyway... I think the current default timeouts are how they should be.

Regarding the 10 week limit you're mentioning, please see the docs for the Duration type: http://dlang.org/phobos/core_time.html#Duration
 2. PROGRESSBAR

 I've added the following to the above example:
     client.onProgress = delegate int(size_t dlTotal, size_t 
 dlNow, size_t ulTotal, size_t ulNow)
     {
         return 0;
     };

 When compiled and run we will see the following:

     > STOR file.zip
     < 150 Opening BINARY mode data connection for file.zip
     * We are completely uploaded and fine
     * Remembering we are in dir ""
     < 226 Transfer complete

     > QUIT
     std.net.curl.CurlException std\net\curl.d(3365): Progress 
 callback called on cleaned up Curl instance

 It looks like onProgress was called when the file was uploaded 
 and the data connection was closed. What am I doing wrong? 
 Should I return any other value when ulNow == ulTotal? In code 
 examples in the documentation onProgress does not return any 
 value, but it must.

 Maybe it would be better to use some simple progress bar by 
 default when curl.verbose was true? At least something like 
 this:

     static const bStr = replicate("\b", 20);
     static float perc;
     client.onProgress = delegate int(size_t dlTotal, size_t 
 dlNow, size_t ulTotal, size_t ulNow)
     {
         perc = ulTotal ? cast(real)ulNow/cast(real)ulTotal*100 
 : 0;
         writef("%s* %.2f%c uploaded", bStr, perc, '%');
         return 0;
     };


 Thank you in advance for your opinion on this issues.
You should always return 0 if you do not want to abort the job.

What you're describing sounds like a bug and I'll have a look at it.

I think that including the progress in the verbose mode by default will generate too much noise.

/Jonas
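
As a rough sketch of the progress callback discussed above, the percentage calculation can be kept in a small helper so the callback itself only prints and returns 0. The names uploadPercent and reportProgress are made up for this example; the parameter list mirrors the onProgress delegate used earlier in the thread, and main only simulates callbacks, so no curl library or network is needed to try it.

```d
import std.stdio : stdout, writef, writeln;

/// Percentage of the upload completed; 0 while the total is unknown.
double uploadPercent(size_t ulNow, size_t ulTotal)
{
    return ulTotal ? 100.0 * ulNow / ulTotal : 0.0;
}

/// Same parameter list as the onProgress delegate used in this thread.
int reportProgress(size_t dlTotal, size_t dlNow, size_t ulTotal, size_t ulNow)
{
    // "\r" returns to the start of the line so the value is overwritten.
    writef("\r* %.2f%% uploaded", uploadPercent(ulNow, ulTotal));
    stdout.flush();
    return 0; // 0 = keep going; a non-zero value asks for an abort
}

void main()
{
    // Simulated callbacks for a 1024-byte upload.
    size_t[] sent = [0, 256, 512, 1024];
    foreach (n; sent)
        reportProgress(0, 0, 1024, n);
    writeln();
}
```

With a real client, one way to attach it would be client.onProgress = toDelegate(&reportProgress); using toDelegate from std.functional.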
Apr 17 2012
parent reply "Gleb" <s4mmael gmail.com> writes:
Jonas, thanks for your answer.

On Tuesday, 17 April 2012 at 20:03:57 UTC, Jonas Drewsen wrote:
 This is one of the many reasons I believe we should do our own
 network library. Curl only support timeouts for connecting and
 for the entire transfer. What you really want is better control
 with sane defaults: DNS lookup timeout, connect timeout,
 read/write activity timeout and timout for the entire operation.

 Default "entire operation" timeout should be infinity and the
 rest just some qualified guesses.

 Anyway... I think the current default timeouts are how they
 should be.
std.net.curl already has a DNS timeout and a connect timeout. An infinite timeout for the entire operation means there is no such timeout, and this is what the curl library offers. There is also some kind of read/write activity "timeout" in the library, because the connection is dropped if the FTP server suddenly shuts down during an upload. That's why I don't see why we need a data connection timeout that prevents big files from being downloaded or uploaded.
 Regarding the 10 week limit you're mentioning please see the 
 docs
 for the Duration type:
 http://dlang.org/phobos/core_time.html#Duration
Thanks for the link! I've read it twice, but unfortunately the reason for the 10 week limit is still not clear to me. Moreover, I can't understand why dataTimeout would influence connectTimeout in such a way when the value I use is more than 10 weeks.
 What you're describing sounds like a bug and I'll have a look at
 it.
Thank you very much! Your contribution to D's curl library support is outstanding.
Apr 19 2012
parent reply "Jonas Drewsen" <jdrewsen nospam.com> writes:
On Thursday, 19 April 2012 at 07:26:44 UTC, Gleb wrote:
 Jonas, thanks for your answer.

 On Tuesday, 17 April 2012 at 20:03:57 UTC, Jonas Drewsen wrote:
 This is one of the many reasons I believe we should do our own
 network library. Curl only support timeouts for connecting and
 for the entire transfer. What you really want is better control
 with sane defaults: DNS lookup timeout, connect timeout,
 read/write activity timeout and timout for the entire 
 operation.

 Default "entire operation" timeout should be infinity and the
 rest just some qualified guesses.

 Anyway... I think the current default timeouts are how they
 should be.
 std.net.curl already has a DNS timeout and a connect timeout. An infinite timeout for the entire operation means there is no such timeout, and this is what the curl library offers. There is also some kind of read/write activity "timeout" in the library, because the connection is dropped if the FTP server suddenly shuts down during an upload. That's why I don't see why we need a data connection timeout that prevents big files from being downloaded or uploaded.
Sorry for the delayed answer. If the FTP server shuts down, the TCP connection is broken and you will get notified immediately. This has nothing to do with timeouts (except maybe TCP timeouts in some cases, which are normally handled by the OS). Most other libraries have a default timeout afaik, and that makes sense to me. As stated in my last reply, the real solution would be to have an activity timeout, which would make it reasonable to set an infinite timeout for the entire transfer.
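
To make the idea of an activity timeout concrete, here is a minimal sketch of a stall detector that could be driven from a progress callback: it remembers when the byte count last changed and asks for an abort once no progress has been seen for longer than a limit. StallDetector is a hypothetical helper, not part of std.net.curl, and the approach assumes the progress callback is invoked regularly and that a non-zero return value aborts the transfer.

```d
import core.time : Duration, MonoTime, seconds;

/// Hypothetical helper, not part of std.net.curl.
struct StallDetector
{
    Duration limit;       // longest tolerated period without progress
    size_t lastBytes;     // byte count seen at the last change
    MonoTime lastChange;  // when the byte count last changed
    bool started;

    /// Returns 0 to continue, 1 to request an abort.
    int check(size_t bytesNow, MonoTime now)
    {
        if (!started || bytesNow != lastBytes)
        {
            started = true;
            lastBytes = bytesNow;
            lastChange = now;
            return 0;
        }
        return (now - lastChange > limit) ? 1 : 0;
    }
}

void main()
{
    auto t0 = MonoTime.currTime;
    auto d = StallDetector(30.seconds);
    assert(d.check(0, t0) == 0);                  // first call starts the clock
    assert(d.check(4096, t0 + 10.seconds) == 0);  // bytes moved, clock resets
    assert(d.check(4096, t0 + 35.seconds) == 0);  // idle for 25 s, within limit
    assert(d.check(4096, t0 + 45.seconds) == 1);  // idle for 35 s, abort
}
```

Inside a progress delegate this would be called as detector.check(ulNow, MonoTime.currTime), with the overall data timeout set generously so that only stalls end the transfer.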
 Regarding the 10 week limit you're mentioning please see the 
 docs
 for the Duration type:
 http://dlang.org/phobos/core_time.html#Duration
 Thanks for the link! I've read it twice, but unfortunately the reason for the 10 week limit is still not clear to me. Moreover, I can't understand why dataTimeout would influence connectTimeout in such a way when the value I use is more than 10 weeks.
I've not read the code for Duration, but my guess is that it wraps around and becomes a negative duration if you exceed the limit. This would of course make it time out immediately.
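
The wrap-around guess can be explored with a bit of arithmetic. The sketch below assumes, purely as a hypothesis, that the millisecond count is narrowed to a signed 32-bit integer somewhere before it reaches curl; the thread does not confirm where, or whether, such a narrowing happens. The function truncatedMsecs is made up for this illustration.

```d
import core.time : dur;
import std.stdio : writefln;

/// Millisecond count after an assumed narrowing to a signed 32-bit integer.
int truncatedMsecs(long weeks)
{
    long ms = dur!"weeks"(weeks).total!"msecs";
    return cast(int) ms; // keeps only the low 32 bits
}

void main()
{
    foreach (w; [1L, 10L, 11L])
        writefln("%2d weeks = %d ms, %d after truncation",
                 w, dur!"weeks"(w).total!"msecs", truncatedMsecs(w));

    assert(truncatedMsecs(10) == 1_753_032_704);  // wraps, but stays positive
    assert(truncatedMsecs(11) == -1_937_134_592); // wraps to a negative value
}
```

Under this assumption 10 weeks would wrap to a positive value of roughly 20 days while 11 weeks would become negative, which is consistent with the reported behavior. It would also predict failures for some smaller values (4 to 7 weeks), which nobody in the thread tested, so treat it as a lead to check rather than an explanation.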
 What you're describing sounds like a bug and I'll have a look 
 at
 it.
Thank you very much! Your contribution to D's curl library support is outstanding.
Just trying to make D a better place to be, as so many others do :)

/Jonas
Apr 30 2012
parent "Gleb" <s4mmael gmail.com> writes:
Jonas, thank you for your answer.
May 03 2012