
digitalmars.D.learn - Multithreaded HTTP Download

reply Mike McKee <volomike gmail.com> writes:
Looking at this example, which is an all-Bash technique: someone 
figured out how, on OSX (and Unix, Linux, FreeBSD, etc.), to use 
/dev/tcp to download a zip file in a multithreaded way (they use 
only two threads, but you get the point):

http://www.linuxquestions.org/questions/programming-9/bash-and-netcat-stripping-http-header-758911-print/

(scroll down to Gnashley's post)

The technique is basically to establish two or more TCP socket 
handles to the remote web server. Then, using an HTTP/1.1 GET on 
each, pull from both simultaneously, first discarding the HTTP 
header preamble bytes. Then, as you loop and collect bytes, keep 
whichever copy of each byte arrives first and discard the 
duplicates from the other handles.
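The "disregard the preamble bytes" step is just finding the blank line that terminates the HTTP headers. A minimal sketch of that step (in Python rather than D for brevity; the same logic ports directly to a raw-socket read in D's std.socket):

```python
def strip_http_header(raw: bytes) -> bytes:
    """Return the body of a raw HTTP response: everything after the
    CRLF CRLF blank line that terminates the header preamble."""
    sep = raw.find(b"\r\n\r\n")
    if sep == -1:
        raise ValueError("no end of HTTP headers found")
    return raw[sep + 4:]
```

For example, `strip_http_header(b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello")` yields `b"hello"`.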

How could I achieve something like that in D? (Note, I'm using 
OSX.)

In general I'm trying to see if I can make a command-line zip 
file downloader that downloads faster than curl, for my Qt/C++ 
application installer script on OSX.
Nov 27 2015
next sibling parent reply Mike McKee <volomike gmail.com> writes:
Hey guys, as it turns out, someone on stackoverflow.com pointed 
out in a Perl version of this question that the Bash example that 
was given is really buggy and doesn't make sense. They say that 
trying to download a single file using two socket handles will 
not speed up the download. So, this may or may not be possible. 
Your thoughts?
Nov 27 2015
parent reply tcak <1ltkrs+3wyh1ow7kzn1k sharklasers.com> writes:
On Saturday, 28 November 2015 at 07:05:55 UTC, Mike McKee wrote:
 Hey guys, as it turns out, someone on stackoverflow.com pointed 
 out in a Perl version of this question that the Bash example 
 that was given is really buggy and doesn't make sense. They say 
 that trying to download a single file using two socket handles 
 will not speed up the download. So, this may or may not be 
 possible. Your thoughts?
Say I open one TCP socket to the server, and it starts sending me data. My internet connection speed is, say, 10 Gb/s max. I calculate the download speed, and it is already at that max. Opening another TCP socket to the server wouldn't make any difference.

The only case where it would make sense is if the server limits the upload speed of each TCP socket. Unless you are in that position, I do not expect to see any difference from opening multiple sockets and requesting different parts of the same file.
Nov 28 2015
parent Sebastiaan Koppe <mail skoppe.eu> writes:
On Saturday, 28 November 2015 at 10:46:11 UTC, tcak wrote:
 The only case that would make sense is if the server limits the 
 upload speed of each TCP socket. Unless you are in this 
 position, I do not expect to see any difference by opening 
 multiple sockets and requesting different parts of same file.
I used to live in China for some time, and access to international servers was dog-slow. It was also very unpredictable. Sometimes a connection would be OK (80 kB/s), then suddenly drop below 3 kB/s. Other times it wouldn't start at all.

Having seen that behavior, I built a multi-part downloader that opened around 40 connections and restarted any connection whose throughput fell below 1 kB per 10 seconds. It worked very well: with it I got up to 1 MB/s, which is pretty fast considering I used to stare at 9 kB/s download dialogs.

I really hated all those programs that needed to download something and expected the connection to be flawless and the throughput to be infinite.
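The restart rule described above (kill a connection below 1 kB per 10 seconds) boils down to a simple throughput check per connection. A sketch of just that decision, with hypothetical names and thresholds taken from the post (Python for illustration, not the original vibe.d code):

```python
STALL_BYTES = 1024    # threshold numerator: 1 kB ...
STALL_WINDOW = 10.0   # ... per 10-second window

def should_restart(bytes_in_window: int, window_seconds: float) -> bool:
    """Decide whether a connection is stalled: true when its average
    throughput over a full measurement window drops below 1 kB / 10 s."""
    if window_seconds < STALL_WINDOW:
        return False  # not enough elapsed time to judge yet
    return bytes_in_window / window_seconds < STALL_BYTES / STALL_WINDOW
```

A supervising loop would call this per connection each tick and re-issue the remaining byte range on a fresh connection when it returns true.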
Nov 28 2015
prev sibling parent reply Sebastiaan Koppe <mail skoppe.eu> writes:
On Saturday, 28 November 2015 at 06:40:49 UTC, Mike McKee wrote:
 How could I achieve something like that in D? (Note, I'm using 
 OSX.)
I did it with vibe.d and HTTP byte ranges.
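The byte-range approach splits the file into contiguous slices and fetches each with its own `Range: bytes=start-end` request. A minimal sketch of the idea (Python standard library for illustration, not the vibe.d code; in practice you would get `size` from a HEAD request's Content-Length, and the server must honor Range requests):

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def split_ranges(size: int, parts: int):
    """Split [0, size) into `parts` contiguous (start, end) byte ranges,
    inclusive on both ends, as the HTTP Range header expects."""
    step = size // parts
    ranges = []
    for i in range(parts):
        start = i * step
        end = size - 1 if i == parts - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges

def fetch_range(url: str, start: int, end: int) -> bytes:
    """Fetch one slice of the file with an HTTP/1.1 Range request."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def download(url: str, size: int, parts: int = 4) -> bytes:
    """Fetch `size` bytes of `url` in `parts` parallel range requests
    and reassemble the slices in order."""
    with ThreadPoolExecutor(max_workers=parts) as pool:
        chunks = pool.map(lambda r: fetch_range(url, *r), split_ranges(size, parts))
    return b"".join(chunks)
```

Because `split_ranges` keeps the slices contiguous and ordered, concatenating the chunks reproduces the original file byte-for-byte.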
 In general I'm trying to see if I can make a command line zip 
 file downloader that downloads faster than Curl for my Qt/C++ 
 application installer script for OSX.
Unless you are on a poor connection, what makes you think you can beat curl? You could use a CDN though.
Nov 28 2015
parent Mike McKee <volomike gmail.com> writes:
After weighing options, I'll use a CDN to get the faster 
download, and stick with curl rather than recoding it in D.
Nov 28 2015