digitalmars.D - Network I/O and streaming in D2

Justin Johansson <no spam.com> writes:
As a parallel thread to the current one on std.xml, I've started
this thread to seek a similar discussion on requirements etc.

Recapping what Andrei said over there :-

"If you want to work on the top most important item, probably
networking would come ahead. We badly need http and ftp
streaming libraries. I'm thinking libcurl would be a good choice
as a backend (not interface).  For D integration, it would be great
to  integrate networking with std.stdio.File - e.g. creating
File("http://xyz.org") would just connect to the thing and
allow streaming, ranges, everything. Adam Ruppe has a
lower-level networking protocol that also hooks into
std.stdio.File, which would be very important to have too."

Yes, I agree that libcurl might be a choice as a backend.  Is
its license okay?

 From http://curl.haxx.se/docs/copyright.html

"Curl and libcurl are true Open Source/Free Software and meet all 
definitions as such. It means that you are free to modify and 
redistribute all contents of the curl distributed archives. You may also 
freely use curl and libcurl in your commercial projects.

Curl and libcurl are licensed under a MIT/X derivate license, see below"

<curl-license>
COPYRIGHT AND PERMISSION NOTICE

Copyright (c) 1996 - 2010, Daniel Stenberg, <daniel haxx.se>.

All rights reserved.

Permission to use, copy, modify, and distribute this software for any purpose
with or without fee is hereby granted, provided that the above copyright
notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN
NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
OR OTHER DEALINGS IN THE SOFTWARE.

Except as contained in this notice, the name of a copyright holder shall not
be used in advertising or otherwise to promote the sale, use or other dealings
in this Software without prior written authorization of the copyright holder.
</curl-license>

Cheers
Justin Johansson
Jun 29 2010
Walter Bright <newshound2 digitalmars.com> writes:
Justin Johansson wrote:
> Yes, I agree that libcurl might be a choice as a backend.  Is
> its license okay?
>
> From http://curl.haxx.se/docs/copyright.html

Looks ok to me.
Jun 29 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Tue, 29 Jun 2010 18:26:51 -0400, Walter Bright
<newshound2 digitalmars.com> wrote:

> Justin Johansson wrote:
>> Yes, I agree that libcurl might be a choice as a backend.  Is
>> its license okay?
>> From http://curl.haxx.se/docs/copyright.html
>
> Looks ok to me.

It doesn't look okay for Phobos to me. The MIT/new BSD license is not BOOST compatible. In particular: "this permission notice appear in all copies" which includes binary copies.
Jun 29 2010
Walter Bright <newshound2 digitalmars.com> writes:
Robert Jacques wrote:
> On Tue, 29 Jun 2010 18:26:51 -0400, Walter Bright
> <newshound2 digitalmars.com> wrote:
>
>> Justin Johansson wrote:
>>> Yes, I agree that libcurl might be a choice as a backend.  Is
>>> its license okay?
>>> From http://curl.haxx.se/docs/copyright.html
>>
>> Looks ok to me.
>
> It doesn't look okay for Phobos to me. The MIT/new BSD license is not
> BOOST compatible. In particular: "this permission notice appear in all
> copies" which includes binary copies.

I emailed Daniel Stenberg, the author, about that. His reply is:

=================================================================
On Mon, 10 May 2010, Walter Bright wrote:

> Hello, I'm Walter Bright, the lead developer on the D programming
> language. libcurl is the best networking library available, and we'd
> like to base the D runtime networking support on it. The only issue,
> though, is the clause in the license "provided that the above copyright
> notice and this permission notice appear in all copies". Does this
> include binaries? Or does it just apply to the source code?

Thanks for your interest in libcurl and your question!

The copyright notice is only for the source code, and possibly in the
documentation. It is NOT for the binaries.

I hope you'll find libcurl to do what you need, and I hope you'll
discover that the curl-library is a fine list for help and assistance
if or when you're in need!

--
 / daniel.haxx.se
======================================================================

That's good enough for me.
Jun 29 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
> Robert Jacques wrote:
>> On Tue, 29 Jun 2010 18:26:51 -0400, Walter Bright
>> <newshound2 digitalmars.com> wrote:
>>
>>> Justin Johansson wrote:
>>>> Yes, I agree that libcurl might be a choice as a backend.  Is
>>>> its license okay?
>>>> From http://curl.haxx.se/docs/copyright.html
>>>
>>> Looks ok to me.
>>
>> It doesn't look okay for Phobos to me. The MIT/new BSD license is not
>> BOOST compatible. In particular: "this permission notice appear in all
>> copies" which includes binary copies.
>
> I emailed Daniel Stenberg, the author, about that. His reply is:
>
> =================================================================
> On Mon, 10 May 2010, Walter Bright wrote:
>
>> Hello, I'm Walter Bright, the lead developer on the D programming
>> language. libcurl is the best networking library available, and we'd
>> like to base the D runtime networking support on it. The only issue,
>> though, is the clause in the license "provided that the above copyright
>> notice and this permission notice appear in all copies". Does this
>> include binaries? Or does it just apply to the source code?
>
> Thanks for your interest in libcurl and your question!
>
> The copyright notice is only for the source code, and possibly in the
> documentation. It is NOT for the binaries.
>
> I hope you'll find libcurl to do what you need, and I hope you'll
> discover that the curl-library is a fine list for help and assistance
> if or when you're in need!

Awesome squared!

Andrei
Jun 29 2010
Justin Johansson <no spam.com> writes:
Walter Bright wrote:
> Robert Jacques wrote:
>> On Tue, 29 Jun 2010 18:26:51 -0400, Walter Bright
>> <newshound2 digitalmars.com> wrote:
>>
>>> Justin Johansson wrote:
>>>> Yes, I agree that libcurl might be a choice as a backend.  Is
>>>> its license okay?
>>>> From http://curl.haxx.se/docs/copyright.html
>>>
>>> Looks ok to me.
>>
>> It doesn't look okay for Phobos to me. The MIT/new BSD license is not
>> BOOST compatible. In particular: "this permission notice appear in all
>> copies" which includes binary copies.
>
> I emailed Daniel Stenberg, the author, about that. His reply is:
>
> =================================================================
> On Mon, 10 May 2010, Walter Bright wrote:
>
>> Hello, I'm Walter Bright, the lead developer on the D programming
>> language. libcurl is the best networking library available, and we'd
>> like to base the D runtime networking support on it. The only issue,
>> though, is the clause in the license "provided that the above copyright
>> notice and this permission notice appear in all copies". Does this
>> include binaries? Or does it just apply to the source code?
>
> Thanks for your interest in libcurl and your question!
>
> The copyright notice is only for the source code, and possibly in the
> documentation. It is NOT for the binaries.
>
> I hope you'll find libcurl to do what you need, and I hope you'll
> discover that the curl-library is a fine list for help and assistance
> if or when you're in need!

You ripper Walter!
Jun 30 2010
Jacob Carlborg <doob me.com> writes:
On 2010-06-30 04.39, Walter Bright wrote:
> Robert Jacques wrote:
>> On Tue, 29 Jun 2010 18:26:51 -0400, Walter Bright
>> <newshound2 digitalmars.com> wrote:
>>
>>> Justin Johansson wrote:
>>>> Yes, I agree that libcurl might be a choice as a backend. Is
>>>> its license okay?
>>>> From http://curl.haxx.se/docs/copyright.html
>>>
>>> Looks ok to me.
>>
>> It doesn't look okay for Phobos to me. The MIT/new BSD license is not
>> BOOST compatible. In particular: "this permission notice appear in all
>> copies" which includes binary copies.
>
> I emailed Daniel Stenberg, the author, about that. His reply is:
>
> =================================================================
> On Mon, 10 May 2010, Walter Bright wrote:
>
>> Hello, I'm Walter Bright, the lead developer on the D programming
>> language. libcurl is the best networking library available, and we'd
>> like to base the D runtime networking support on it. The only issue,
>> though, is the clause in the license "provided that the above copyright
>> notice and this permission notice appear in all copies". Does this
>> include binaries? Or does it just apply to the source code?
>
> Thanks for your interest in libcurl and your question!
>
> The copyright notice is only for the source code, and possibly in the
> documentation. It is NOT for the binaries.
>
> I hope you'll find libcurl to do what you need, and I hope you'll
> discover that the curl-library is a fine list for help and assistance
> if or when you're in need!

Then it seems that he wants the Boost license and not the MIT license.

--
Jacob Carlborg
Jun 30 2010
Walter Bright <newshound2 digitalmars.com> writes:
Robert Jacques wrote:
> Great, but if binaries are excluded from the libcurl license, then
> binaries don't have any license and are unusable.

That is not my understanding. The binary is not excluded from the license; it is just excluded from the attribution requirement of the license.

> Besides, legally I
> don't think he can change the interpretation of his license without
> changing its text. I'd recommend kindly asking him to grant you/D the
> right to use libcurl under the BOOST license, for legal reasons.

Legally, he has every right to change and clarify his license, and since the email is after his license, it supersedes it. I believe his email to me would suffice for any dispute. IANAL. But I agree that changing the license to Boost would be better.
Jun 30 2010
Walter Bright <newshound2 digitalmars.com> writes:
Adam Ruppe wrote:
> My network thing is very simple: it opens a socket, then wraps it up
> in a File struct, via FILE*. Then, you can treat it the same way.

That's the traditional way to do it, but I'm not so sure it's the right way for the future. Wouldn't it be better to have an interface to it that is a range, rather than pretend it's a FILE* ?
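
As an illustration of what "an interface that is a range" could look like, here is a minimal sketch. It is not anything proposed in this thread: the Chunks name and the chunk size are hypothetical, and Phobos's own File.byChunk already covers similar ground. The only point is that a data source needs just empty, front and popFront to be usable with foreach and the range algorithms.

```d
import std.range.primitives : isInputRange;
import std.stdio : File, writeln;

/// Hypothetical input range that yields fixed-size chunks from an open File.
struct Chunks
{
    private File file;        // the handle; opening and closing stay its job
    private ubyte[] buffer;   // reused for every chunk
    private ubyte[] current;  // slice of buffer holding the latest chunk

    this(File file, size_t chunkSize)
    {
        assert(chunkSize > 0, "chunkSize must be positive");
        this.file = file;
        this.buffer = new ubyte[chunkSize];
        popFront(); // read the first chunk so that front is valid
    }

    @property bool empty() const { return current.length == 0; }
    @property ubyte[] front() { return current; }
    void popFront() { current = file.rawRead(buffer); }
}

static assert(isInputRange!Chunks);

void main()
{
    // Prepare a small file so the example is self-contained.
    auto w = File("chunks_demo.bin", "wb");
    w.rawWrite(cast(ubyte[]) [1, 2, 3, 4, 5]);
    w.close();

    foreach (chunk; Chunks(File("chunks_demo.bin", "rb"), 2))
        writeln(chunk);
}
```

Each front is a slice of the same reused buffer, so a caller that wants to keep a chunk has to copy it.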
Jun 29 2010
Sean Kelly <sean invisibleduck.org> writes:
Walter Bright Wrote:

> Adam Ruppe wrote:
>> My network thing is very simple: it opens a socket, then wraps it up
>> in a File struct, via FILE*. Then, you can treat it the same way.
>
> That's the traditional way to do it, but I'm not so sure it's the right
> way for the future. Wouldn't it be better to have an interface to it
> that is a range, rather than pretend it's a FILE* ?

And in either case, if the buffer is empty a get() operation will block, correct?
Jun 30 2010
Walter Bright <newshound2 digitalmars.com> writes:
Sean Kelly wrote:
> Walter Bright Wrote:
>
>> Adam Ruppe wrote:
>>> My network thing is very simple: it opens a socket, then wraps it up
>>> in a File struct, via FILE*. Then, you can treat it the same way.
>>
>> That's the traditional way to do it, but I'm not so sure it's the right
>> way for the future. Wouldn't it be better to have an interface to it
>> that is a range, rather than pretend it's a FILE* ?
>
> And in either case, if the buffer is empty a get() operation will block,
> correct?

I'm no expert in network programming, but yes. I suspect there should be a settable timeout which should throw, too.
Jun 30 2010
Sean Kelly <sean invisibleduck.org> writes:
Walter Bright Wrote:

> Sean Kelly wrote:
>> Walter Bright Wrote:
>>
>>> Adam Ruppe wrote:
>>>> My network thing is very simple: it opens a socket, then wraps it up
>>>> in a File struct, via FILE*. Then, you can treat it the same way.
>>>
>>> That's the traditional way to do it, but I'm not so sure it's the right
>>> way for the future. Wouldn't it be better to have an interface to it
>>> that is a range, rather than pretend it's a FILE* ?
>>
>> And in either case, if the buffer is empty a get() operation will block,
>> correct?
>
> I'm no expert in network programming, but yes. I suspect there should be
> a settable timeout which should throw, too.

I've been thinking about this and I think this is probably what the majority of apps want anyway. It isn't scalable, but server apps are a whole 'nother ball of wax.
Jun 30 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sean Kelly wrote:
> Walter Bright Wrote:
>
>> Sean Kelly wrote:
>>> Walter Bright Wrote:
>>>
>>>> Adam Ruppe wrote:
>>>>> My network thing is very simple: it opens a socket, then wraps it up
>>>>> in a File struct, via FILE*. Then, you can treat it the same way.
>>>>
>>>> That's the traditional way to do it, but I'm not so sure it's the right
>>>> way for the future. Wouldn't it be better to have an interface to it
>>>> that is a range, rather than pretend it's a FILE* ?
>>>
>>> And in either case, if the buffer is empty a get() operation will block,
>>> correct?
>>
>> I'm no expert in network programming, but yes. I suspect there should be
>> a settable timeout which should throw, too.
>
> I've been thinking about this and I think this is probably what the
> majority of apps want anyway. It isn't scalable, but server apps are a
> whole 'nother ball of wax.

To obtain asynchronous operation, an app can spawn a secondary thread using blocking I/O and passing stuff as messages. Indeed defining many secondary threads does become a scalability issue.

Andrei
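
The arrangement described here can be sketched with std.concurrency. This is only an illustration under stated assumptions: the reader function, the demo file name and the use of a bool as an end-of-input marker are all hypothetical, and a local file stands in for a network connection so the example runs without a network.

```d
import std.concurrency : receive, send, spawn, thisTid, Tid;
import std.stdio : File, writeln;

/// Worker: performs ordinary blocking reads and forwards each line as a message.
void reader(Tid owner, string path)
{
    foreach (line; File(path).byLine())
        owner.send(line.idup);   // idup: messages must not share mutable data
    owner.send(true);            // hypothetical end-of-input marker
}

void main()
{
    // Create a small input file so the example is self-contained.
    auto f = File("spawn_demo.txt", "w");
    f.writeln("first");
    f.writeln("second");
    f.close();

    spawn(&reader, thisTid, "spawn_demo.txt");

    // The main thread never blocks on I/O itself; it only handles messages.
    bool done = false;
    while (!done)
    {
        receive(
            (string line) { writeln("received: ", line); },
            (bool finished) { done = finished; }
        );
    }
}
```

One worker per connection is what makes this easy to write and, with kernel threads, what limits how far it scales.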
Jun 30 2010
Sean Kelly <sean invisibleduck.org> writes:
Andrei Alexandrescu Wrote:

> Sean Kelly wrote:
>> Walter Bright Wrote:
>>
>>> Sean Kelly wrote:
>>>> Walter Bright Wrote:
>>>>
>>>>> Adam Ruppe wrote:
>>>>>> My network thing is very simple: it opens a socket, then wraps it up
>>>>>> in a File struct, via FILE*. Then, you can treat it the same way.
>>>>>
>>>>> That's the traditional way to do it, but I'm not so sure it's the right
>>>>> way for the future. Wouldn't it be better to have an interface to it
>>>>> that is a range, rather than pretend it's a FILE* ?
>>>>
>>>> And in either case, if the buffer is empty a get() operation will block,
>>>> correct?
>>>
>>> I'm no expert in network programming, but yes. I suspect there should be
>>> a settable timeout which should throw, too.
>>
>> I've been thinking about this and I think this is probably what the
>> majority of apps want anyway. It isn't scalable, but server apps are a
>> whole 'nother ball of wax.
>
> To obtain asynchronous operation, an app can spawn a secondary thread
> using blocking I/O and passing stuff as messages. Indeed defining many
> secondary threads does become a scalability issue.

Yeah, this is where Java was 10 years ago. It's an easy model to program for but scales terribly, assuming you're talking about kernel threads. If the threads are replaced with fibers and context-switching takes place behind the scenes when a read or write is issued it's actually a pretty cool programming model that should scale quite well. So I suppose it's not a bad model to expose to users, since we'll eventually be 64-bit and fibers will probably be used by spawn() at some point.
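
For readers unfamiliar with fibers, the following sketch shows the kind of switch being described, using core.thread.Fiber. It is a toy under stated assumptions: the dataReady flag is a hypothetical stand-in for "the socket became readable", and the hand-written calls in main play the role of a scheduler that a real library would hide behind read and write.

```d
import core.thread : Fiber;
import std.stdio : writeln;

void main()
{
    bool dataReady = false;

    // Hypothetical connection handler written in plain sequential style.
    auto handler = new Fiber({
        writeln("handler: issuing read");
        while (!dataReady)
            Fiber.yield();   // stand-in for a read that would block
        writeln("handler: read completed");
    });

    handler.call();                      // runs until the first yield
    writeln("scheduler: serving others");
    dataReady = true;                    // pretend the socket became readable
    handler.call();                      // resumes right after the yield
    assert(handler.state == Fiber.State.TERM);
}
```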
Jun 30 2010
Walter Bright <newshound2 digitalmars.com> writes:
Sean Kelly wrote:
> So I suppose it's not a bad model to
> expose to users, since we'll eventually be 64-bit and fibers will probably be
> used by spawn() at some point.

The 64 bit address space eliminates the stack size problems with threads. It's also why I think supporting stack chaining is a waste of time.
Jun 30 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
> Adam Ruppe wrote:
>> My network thing is very simple: it opens a socket, then wraps it up
>> in a File struct, via FILE*. Then, you can treat it the same way.
>
> That's the traditional way to do it, but I'm not so sure it's the right
> way for the future. Wouldn't it be better to have an interface to it
> that is a range, rather than pretend it's a FILE* ?

I initially also thought that a file (or socket etc.) should be a range, but then I realized that that would overload roles. It's best to have a handle/ranges architecture in which the handle (e.g. File) is responsible for opening, closing, and managing the connection, and several ranges are responsible for fetching data in various ways (by character, by chunk, by line etc.)

BTW I'm virtually done implementing readf. I only need to parse floating-point numbers and strtod won't work. Walter, could you please email me your strtod implementation? Thanks.

The current issue with readf (and other similar formatted read routines) is that they require a primitive peek() that looks up the current character in a stream without swallowing it. This is not a FILE* primitive, but can be simulated (slow) by using getc() and ungetc(). Fortunately on GNU's I/O libs peek() is easy to define (actually they have an internal routine by that name), and on Windows dmd uses Walter's I/O library, which again has fast peek().

Andrei
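
The slow getc()/ungetc() simulation mentioned above can be sketched as follows. The peek helper is a hypothetical name used for illustration, not the routine readf uses; it relies only on the C standard guarantee that one character of pushback is available.

```d
import core.stdc.stdio : EOF, FILE, getc, ungetc;
import std.stdio : File, writeln;

/// Hypothetical helper: returns the next character without consuming it,
/// or EOF when the stream is exhausted.
int peek(FILE* fp)
{
    immutable c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);   // push it back so the next read sees it again
    return c;
}

void main()
{
    // Create a small input file so the example is self-contained.
    auto f = File("peek_demo.txt", "w");
    f.write("42 apples");
    f.close();

    f = File("peek_demo.txt", "r");
    auto fp = f.getFP();

    assert(peek(fp) == '4');   // look ahead...
    assert(peek(fp) == '4');   // ...as often as needed
    assert(getc(fp) == '4');   // only getc actually consumes
    assert(peek(fp) == '2');
    writeln("next unread character: ", cast(char) peek(fp));
}
```

Every lookahead costs two library calls, each of which may take the stream lock, which is the overhead a native peek() avoids.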
Jun 30 2010
Walter Bright <newshound2 digitalmars.com> writes:
Steven Schveighoffer wrote:
> I really think D needs to replace FILE * with its own buffering
> scheme.  That way we can control the underlying buffering and have
> access to it.  We can also take advantage of D features that aren't
> available in the underlying code, such as thread local storage to avoid
> taking global locks.

This isn't done because mixing D and C I/O is a desirable property.
Jul 01 2010
Adam Ruppe <destructionator gmail.com> writes:
My network thing is very simple: it opens a socket, then wraps it up
in a File struct, via FILE*. Then, you can treat it the same way.

Simple code, and I've been meaning to commit it to Phobos for Linux at
least, but stuff keeps coming up and I haven't gotten around to it
yet.

Now, there's some controversy over whether std.stdio.File should rely on
FILE*, but that's really an implementation detail that we can fix up
later. I've been worrying about that, which delays my plans to commit
even more, but I don't think we should worry about it right now. Some is
better than none here.
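
A minimal sketch of the socket-in-a-File idea, for illustration only. This is not the code being discussed: it assumes a POSIX system, uses a local socketPair in place of a real network connection, and duplicates the descriptor with dup so that the File and the Socket each own and close their own handle.

```d
// POSIX-only sketch: core.sys.posix.unistd is not available elsewhere.
import core.sys.posix.unistd : dup;
import std.socket : socketPair;
import std.stdio : File, writeln;

void main()
{
    auto pair = socketPair();   // two connected local sockets
    scope (exit) { pair[0].close(); pair[1].close(); }

    // Wrap one end in a File: fdopen attaches a C stream to the descriptor.
    File stream;
    stream.fdopen(dup(cast(int) pair[0].handle), "w");

    // From here on the socket is written like any other file.
    stream.writefln("GET %s", "/index.html");
    stream.close();             // flushes the stdio buffer and closes the copy

    // Read what arrived at the other end with the plain socket API.
    char[64] buf;
    auto n = pair[1].receive(buf[]);
    if (n > 0)
        writeln("peer received: ", buf[0 .. cast(size_t) n]);
}
```

The stream is opened write-only here because a C stream used for both reading and writing needs a flush or seek between the two directions.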


For curl, I had to use it for a personal project last week (needed SSL
support which I don't otherwise have - I've implemented simple HTTP
code on top of the File interface, but no encryption there, so
unsuitable for this task.) Here's the code:
http://arsdnet.net/dcode/curl.d

It is in my typical style of only porting and exposing the bare
minimum to do the job I cared about doing, but maybe it is a good
starting point for others.

The function that does work is:

string curl(string url, string data = null, string contentType =
"application/x-www-form-urlencoded")

If data is null, it does a GET of the url, otherwise a POST with the
given contentType. It returns the data received as a string.
Jun 29 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Tue, 29 Jun 2010 22:39:03 -0400, Walter Bright
<newshound2 digitalmars.com> wrote:

> Robert Jacques wrote:
>> On Tue, 29 Jun 2010 18:26:51 -0400, Walter Bright
>> <newshound2 digitalmars.com> wrote:
>>
>>> Justin Johansson wrote:
>>>> Yes, I agree that libcurl might be a choice as a backend.  Is
>>>> its license okay?
>>>> From http://curl.haxx.se/docs/copyright.html
>>>
>>> Looks ok to me.
>>
>> It doesn't look okay for Phobos to me. The MIT/new BSD license is not
>> BOOST compatible. In particular: "this permission notice appear in all
>> copies" which includes binary copies.
>
> I emailed Daniel Stenberg, the author, about that. His reply is:
>
> =================================================================
> On Mon, 10 May 2010, Walter Bright wrote:
>
>> Hello, I'm Walter Bright, the lead developer on the D programming
>> language. libcurl is the best networking library available, and we'd
>> like to base the D runtime networking support on it. The only issue,
>> though, is the clause in the license "provided that the above copyright
>> notice and this permission notice appear in all copies". Does this
>> include binaries? Or does it just apply to the source code?
>
> Thanks for your interest in libcurl and your question!
>
> The copyright notice is only for the source code, and possibly in the
> documentation. It is NOT for the binaries.
>
> I hope you'll find libcurl to do what you need, and I hope you'll
> discover that the curl-library is a fine list for help and assistance
> if or when you're in need!

Great, but if binaries are excluded from the libcurl license, then binaries don't have any license and are unusable. Besides, legally I don't think he can change the interpretation of his license without changing its text. I'd recommend kindly asking him to grant you/D the right to use libcurl under the BOOST license, for legal reasons.
Jun 30 2010
Adam Ruppe <destructionator gmail.com> writes:
On 6/30/10, Sean Kelly <sean invisibleduck.org> wrote:
> I've been thinking about this and I think this is probably what the majority
> of apps want anyway.  It isn't scalable, but server apps are a whole 'nother
> ball of wax.

Blocking calls are convenient for simple apps, since you just call the read and write functions and don't worry about the packets.

For servers, they are still pretty useful. You can use the select() call on Unix to wait for any one of a set of connections to be ready for you, and when it is, you then call the same blocking read/write functions. Since you know ahead of time that they are ready, it doesn't actually wait. I imagine you could do the same with threads, but I've never actually tried it.

Does this scale well? Honestly, I don't know. Every network server I've ever written has only had a handful of concurrent users anyway.
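
The select-then-read pattern can be sketched with std.socket, whose Socket.select wraps the same system call. This is an illustration under stated assumptions: a local socketPair stands in for client connections, and a real server would loop over many sockets in the set rather than one.

```d
import std.socket : Socket, SocketSet, socketPair;
import std.stdio : writeln;

void main()
{
    auto pair = socketPair();   // two connected local sockets
    scope (exit) { pair[0].close(); pair[1].close(); }

    pair[0].send("hello");      // make the other end readable

    // Register every connection we want to watch for incoming data.
    auto readSet = new SocketSet();
    readSet.add(pair[1]);

    // Blocks until at least one socket in the set is ready to read.
    immutable ready = Socket.select(readSet, null, null);

    if (ready > 0 && readSet.isSet(pair[1]))
    {
        // The same blocking call as in a simple app, but it returns at
        // once because select reported that data is waiting.
        char[64] buf;
        auto n = pair[1].receive(buf[]);
        if (n > 0)
            writeln("read without waiting: ", buf[0 .. cast(size_t) n]);
    }
}
```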
Jun 30 2010
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 30 Jun 2010 13:13:33 -0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

> Walter Bright wrote:
>> Adam Ruppe wrote:
>>> My network thing is very simple: it opens a socket, then wraps it up
>>> in a File struct, via FILE*. Then, you can treat it the same way.
>>
>> That's the traditional way to do it, but I'm not so sure it's the right
>> way for the future. Wouldn't it be better to have an interface to it
>> that is a range, rather than pretend it's a FILE* ?
>
> I initially also thought that a file (or socket etc.) should be a range,
> but then I realized that that would overload roles. It's best to have a
> handle/ranges architecture in which the handle (e.g. File) is responsible
> for opening, closing, and managing the connection, and several ranges are
> responsible for fetching data in various ways (by character, by chunk, by
> line etc.)
>
> BTW I'm virtually done implementing readf. I only need to parse
> floating-point numbers and strtod won't work. Walter, could you please
> email me your strtod implementation? Thanks.
>
> The current issue with readf (and other similar formatted read routines)
> is that they require a primitive peek() that looks up the current
> character in a stream without swallowing it. This is not a FILE*
> primitive, but can be simulated (slow) by using getc() and ungetc().
> Fortunately on GNU's I/O libs peek() is easy to define (actually they
> have an internal routine by that name), and on Windows dmd uses Walter's
> I/O library, which again has fast peek().

I really think D needs to replace FILE * with its own buffering scheme. That way we can control the underlying buffering and have access to it. We can also take advantage of D features that aren't available in the underlying code, such as thread local storage to avoid taking global locks.

-Steve
Jul 01 2010
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 01 Jul 2010 12:31:19 -0400, Walter Bright
<newshound2 digitalmars.com> wrote:

> Steven Schveighoffer wrote:
>> I really think D needs to replace FILE * with its own buffering
>> scheme.  That way we can control the underlying buffering and have
>> access to it.  We can also take advantage of D features that aren't
>> available in the underlying code, such as thread local storage to avoid
>> taking global locks.
>
> This isn't done because mixing D and C I/O is a desirable property.

It can still be this way, just have a special D buffer implementation that outputs to a FILE * or reads from it. But I shouldn't be *required* to deal with FILE * when I open a file or network socket and only ever use it in D. Even though it's desirable, there are considerable drawbacks that need to be justified.

-Steve
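
One possible reading of "a D buffer implementation that outputs to a FILE *" is sketched below. The FileSink type, its put and flush methods and the buffer capacity are hypothetical names chosen for this illustration, not an existing or proposed Phobos API. Buffering happens in D-owned memory, and the C stream is touched only on flush.

```d
import core.stdc.stdio : FILE, fflush, fwrite;
import std.algorithm.comparison : min;
import std.stdio : stdout;

/// Hypothetical D-side buffer that hands its contents to a C FILE* on flush.
struct FileSink
{
    private FILE* target;      // C stream that finally receives the bytes
    private ubyte[] storage;   // D-owned buffer, fully under our control
    private size_t used;       // number of buffered bytes

    this(FILE* target, size_t capacity = 4096)
    {
        assert(capacity > 0, "capacity must be positive");
        this.target = target;
        this.storage = new ubyte[capacity];
    }

    /// Appends text to the buffer, flushing whenever the buffer fills up.
    void put(const(char)[] text)
    {
        auto data = cast(const(ubyte)[]) text;
        while (data.length)
        {
            if (used == storage.length)
                flush();
            immutable n = min(data.length, storage.length - used);
            storage[used .. used + n] = data[0 .. n];
            used += n;
            data = data[n .. $];
        }
    }

    /// Hands everything buffered so far to the C stream.
    void flush()
    {
        if (used)
            fwrite(storage.ptr, 1, used, target);
        used = 0;
        fflush(target);
    }
}

void main()
{
    auto sink = FileSink(stdout.getFP(), 8);   // tiny buffer to force flushes
    sink.put("buffered in D, ");
    sink.put("written through FILE*\n");
    sink.flush();
}
```

The same struct could write to a raw file descriptor or socket instead by swapping the body of flush, which is the flexibility being argued for.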
Jul 01 2010