digitalmars.D - Working with files over 2GB in D2
- dsimcha <dsimcha yahoo.com> Oct 16 2009
- Jeremie Pelletier <jeremiep gmail.com> Oct 16 2009
- Jeremie Pelletier <jeremiep gmail.com> Oct 16 2009
- dsimcha <dsimcha yahoo.com> Oct 16 2009
- Frank Benoit <keinfarbton googlemail.com> Oct 16 2009
- Jeremie Pelletier <jeremiep gmail.com> Oct 16 2009
- Frank Benoit <keinfarbton googlemail.com> Oct 17 2009
- Christopher Wright <dhasenan gmail.com> Oct 17 2009
- Jeremie Pelletier <jeremiep gmail.com> Oct 17 2009
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Oct 17 2009
- dsimcha <dsimcha yahoo.com> Oct 17 2009
- language_fan <foo bar.com.invalid> Oct 17 2009
Does anyone know how to work with huge (2GB+) files in D2? std.stream has overflow bugs (I haven't isolated them yet) and can't return a stream's size correctly; std.stdio.File throws a ConvOverflowError in seek() because fseek() apparently takes an int when it should take a long; and std.file only supports reading the whole file, which I can't do in a 2GB address space. It appears that none of the file I/O in Phobos has been tested on huge files (until now).
Oct 16 2009
dsimcha wrote: Does anyone know how to work with huge (2GB+) files in D2? std.stream has overflow bugs (I haven't isolated them yet) and can't return a stream's size correctly; std.stdio.File throws a ConvOverflowError in seek() because fseek() apparently takes an int when it should take a long; and std.file only supports reading the whole file, which I can't do in a 2GB address space. It appears that none of the file I/O in Phobos has been tested on huge files (until now).
What platform are you using? You should report your issue on Bugzilla. I had similar issues on Windows when using stdio's fseek and ftell; I had no problems using GetFilePointerEx, so you could try that while it is being fixed. Jeremie
Oct 16 2009
Jeremie Pelletier wrote: What platform are you using? You should report your issue on Bugzilla. I had similar issues on Windows when using stdio's fseek and ftell; I had no problems using GetFilePointerEx, so you could try that while it is being fixed. Jeremie
I meant SetFilePointerEx :x
Oct 16 2009
== Quote from Jeremie Pelletier (jeremiep gmail.com)'s article: What platform are you using? You should report your issue on Bugzilla. I had similar issues on Windows when using stdio's fseek and ftell; I had no problems using GetFilePointerEx, so you could try that while it is being fixed. Jeremie
Mostly Linux. Everything seems to be working on Windows, though I haven't tested it that thoroughly. I will file Bugzillas eventually, but I'm still trying to understand some of these issues, i.e. to what extent they're limitations vs. real bugs. What I'm really interested in knowing is:
1. To what extent is the trouble with 2GB+ files a platform limitation rather than a real bug? (I vaguely understand that it has to do with files being indexed by signed ints, but I don't know the details of how it's implemented on each platform and what is different between platforms.)
2. Does anyone know of a method of doing file I/O in D2 that is well-tested with files above 2GB?
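For reference, the 2GB boundary in question 1 is where a signed 32-bit offset runs out: its largest value is 2^31 - 1 = 2,147,483,647 bytes. C's fseek() takes a `long`, which is 32 bits wide on 32-bit Linux and on Windows, so larger offsets cannot pass through it. Below is a minimal C sketch of the range check involved. The names `fits_in_int32` and `narrow_offset` are hypothetical helpers for illustration, not part of Phobos or the C library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a 64-bit file offset can be stored in a
   signed 32-bit integer without overflow. */
static bool fits_in_int32(int64_t offset)
{
    return offset >= INT32_MIN && offset <= INT32_MAX;
}

/* Hypothetical helper: narrow a 64-bit offset to 32 bits. The caller is
   expected to have checked the range first; the assert documents that. */
static int32_t narrow_offset(int64_t offset)
{
    assert(fits_in_int32(offset));
    return (int32_t)offset;
}
```

An offset of 2,147,483,647 passes the check, while 2,147,483,648 (exactly 2GB) does not, which is why any API that funnels offsets through a 32-bit signed type stops working at that size.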
Oct 16 2009
dsimcha wrote: Mostly Linux. Everything seems to be working on Windows, though I haven't tested it that thoroughly. I will file Bugzillas eventually, but I'm still trying to understand some of these issues, i.e. to what extent they're limitations vs. real bugs. What I'm really interested in knowing is: 1. To what extent is the trouble with 2GB+ files a platform limitation rather than a real bug? (I vaguely understand that it has to do with files being indexed by signed ints, but I don't know the details of how it's implemented on each platform and what is different between platforms.) 2. Does anyone know of a method of doing file I/O in D2 that is well-tested with files above 2GB?
Tango has full support for that. On the Linux platform, there are two C APIs: one limited to 2GB and one for LFS (Large File Support).
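As an illustration of the LFS interface, here is a minimal C sketch, assuming a glibc-style Linux system. Defining `_FILE_OFFSET_BITS` to 64 before any system header makes `off_t` 64 bits wide even on 32-bit targets, and fseeko()/ftello() then accept offsets past 2GB. The name `seek_and_tell` is a hypothetical helper, not an existing API.

```c
/* These must come before any system header so that off_t is 64 bits
   wide and fseeko()/ftello() are declared. */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: seek to an absolute offset and return the position
   the stream reports afterwards, or -1 on error. */
static int64_t seek_and_tell(FILE *f, int64_t offset)
{
    /* Sanity check: large-file offsets must actually be enabled. */
    assert(sizeof(off_t) >= 8);

    if (fseeko(f, (off_t)offset, SEEK_SET) != 0)
        return -1;
    return (int64_t)ftello(f);
}
```

Seeking past the end of a file does not grow it until something is written, so this can be tried on an empty temporary file (for example one returned by tmpfile()) without consuming disk space.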
Oct 16 2009
Frank Benoit wrote: Tango has full support for that. On the Linux platform, there are two C APIs: one limited to 2GB and one for LFS (Large File Support).
I just had a quick peek at std.stdio; it uses the C standard library for file I/O on every platform. Phobos should support the CreateFile-related APIs on Windows and LFS on Linux to get around quirks like that 2GB limitation. Jeremie
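A rough C sketch of what such a platform split could look like, under stated assumptions: the Windows branch uses CreateFileA and SetFilePointerEx, the POSIX branch uses open and lseek with a 64-bit `off_t`. The names `big_file`, `big_open` and `big_seek` are hypothetical and chosen only for this illustration; this is not how Phobos or Tango implement it.

```c
/* Enable 64-bit off_t on POSIX systems; harmless on Windows. */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
typedef HANDLE big_file;   /* hypothetical handle type */
#else
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
typedef int big_file;      /* hypothetical handle type */
#endif

/* Hypothetical helper: open (or create) a file for reading and writing. */
static big_file big_open(const char *path)
{
#ifdef _WIN32
    return CreateFileA(path, GENERIC_READ | GENERIC_WRITE, 0, NULL,
                       OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
#else
    return open(path, O_RDWR | O_CREAT, 0600);
#endif
}

/* Hypothetical helper: move to an absolute 64-bit offset and return the
   new position, or -1 on failure. */
static int64_t big_seek(big_file f, int64_t offset)
{
#ifdef _WIN32
    LARGE_INTEGER target, result;
    target.QuadPart = offset;
    if (!SetFilePointerEx(f, target, &result, FILE_BEGIN))
        return -1;
    return (int64_t)result.QuadPart;
#else
    /* off_t must be 64 bits wide for offsets past 2GB to survive. */
    assert(sizeof(off_t) >= 8);
    return (int64_t)lseek(f, (off_t)offset, SEEK_SET);
#endif
}
```

The point of the design is that callers only ever see a 64-bit offset, and the narrowing (if any) is confined to one platform-specific function.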
Oct 16 2009
Jeremie Pelletier wrote: I just had a quick peek at std.stdio; it uses the C standard library for file I/O on every platform. Phobos should support the CreateFile-related APIs on Windows and LFS on Linux to get around quirks like that 2GB limitation. Jeremie
In Tango, search for "__USE_LARGEFILE64" to find the relevant places. Not only are different functions used; the types and structures are different as well.
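To show what "different functions, types and structures" means at the C level, here is a sketch assuming glibc on Linux. Defining `_LARGEFILE64_SOURCE` (which is what makes glibc set `__USE_LARGEFILE64` internally) exposes a parallel set of 64-bit names such as open64, lseek64, fstat64, `off64_t` and `struct stat64`. The helper names `open_large`, `seek64` and `file_size64` are hypothetical.

```c
/* Expose the explicit 64-bit (LFS64) names next to the regular ones. */
#define _LARGEFILE64_SOURCE
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open (or create and truncate) a file through the
   explicit 64-bit call. */
static int open_large(const char *path)
{
    return open64(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

/* Hypothetical helper: absolute seek using the 64-bit offset type. */
static int64_t seek64(int fd, int64_t offset)
{
    off64_t pos = lseek64(fd, (off64_t)offset, SEEK_SET);
    return (int64_t)pos;
}

/* Hypothetical helper: file size in bytes via the 64-bit stat structure,
   or -1 on error. */
static int64_t file_size64(int fd)
{
    struct stat64 st;
    if (fstat64(fd, &st) != 0)
        return -1;
    return (int64_t)st.st_size;
}
```

A binding written against the 32-bit names (`off_t`, `struct stat`) has a different memory layout from one written against the 64-bit names, which is why a language binding has to pick its declarations carefully rather than just swap function names.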
Oct 17 2009
language_fan wrote: Sat, 17 Oct 2009 10:58:15 +0200, Frank Benoit thusly wrote: In Tango, search for "__USE_LARGEFILE64" to find the relevant places. Not only are different functions used; the types and structures are different as well.
I think there was some talk about merging Tango and Phobos, but now that Tango has been abandoned (no D2 port is planned, it seems), would it make sense to rewrite those parts of Tango that are missing in Phobos, and license them using a more liberal practical license?
Abandoned?! Nobody has abandoned Tango. Tango hasn't been ported to D2 because it's too much of a moving target.
Oct 17 2009
Christopher Wright wrote: language_fan wrote: I think there was some talk about merging Tango and Phobos, but now that Tango has been abandoned (no D2 port is planned, it seems), would it make sense to rewrite those parts of Tango that are missing in Phobos, and license them using a more liberal practical license?
Abandoned?! Nobody has abandoned Tango. Tango hasn't been ported to D2 because it's too much of a moving target.
I think he took the April 1st post about Tango moving to Python seriously :)
Oct 17 2009
dsimcha wrote: Mostly Linux. Everything seems to be working on Windows, though I haven't tested it that thoroughly. I will file Bugzillas eventually, but I'm still trying to understand some of these issues, i.e. to what extent they're limitations vs. real bugs. What I'm really interested in knowing is: 1. To what extent is the trouble with 2GB+ files a platform limitation rather than a real bug? (I vaguely understand that it has to do with files being indexed by signed ints, but I don't know the details of how it's implemented on each platform and what is different between platforms.) 2. Does anyone know of a method of doing file I/O in D2 that is well-tested with files above 2GB?
No, but I'd be glad to fix any bugs you may find in std.stdio. I fixed a couple myself, but it looks like there are more to go. Andrei
Oct 17 2009
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article: No, but I'd be glad to fix any bugs you may find in std.stdio. I fixed a couple myself, but it looks like there are more to go. Andrei
Yeah, I've filed a few Bugzillas. I really didn't anticipate large file support being missing, and I need it badly and soon, so I'd be willing to help out to make that happen.
Oct 17 2009
Sat, 17 Oct 2009 10:58:15 +0200, Frank Benoit thusly wrote: In Tango, search for "__USE_LARGEFILE64" to find the relevant places. Not only are different functions used; the types and structures are different as well.
I think there was some talk about merging Tango and Phobos, but now that Tango has been abandoned (no D2 port is planned, it seems), would it make sense to rewrite those parts of Tango that are missing in Phobos, and license them using a more liberal practical license?
Oct 17 2009