digitalmars.D - Memory Mapped File Access

reply Robert <robert.muench robertmuench.de> writes:
Hi, has anyone played around with D and memory mapped files on Windows / Linux?

A friend of mine and I want to use D to develop a D native 
database-system. Yes, sounds crazy and it will take long and we haven't 
done a lot yet. So don't expect anything to look at soon :-)

Thanks Robert.
May 28 2010
next sibling parent reply Bane <branimir.milosavljevic gmail.com> writes:
Robert Wrote:

 Hi, has anyone played around with D and memory mapped files on Windows / Linux?
 
 A friend of mine and I want to use D to develop a D native 
 database-system. Yes, sounds crazy and it will take long and we haven't 
 done a lot yet. So don't expect anything to look at soon :-)
 
 Thanks Robert.
 
MMapped files work just fine; I have played with them and still do.

I welcome your idea of learning how to build a database; it's a great way for people who program that kind of thing to spend their time.

And you are right: trying to make an operational database like this is a crazy, crazy idea. It will require a HUGE investment of time and learning from you to make it even remotely reliable and usable. Here I am talking about ACID compliance. If you are trying to build something that works *most* of the time and saves key -> value pairs in a file, then it is much simpler.

SQLite is a great project you could learn a lot from. It has tons of useful docs about making a DB, it's open source, it's been around for 10 years, and it's probably a better job than you could ever do.
May 28 2010
parent reply Robert <robert.muench robertmuench.de> writes:
On 2010-05-28 13:06:04 +0200, Bane <branimir.milosavljevic gmail.com> said:

 MMapped files work just fine, I played/am playing with them.
I posted before I saw that there is a MmFile class :-) So, I have my first questions:

1. How can I expand the size of a MMF after it was created?
2. If I specify a 100 GB file size, will it always be written to disk in full even if there is nothing in it? Or does the OS use sparse files as well?
 I greet your idea to learn how to build database, its great way to 
 spend time for people programming that stuff.
 
 And you are right - trying to make operational database like this is 
 crazy, crazy idea. It will require from you HUGE investment in time & 
 learning to make it remotely reliable and usable. Here I talk about 
 ACID compliance. If you are trying to build something that works 
 *mostly* of the time and saves key -> value pairs in file then it is 
 much simpler.
Ok, I need to be fair. We are quite good at these things. Does anyone remember Adimens from the Atari (later for Windows as well)? It was/is an ACID-compliant SQL database with row-level locking etc.

And it was written by my friend and sold more than 500,000 copies.

Yes, we are crazy... but chances are high we will get something done. I need to get some practice with D, but that shouldn't be that hard.
 SQLite is a great project you could learn a lot from. It has tons of 
 useful docs about making DB, its open source, its been around for 10 
 years and its probably better job then you could ever do.
Yep, I have been using it for several years. A great piece of software. -- Robert
May 28 2010
parent reply Robert <robert.muench robertmuench.de> writes:
On 2010-05-28 23:41:46 +0200, Robert <robert.muench robertmuench.de> said:

 1. How can I expand the size of a MMF after it was created?
Replying to myself: simple. Close the file and re-open it with the new size. The old content is kept. This should be added to the docs, as it's not totally clear.
 2. If I specify 100GB file-size will it always be written once to disk 
 even if there is nothing in it? Or does the OS use sparse-files as well?
What I have found out so far: the full file size is used on disk as soon as things are flushed. So an MMF needs to grow in chunks to be really useful. -- Robert M. Münch http://www.robertmuench.de
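The close-and-reopen approach described above can be sketched as follows. This is a minimal illustration in Python rather than D, because the underlying behaviour comes from the OS and not from std.mmfile; the helper name `open_mapping`, the file name and the sizes are all assumptions made for the example.

```python
import mmap
import os
import tempfile

def open_mapping(path, size):
    """Open (or create) `path`, make it at least `size` bytes, and map it read/write."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)   # extend the file; existing bytes are untouched
        return mmap.mmap(fd, size)   # CPython's mmap keeps its own duplicate of fd
    finally:
        os.close(fd)

# Hypothetical file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.mmf")

mm = open_mapping(path, 4096)
mm[0:5] = b"hello"
mm.flush()
mm.close()                           # step 1: close the mapping

mm = open_mapping(path, 8192)        # step 2: re-open with the new, larger size
assert mm[0:5] == b"hello"           # the old content is kept
assert len(mm) == 8192
mm.close()
```

Growing in chunks then amounts to calling the helper again with the next larger size whenever the current mapping runs out of room.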
May 29 2010
next sibling parent div0 <div0 users.sourceforge.net> writes:

Robert wrote:
 On 2010-05-28 23:41:46 +0200, Robert <robert.muench robertmuench.de> said:
 
 1. How can I expand the size of a MMF after it was created?
Replying to myself: Simple, close the file, and re-open with new size. Old content is kept. This should be added to the docs, as it's not totally clear.
 2. If I specify 100GB file-size will it always be written once to disk
 even if there is nothing in it? Or does the OS use sparse-files as well?
Current things I found out: Filesize is used as soon as things are flushed to disk. So MMF need to grow in chunks to be real useful. -- Robert M. Münch http://www.robertmuench.de
NTFS supports sparse files: http://msdn.microsoft.com/en-us/library/aa365566(v=VS.85).aspx

Not sure how you're going to get that to play with the Phobos memory-mapped file. I'm guessing you'd be better off writing your own accessor so you can explicitly support each target platform.

-- My enormous talent is exceeded only by my outrageous laziness. http://www.ssTk.co.uk
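Whether a large, mostly empty file really occupies disk space can be checked by comparing its apparent size with the space allocated for it. The following is a rough sketch in Python, not D; the helper `file_usage`, the file name and the 64 MiB size are assumptions for illustration. The result depends on the OS and filesystem: many Linux filesystems leave the untouched range unallocated, while on NTFS a file generally has to be marked sparse explicitly, which this sketch does not do.

```python
import os
import tempfile

def file_usage(path):
    """Return (apparent size, bytes allocated on disk or None if unknown)."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on POSIX systems; Windows does not provide it.
    allocated = st.st_blocks * 512 if hasattr(st, "st_blocks") else None
    return st.st_size, allocated

# Hypothetical file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "big.mmf")
with open(path, "wb") as f:
    f.truncate(64 * 1024 * 1024)     # ask for 64 MiB without writing any data

apparent, allocated = file_usage(path)
print("apparent size:", apparent)
print("allocated    :", allocated)
os.remove(path)
```

If the allocated figure stays far below the apparent size, the filesystem is storing the file sparsely.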
May 29 2010
prev sibling next sibling parent BLS <windevguy hotmail.de> writes:
On 29/05/2010 10:17, Robert wrote:
 On 2010-05-28 23:41:46 +0200, Robert <robert.muench robertmuench.de> said:

 1. How can I expand the size of a MMF after it was created?
Replying to myself: Simple, close the file, and re-open with new size. Old content is kept. This should be added to the docs, as it's not totally clear.
 2. If I specify 100GB file-size will it always be written once to disk
 even if there is nothing in it? Or does the OS use sparse-files as well?
Current things I found out: Filesize is used as soon as things are flushed to disk. So MMF need to grow in chunks to be real useful. -- Robert M. Münch http://www.robertmuench.de
Hi Robert,

I think the std.mmfile documentation in Phobos is not up to date. AFAIK there is an unMap method. I have posted a new message regarding this topic. Regarding chunks, please see my previous message.

Bjoern
May 29 2010
prev sibling parent reply Robert <robert.muench robertmuench.de> writes:
On 2010-05-29 10:17:34 +0200, Robert <robert.muench robertmuench.de> said:

 Replying to myself: Simple, close the file, and re-open with new size. 
 Old content is kept. This should be added to the docs, as it's not 
 totally clear.
This could become problematic if the app holds some slices into the MMF and the second open maps to a different memory address. In this case all references to the MMF are invalid.

I saw that it's possible to specify an explicit address, but it does not seem to be guaranteed that the OS will use this address.

Going to do some tests to see if this holds or not. Returned values from the MMF (even index offsets) of course stay the same and are valid. -- Robert M. Münch http://www.robertmuench.de
May 29 2010
parent Robert <robert.muench robertmuench.de> writes:
On 2010-05-29 21:58:59 +0200, Robert <robert.muench robertmuench.de> said:

 This could become problematic if the app holds some slices to the MMF 
 and the 2nd open maps to a different memory address. In this case all 
 references to the MMF are invalid.
In this case a "core.exception.RangeError ...: Range violation" is thrown. At least this makes it possible to identify illegal references.
 I saw that it's possible to specify an explicit address but it seems 
 not to be ensured that this address is used by the OS.
 
 Going to do some tests to see if this holds or not. Returned values 
 from the MMF (even index offsets) of course stay the same and are valid.
Even if this were guaranteed it wouldn't help: old references are invalid, which is IMO a good thing.

So, conclusion: it's possible to close an MMF while there are still references to it, but those references become invalid and throw an exception. Hence this situation can be identified and handled. -- Robert M. Münch http://www.robertmuench.de
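The distinction between references into a mapping and offsets into the file can be illustrated with a small sketch. It is written in Python, not D, so the exception differs from D's RangeError: CPython raises ValueError when a closed mapping is used. The file name and the offset are assumptions for the example.

```python
import mmap
import os
import tempfile

# Hypothetical file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "refs.mmf")
with open(path, "wb") as f:
    f.truncate(4096)                  # create a 4 KiB file to map

f = open(path, "r+b")
mm = mmap.mmap(f.fileno(), 0)         # length 0 maps the whole file
offset = 100                          # a position in the file, not a memory address
mm[offset:offset + 5] = b"hello"
mm.close()                            # the old mapping object is now dead

try:
    mm[offset]                        # using the stale mapping is detected
    stale_access_detected = False
except ValueError:
    stale_access_detected = True

f.truncate(8192)                      # grow the file, then map it again
mm = mmap.mmap(f.fileno(), 0)         # may land at a different address
recovered = mm[offset:offset + 5]     # the same offset still finds the data
mm.close()
f.close()
```

Storing offsets instead of slices or pointers therefore survives a re-map, at the cost of one extra indexing step per access.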
May 30 2010
prev sibling next sibling parent reply BLS <windevguy hotmail.de> writes:
On 28/05/2010 09:28, Robert wrote:
 Hi, has anyone played around with D and memory mapped files on Windows /
 Linux?

 A friend of mine and I want to use D to develop a D native
 database-system. Yes, sounds crazy and it will take long and we haven't
 done a lot yet. So don't expect anything to look at soon :-)

 Thanks Robert.
Hi Robert,

in contrast to Bane, I think this job is doable and makes perfect sense. In fact the Suneido programming system uses a memory-mapped file to create a modern database that is used in practice. The database needs remarkably few lines of code. See for yourself. I would also say that D is the perfect language to implement such a system.

Cookbook for the Suneido DB: C++, MMAP file, slightly modified BTree indexing system (IMHO skip lists are preferable), Boehm GC in SVN (the regular download uses a home-brewed GC), memory chunk support.

Features: C/S database, ATOMIC, RAL (relational algebra, following C.J. Date, set theory) instead of SQL.

Link > http://www.suneido.com/index.php?option=com_content&task=view&id=49&Itemid=1

Limits: Database size, but I think the size limit is acceptable on 64-bit engines. Number of concurrent accesses without hassle: 35-50 users.

HTH Bjoern

Just this: I would choose Phobos MMAP over Tango MMAP. Compare them yourself.
May 29 2010
parent Robert <robert.muench robertmuench.de> writes:
On 2010-05-29 17:21:18 +0200, BLS <windevguy hotmail.de> said:

 in opposite to Bane I think this Job is doable and makes perfectly sense.
Hi Bjoern, thanks for supporting this idea. IMO, used in a smart way, MMFs make a lot of things simpler. Especially with D's array slices I see a perfect match.
 In fact the Suneido programming system is using a memory mapped file to 
 create a modern (and used in practice) database.
 Database lines of code are remarkable less.. See yourself..
Thanks for the link.
 Just this. I would choose Phobos MMAP over Tango MMAP.  compare it by yourself.
I'm currently using D2 with Phobos. So far it works very well. -- Robert M. Münch http://www.robertmuench.de
May 29 2010
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Robert Wrote:

 Hi, has anyone played around with D and memory mapped files on Windows / Linux?
 
 A friend of mine and I want to use D to develop a D native 
 database-system. Yes, sounds crazy and it will take long and we haven't 
 done a lot yet. So don't expect anything to look at soon :-)
Andrei and I had talked a while back about adding memory-mapped file support to the GC and then it fell off the radar while we worked on other things. I'll see if I can remember how it was to work.
May 29 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/29/2010 10:40 AM, Sean Kelly wrote:
 Robert Wrote:

 Hi, has anyone played around with D and memory mapped files on
 Windows / Linux?

 A friend of mine and I want to use D to develop a D native
 database-system. Yes, sounds crazy and it will take long and we
 haven't done a lot yet. So don't expect anything to look at soon
 :-)
Andrei and I had talked a while back about adding memory-mapped file support to the GC and then it fell off the radar while we worked on other things. I'll see if I can remember how it was to work.
The basic idea is that the only way to handle memory-mapped files safely is to let the garbage collector close them. This is because in any other case you'd have dangling pointers.

So the idea is that druntime should provide a safe means for mapping a file to memory and an unsafe means of closing a file.

Safe code should be able to count on the garbage collector to close memory-mapped files that have no pointers referring to them.

Andrei
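For comparison, here is how another runtime deals with the same dangling-reference hazard. This is a Python sketch, not the druntime design described above: CPython's mmap refuses to close a mapping while a memoryview into it is still exported, and allows the close once the view is released. The anonymous 4 KiB mapping is an assumption for illustration.

```python
import mmap

mm = mmap.mmap(-1, 4096)              # anonymous mapping, no file needed
view = memoryview(mm)[0:16]           # a live reference into the mapped memory

try:
    mm.close()                        # would leave `view` dangling
    close_refused = False
except BufferError:                   # CPython rejects the close instead
    close_refused = True

view.release()                        # drop the last reference into the mapping
mm.close()                            # now the close goes through
closed_after_release = mm.closed
```

This takes the opposite trade-off to an unsafe explicit close: determinism is only available once the caller has proven there are no outstanding views.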
May 29 2010
parent reply Robert <robert.muench robertmuench.de> writes:
On 2010-05-29 18:30:13 +0200, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Andrei and I had talked a while back about adding memory-mapped file
 support to the GC and then it fell off the radar while we worked on
 other things.  I'll see if I can remember how it was to work.
The basic idea is that the only way to handle memory-mapped files safely is to let the garbage collector close them. This is because in any other case you'd have dangling pointers.
Ok, that makes sense. On the other hand I will use a very simple rule of thumb: as long as the app runs, the file is open. Only when the app terminates does the file get closed. This implies that the reference to the MMF is a global, but this shouldn't be a problem.
 So the idea is that druntime should provide a safe means for mapping a 
 file to memory and an unsafe means of closing a file.
Why should it provide an unsafe way of closing a MMF file?
 Safe code should be able to count on the garbage collector to close 
 memory-mapped files that have no pointers referring to them.
Have you sketched any ideas how to use the GC for MMF? -- Robert M. Münch http://www.robertmuench.de
May 29 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/29/2010 02:52 PM, Robert wrote:
 On 2010-05-29 18:30:13 +0200, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Andrei and I had talked a while back about adding memory-mapped file
 support to the GC and then it fell off the radar while we worked on
 other things. I'll see if I can remember how it was to work.
The basic idea is that the only way to handle memory-mapped files safely is to let the garbage collector close them. This is because in any other case you'd have dangling pointers.
Ok, that makes sense. On the other hand I will use a very simple rule-of-thumb: As long as the app runs the file is open. Only if the app terminates the file gets closed.
That might be the case if no collection ensues, but if a collection does, the runtime should attempt to reclaim available resources.
 Which implies that the reference to the MMF is a global but this
 shouldn't be a problem.

 So the idea is that druntime should provide a safe means for mapping a
 file to memory and an unsafe means of closing a file.
Why should it provide an unsafe way of closing a MMF file?
For applications concerned with deterministic closing of a file (e.g. following writing) and that are willing to take the unsafety risk.
 Safe code should be able to count on the garbage collector to close
 memory-mapped files that have no pointers referring to them.
Have you sketched any ideas how to use the GC for MMF?
There's nothing to sketch, really. The runtime tracks the opened memory-mapped files and the memory ranges associated with them. Upon a collection, if a file's memory has been successfully freed, the file can be safely closed. Andrei
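A toy model of that bookkeeping, written in Python rather than inside druntime, might look like the sketch below. Everything in it is hypothetical: the `open_mappings` registry, the `MappedRegion` wrapper and the `_reclaim` callback only mimic the idea of a runtime that tracks open mappings and closes one when nothing refers to it any more. It leans on `weakref.finalize`, which runs the callback once the wrapper object has been reclaimed.

```python
import gc
import mmap
import weakref

open_mappings = {}                    # registry: key -> mapping that is still open

def _reclaim(key):
    """Called when no reference to the region is left: close and forget the mapping."""
    open_mappings.pop(key).close()

class MappedRegion:
    """Hypothetical wrapper; user code holds references to this, never to the mmap."""

    def __init__(self, key, size):
        self._mm = mmap.mmap(-1, size)           # anonymous mapping for the demo
        open_mappings[key] = self._mm
        weakref.finalize(self, _reclaim, key)    # holds no strong reference to self

    def write(self, offset, data):
        self._mm[offset:offset + len(data)] = data

    def read(self, offset, length):
        return self._mm[offset:offset + length]

region = MappedRegion("demo", 4096)
region.write(0, b"abc")
first_bytes = region.read(0, 3)

tracked = open_mappings["demo"]       # kept only to observe what happens next
del region                            # last reference to the region goes away
gc.collect()                          # a collection reclaims it if needed
print(first_bytes, tracked.closed, "demo" in open_mappings)
```

The sketch only stays safe because callers never get hold of the raw mapping; a real implementation would have to account for interior pointers and slices as well.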
May 29 2010
parent Robert <robert.muench robertmuench.de> writes:
On 2010-05-29 23:23:06 +0200, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 There's nothing to sketch, really. The runtime tracks the opened 
 memory-mapped files and the memory ranges associated with them. Upon a 
 collections, if a file's memory has been successfully freed, the file 
 can be safely closed.
Is this something that needs to be integrated into the deeper D levels, or is this something I could plug on top of it? I'm not yet that familiar with the GC internals. -- Robert M. Münch http://www.robertmuench.de
May 29 2010
prev sibling parent BLS <windevguy hotmail.de> writes:
On 29/05/2010 21:52, Robert wrote:
 Ok, that makes sense. On the other hand I will use a very simple
 rule-of-thumb: As long as the app runs the file is open. Only if the app
 terminates the file gets closed.

 Which implies that the reference to the MMF is a global but this
 shouldn't be a problem.
From the docs: "File is closed when the object instance is deleted." Maybe we could have more precise docs, please.
May 29 2010
prev sibling parent Bane <branimir.milosavljevic gmail.com> writes:
 Ok, I need to be fair. We are quite good at these things. Anyone 
 remembering Adimens from the Atari (later for Windows as well)? It 
 was/is a ACID compliant SQL database with row-level locking etc.
 
 And, it was written by my friend and sold more than 500.000 times.
 
 Yes, we are crazy... but chances are high we will get something done. I 
 need to get some practice with D but shouldn't be that hard.
Then this is a whole new ball game. I did the scare-you-into-thinking-twice routine so you wouldn't start too ambitiously, but it seems you have enough experience to know exactly what you are getting into. I think D is perfect for the job: far fewer lines of code than the C family, with the same or more power. I hope your project will make you famous, along with D :)
May 29 2010