digitalmars.D - Memory leak - only with large data set

"Roger" <rop411 gmail.com> writes:
Hi,

We are developing a D DLL (v 2.060) that is called (via P/Invoke) by a 
C# client. The DLL is responsible for filtering large amounts of data. 
It consists of some functions (for example GetRetailSale below) 
that filter the data based on some criteria (defined in state). 
The cached list consists of about 500000 items 
(ExtendedOrderItemDTO) and when we use a search criteria that 
results in a large amount of items we get a memory leak of about 
90MB on each run. With results containing fewer items, there 
seems to be no leak at all.

The cached list of orderitems (Cache.OrderItems), is shared and 
is initialized when the dll is first loaded and is then never 
modified.

Any ideas?


public static shared class Cache
{
   private static shared ExtendedOrderItemDTO[] _orderItems;
   public static @property ExtendedOrderItemDTO[] OrderItems()
   {
     return cast(ExtendedOrderItemDTO[]) _orderItems;
   }
}


export extern(C) wchar* GetRetailSale(State state, bool includeZeroResults)
{
   ExtendedOrderItemDTO[] retailersOrderItems = GetRetailersOrderItems(state);
   // code...
   return toUTFz!(wchar*)(message);
}

public ExtendedOrderItemDTO[] GetRetailersOrderItems(State state)
{
	auto result=filter!( (ExtendedOrderItemDTO x) =>
	  x._SaleDate >= state.StatePeriod.From &&
	  x._SaleDate <= state.StatePeriod.To &&
	  (
	  state.Brand.Level == 0 ||
	  (state.Brand.Level == 1 && state.Brand.ID == x.BrandID) ||
	  (state.Brand.Level == 2 && state.Brand.ID == x.VariantID) ||
	  (state.Brand.Level == 3 && state.Brand.ID == x.ProductID)
	) &&
	(
	  state.Channel.Level == 0 ||
	  (state.Channel.Level == 1 && state.Channel.ID == x.Channel) ||
	  (state.Channel.Level == 2 && state.Channel.ID == x.SubChannel)
	) &&
	(
	  state.ProductType == 2 ||
	  (state.ProductType == 0 && x.IsFMC == true) ||
	  (state.ProductType == 1 && x.IsFMC == false)	
	)
	)(Cache.OrderItems);

	return array(result);
}

public struct ExtendedOrderItemDTO
{
	public int ID;
	public eInvoiceType InvoiceType;
	public long VirtualStickCount;
	public wchar* Currency;
	public wchar* SaleDate;
	public double Price;

	public long ProductID;
	public bool IsFMC;

	public long PointOfSaleID;

	public int RetailerID;
	public int SubRetailerID;

	public int Channel;
	public int SubChannel;

	public int BrandID;
	public int VariantID;
	
	public Date _SaleDate;
}
Sep 14 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Roger:

 The cached list consists of about 500000 items 
 (ExtendedOrderItemDTO) and when we use a search criteria that 
 results in a large amount of items we get a memory leak of 
 about 90MB on each run.

This is interesting. Currently the D GC is conservative, which means that values that happen to look like pointers into the large arrays can keep them alive. Until the GC becomes more precise (and the Summer Of Code has produced something), having 64 bit pointers reduces this problem.

So are you using a 32 bit compilation? Are you able to compile it at 64 bit? I think you are using Windows, and while the 64 bit Windows DMD2 is coming (it already compiles Phobos2), it's not yet usable.

   private static shared ExtendedOrderItemDTO[] _orderItems;
   public static @property ExtendedOrderItemDTO[] OrderItems()
   {
     return cast(ExtendedOrderItemDTO[]) _orderItems;
   }

Generally in D, method/function names start with a lowercase letter, unlike in C#.

What's the purpose of the cast? Generally it's better to minimize casts. Maybe this is enough in your case (not tested), and a bit safer:

return cast()_orderItems;

 	auto result=filter!( (ExtendedOrderItemDTO x) =>
 	  x._SaleDate >= state.StatePeriod.From &&
 	  x._SaleDate <= state.StatePeriod.To &&
 	  (
 	  state.Brand.Level == 0 ||
 	  (state.Brand.Level == 1 && state.Brand.ID == x.BrandID) ||
 	  (state.Brand.Level == 2 && state.Brand.ID == x.VariantID) ||
 	  (state.Brand.Level == 3 && state.Brand.ID == x.ProductID)
 	) &&
 	(
 	  state.Channel.Level == 0 ||
 	  (state.Channel.Level == 1 && state.Channel.ID == x.Channel) ||
 	  (state.Channel.Level == 2 && state.Channel.ID == x.SubChannel)
 	) &&
 	(
 	  state.ProductType == 2 ||
 	  (state.ProductType == 0 && x.IsFMC == true) ||
 	  (state.ProductType == 1 && x.IsFMC == false)	
 	)
 	)(Cache.OrderItems);

 	return array(result);

That's the biggest filtering lambda I have seen so far :o) It's an interesting example of real-world code that I've never seen in Haskell (or D).

In D there is the (nestable) with() construct that sometimes helps reduce the size of similar jungles. Maybe here using UFCS helps readability a bit:

auto result = Cache.OrderItems.filter!((ExtendedOrderItemDTO x) => ... )();

Bye,
bearophile
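
A minimal sketch of the UFCS form, with the predicate split into small named helpers. Item, Criteria, brandMatches, typeMatches and select are all hypothetical, cut-down stand-ins (none of them is from the original code), so treat it as an illustration of the shape rather than a drop-in replacement:

```d
import std.algorithm : filter;
import std.array : array;

// Hypothetical, cut-down stand-in for ExtendedOrderItemDTO.
struct Item
{
    int brandID;
    bool isFMC;
}

// Hypothetical, cut-down stand-in for State.
struct Criteria
{
    int brandLevel;   // 0 = any brand, 1 = match brandID
    int brandID;
    int productType;  // 0 = FMC only, 1 = non-FMC only, 2 = both
}

// One small helper per criterion keeps the lambda itself short.
bool brandMatches(Criteria c, Item x)
{
    return c.brandLevel == 0
        || (c.brandLevel == 1 && c.brandID == x.brandID);
}

bool typeMatches(Criteria c, Item x)
{
    return c.productType == 2
        || (c.productType == 0 && x.isFMC)
        || (c.productType == 1 && !x.isFMC);
}

Item[] select(Item[] items, Criteria c)
{
    // UFCS: items.filter!(...) instead of filter!(...)(items).
    return items
        .filter!(x => brandMatches(c, x) && typeMatches(c, x))
        .array;
}

void main()
{
    auto items = [Item(1, true), Item(2, false), Item(1, false)];
    // Brand 1, any product type: the first and third items match.
    auto hits = select(items, Criteria(1, 1, 2));
    assert(hits.length == 2);
}
```

The helpers take their arguments by value only to keep the sketch short; with a struct as large as ExtendedOrderItemDTO, passing by ref would probably be preferable.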
Sep 14 2012
"Mehrdad" <wfunction hotmail.com> writes:
http://stackoverflow.com/a/8796226/541686
Sep 14 2012
"Roger" <rop411 gmail.com> writes:
Thank you for your feedback. Yes, we are using Win32 on Windows. 
A 64 bit compiler sounds nice! Do you know when it will be 
available?

We tried to manage the memory manually by using: 
GC.free(items.ptr);
where "items" is the large array. It worked when P/Invoked from a 
(non-threaded) Windows application: the memory was released. 
However, in a threaded web application environment it failed.
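
For reference, a minimal sketch of that manual-free pattern, assuming a plain int[] cache and a hypothetical countEven function in place of the real filter. It frees through GC.addrOf, which is a choice made for this sketch (not what our code does), so that the free also works if the slice does not start at the base of its GC block. It does not address the failure in the threaded environment:

```d
import core.memory : GC;
import std.algorithm : filter;
import std.array : array;

// Hypothetical helper: builds a temporary filtered array, uses it,
// and hands the block back to the GC before returning.
size_t countEven(int[] cache)
{
    int[] items = cache.filter!(x => x % 2 == 0).array;

    // Runs on every exit path. GC.addrOf returns the base address of
    // the block (or null), and GC.free(null) is a no-op.
    // `items` must not be used after this point.
    scope (exit) GC.free(GC.addrOf(items.ptr));

    return items.length;
}

void main()
{
    auto cache = new int[](1000);
    foreach (i, ref v; cache)
        v = cast(int) i;

    assert(countEven(cache) == 500);
}
```

Only the temporary result is freed here; the cache itself stays alive, as it does in the DLL.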
Sep 18 2012