digitalmars.D - interest in writing ints (to BitArray)?

  I've always been interested in trying to get more compression 
out of data. Whether a given approach yields extra general-purpose 
compression, though, depends on the circumstances.

  Regardless, a while back in 2013 I wrote an appended function, 
part of my BitArray extensions, for reading/writing ints. With 
these you specify the range of the data (min/max) and unnecessary 
bits are dropped when possible. Great for compacting, not so great 
for totally random access.


Crop: This is the simplest. If you know every value is less than 
16, then simply reducing the output to 4 bits per value lets you 
store your values safely.
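
A minimal sketch of the Crop idea, in Python rather than D and not taken from the BitArray extensions themselves. The names crop_width, crop_encode and crop_decode are made up for illustration, ranges are assumed to start at 0, and bits are shown as a '0'/'1' string instead of being written to a BitArray.

```python
def crop_width(max_value: int) -> int:
    """Number of bits needed to hold any value in 0..max_value."""
    return max_value.bit_length()


def crop_encode(value: int, max_value: int) -> str:
    """Fixed-width encoding, most significant bit first."""
    if not 0 <= value <= max_value:
        raise ValueError("value outside 0..max_value")
    width = crop_width(max_value)
    # A range holding a single value needs no bits at all.
    return format(value, "b").zfill(width) if width else ""


def crop_decode(bits: str) -> int:
    """Inverse of crop_encode; an empty string decodes to 0."""
    return int(bits, 2) if bits else 0


if __name__ == "__main__":
    for v in (0, 5, 15):
        print(v, "->", crop_encode(v, 15))
```

Every value costs the same number of bits here, which is what the later variants try to improve on.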

Orig: The original concept: encode from the most significant bit 
going down, and skip any bit that can't possibly be 1. Say 10 is 
the max (1010): if the uppermost bit of a number is 1, the 
following bit has to be 0, so it is never written.

7 -> 0111
8 -> 100
9 -> 101
10 -> 11
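
One way to read the Orig rule, sketched in Python (not the original D code). The function names and the `tight` flag are my own; the assumption is that a bit is skipped exactly when all higher bits match the max and the max has a 0 at that position. Under that assumption the sketch reproduces the four examples above.

```python
def orig_encode(value: int, max_value: int) -> str:
    """MSB-first encoding that skips bits which cannot be 1."""
    if not 0 <= value <= max_value:
        raise ValueError("value outside 0..max_value")
    bits = []
    tight = True  # True while the bits so far equal max_value's prefix
    for pos in range(max_value.bit_length() - 1, -1, -1):
        max_bit = (max_value >> pos) & 1
        bit = (value >> pos) & 1
        if tight and max_bit == 0:
            continue  # a 1 here would exceed max_value, so 0 is implied
        bits.append(str(bit))
        if tight and bit < max_bit:
            tight = False  # now strictly below max; all later bits are free
    return "".join(bits)


def orig_decode(bits: str, max_value: int) -> int:
    """Inverse of orig_encode for a single encoded value."""
    stream = iter(bits)
    value = 0
    tight = True
    for pos in range(max_value.bit_length() - 1, -1, -1):
        max_bit = (max_value >> pos) & 1
        if tight and max_bit == 0:
            continue  # skipped by the encoder, implied 0
        bit = int(next(stream))
        value |= bit << pos
        if tight and bit < max_bit:
            tight = False
    return value


if __name__ == "__main__":
    for v in (7, 8, 9, 10):
        print(v, "->", orig_encode(v, 10))
```

The decoder walks the same positions as the encoder, so it knows which bits were skipped without any length prefix.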

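V2: An experiment going from the least significant bit to the 
most significant. If setting an upper bit would make the number 
too large (exceeding the max), that bit and all following bits 
are dropped. This results in an encoding at least as good as, if 
not better than, the one before. The examples below use the same 
max of 10, with bits listed least significant first.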

2 -> 0100
3 -> 110
4 -> 001
5 -> 101
6 -> 011
7 -> 111
8 -> 0001
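
A Python sketch of how I read the V2 rule (not the original D code; v2_encode and v2_decode are hypothetical names). The assumption is that writing stops at the first bit position where the bits already written plus that bit's weight would exceed the max. With a max of 10 this matches the examples above.

```python
def v2_encode(value: int, max_value: int) -> str:
    """LSB-first encoding; stops once a set bit would exceed max_value."""
    if not 0 <= value <= max_value:
        raise ValueError("value outside 0..max_value")
    bits = []
    low = 0  # value formed by the bits written so far
    pos = 0
    while low + (1 << pos) <= max_value:
        bit = (value >> pos) & 1
        bits.append(str(bit))
        low |= bit << pos
        pos += 1
    # Any remaining higher bit would push the value past max_value,
    # so it must be 0 and is not written.
    return "".join(bits)


def v2_decode(bits: str, max_value: int) -> int:
    """Inverse of v2_encode for a single encoded value."""
    stream = iter(bits)
    value = 0
    pos = 0
    while value + (1 << pos) <= max_value:
        value |= int(next(stream)) << pos
        pos += 1
    return value


if __name__ == "__main__":
    for v in range(2, 9):
        print(v, "->", v2_encode(v, 10))
```

Because the stop condition depends only on the bits already seen and the max, the decoder can evaluate it too and needs no terminator.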

This table gives the total number of bits written when every 
value in a range is encoded once with its range (Comb is the 
number of combinations). Ranges with 0 or 1 combinations are 
included: they won't write anything, but reading can still return 
the one expected value, offering 100% compression in these cases.

Comb    Crop    Orig    V2
0/1:    0       0       0
2:      2       2       2
3:      6       5       5
4:      8       8       8
5:      15      13      12
6:      18      16      16
7:      21      20      20
8:      24      24      24
9:      36      33      29
10:     40      36      34
11:     44      40      39
12:     48      44      44
13:     52      50      49
14:     56      54      54
15:     60      59      59
16:     64      64      64
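
The totals above can be recomputed with a short Python sketch. It only counts bits instead of writing them, and the three cost functions follow my reading of Crop, Orig and V2 as described in this post (ranges starting at 0); all names are made up for illustration.

```python
def crop_cost(value: int, max_value: int) -> int:
    """Crop: every value costs the full width of the range."""
    return max_value.bit_length()


def orig_cost(value: int, max_value: int) -> int:
    """Orig: MSB first, skipping bits that cannot be 1."""
    cost, tight = 0, True
    for pos in range(max_value.bit_length() - 1, -1, -1):
        if tight and not (max_value >> pos) & 1:
            continue  # implied 0, nothing written
        cost += 1
        if tight and not (value >> pos) & 1:
            tight = False  # wrote 0 where max has 1: no longer bounded
    return cost


def v2_cost(value: int, max_value: int) -> int:
    """V2: LSB first, stopping once a set bit would exceed the max."""
    pos = 0
    while (value & ((1 << pos) - 1)) + (1 << pos) <= max_value:
        pos += 1
    return pos


def total_bits(cost, combinations: int) -> int:
    """Bits used to encode each value of 0..combinations-1 once."""
    max_value = max(combinations - 1, 0)
    return sum(cost(v, max_value) for v in range(combinations))


def table(max_comb: int = 16):
    """Rows of (combinations, crop total, orig total, v2 total)."""
    return [
        (n, total_bits(crop_cost, n), total_bits(orig_cost, n),
         total_bits(v2_cost, n))
        for n in range(1, max_comb + 1)
    ]


if __name__ == "__main__":
    print("Comb\tCrop\tOrig\tV2")
    for n, crop, orig, v2 in table():
        print(f"{n}:\t{crop}\t{orig}\t{v2}")
```
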
Nov 20 2016