dysfun@treehouse.systems ("gaytabase") wrote:
so what is it we're actually maintaining here? it's keys and refcounts, but it's essentially a sparse mapping over a subset of the transaction id space.
for sparse data, things get annoying and you need to start allocating quite readily. you have essentially two options:
- tree
- list
now, on the basis that our transaction ids are nondecreasing, i argue a list makes more sense. in particular, we require quick access to the last item in the list to potentially increment the highest transaction's refcount (in case and beyond there, shrug we will have to do a certain amount of searching.
however, we want to keep maintainance to a minimum. otherwise we replace an expensive operation with a merely less expensive operation.
so a doubly linked list is out. what can we do? an array list? well, possibly if we add chunking.
chunking enables you to amortise the cost of malloc/free. this is going to be a huge thing for getting the average cost down (albeit not the worst case cost). in this case it serves the extra purpose of also limiting a rearrangement within a chunk to a small number of items. double win!