Pointer Swizzling: ObjectStore & QuickStore
Pointer swizzling is one of the main system-design contributions of OODBMSs,
and ObjectStore does it in a clever way. QuickStore is a responsible academic
study of the technique, comparing it to the obvious alternative.
- Problem: C++ contains pointers. Persistent C++ stores pointers on disk,
  somehow. At fetch time, the system must fetch the disk objects into memory
  and turn stored refs into valid memory pointers. This is called "swizzling"
  a disk pointer to a (virtual) memory pointer.
- This can be done with hash indices over the buffer pool and special
  compilation: persistent-object refs turn into function calls that do hash
  lookups to map disk ptr to memory pointer, so each time you follow a
  persistent ptr in C++ you call a function and traverse a hash index.
- ObjectStore & QuickStore take advantage of VM protection to do this better.
- Persistent pointers are stored as normal C++ memory refs -- even on disk!
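For contrast with the VM approach, the software alternative described above (a function call plus a hash lookup on every persistent dereference) might look roughly like the sketch below. All names here (DiskPtr, BufferPool, PRef, Part) are hypothetical and are not taken from ObjectStore, QuickStore, or any real OODBMS; a real system would also fault the page in on a miss rather than return null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical on-disk address of an object: (page number, slot on page).
struct DiskPtr {
    std::uint32_t page;
    std::uint32_t slot;
    bool operator==(const DiskPtr& o) const { return page == o.page && slot == o.slot; }
};

struct DiskPtrHash {
    std::size_t operator()(const DiskPtr& d) const {
        return std::hash<std::uint64_t>{}((std::uint64_t{d.page} << 32) | d.slot);
    }
};

// Stand-in for the hash index over the buffer pool: disk pointer -> memory address.
class BufferPool {
public:
    void install(DiskPtr d, void* addr) { resident_[d] = addr; }

    // Returns the in-memory address, or nullptr if the object is not resident.
    void* lookup(DiskPtr d) {
        ++lookups;  // count the per-dereference cost
        auto it = resident_.find(d);
        return it == resident_.end() ? nullptr : it->second;
    }

    long lookups = 0;

private:
    std::unordered_map<DiskPtr, void*, DiskPtrHash> resident_;
};

// What "special compilation" could turn a persistent T* into:
// every -> or * goes through a function call and a hash probe.
template <typename T>
class PRef {
public:
    PRef(BufferPool& pool, DiskPtr d) : pool_(&pool), disk_(d) {}
    T* operator->() const { return static_cast<T*>(pool_->lookup(disk_)); }
    T& operator*() const { return *operator->(); }

private:
    BufferPool* pool_;
    DiskPtr disk_;
};

struct Part { int id; };
```

Each use of a PRef pays for one lookup, which is exactly the per-traversal overhead the VM-based scheme is meant to remove.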
Mapping
           (static)               (dynamic)
disk page -----------> VM address -----------> physical mem address (buffer pool)
- Buffer pool is a file; virtual memory is mapped to it via mmap()
- mapping from VM to physical mem is done by the OS
- mapping from disk page to VM start is done with a QuickStore main-mem table
  of entries:
  (virtual address range, physical disk location, pointer in memory, flags)
  - flags include access type (R/W), whether page is X-locked, whether page
    has been read in already
  - binary tree index over virtual address ranges
  - hashtable index over physical disk location
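A minimal sketch of such a table, assuming one entry per mapped page: std::map stands in for the binary tree over virtual address ranges and std::unordered_map for the hashtable over disk locations. The field and class names (PageEntry, MappingTable, disk_loc, and so on) are illustrative assumptions, not QuickStore's actual definitions.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <unordered_map>

// One row: (virtual address range, physical disk location, pointer in memory, flags).
struct PageEntry {
    std::uintptr_t vm_start;  // first virtual address of the range
    std::uintptr_t vm_end;    // one past the last virtual address
    std::uint64_t disk_loc;   // physical disk location (hypothetical page id)
    void* in_memory;          // page in the buffer pool, or nullptr
    bool writable;            // access type: false = R, true = W
    bool x_locked;            // page is X-locked
    bool read_in;             // page has been read in already
};

class MappingTable {
public:
    // Rejects empty ranges, overlapping ranges, and duplicate disk locations.
    bool insert(const PageEntry& e) {
        if (e.vm_start >= e.vm_end || by_disk_.count(e.disk_loc)) return false;
        auto next = by_vm_.lower_bound(e.vm_start);
        if (next != by_vm_.end() && next->first < e.vm_end) return false;
        if (next != by_vm_.begin() && std::prev(next)->second.vm_end > e.vm_start) return false;
        by_vm_[e.vm_start] = e;
        by_disk_[e.disk_loc] = e.vm_start;
        return true;
    }

    // Tree lookup: which page's range contains this virtual address?
    PageEntry* find_by_address(std::uintptr_t addr) {
        auto it = by_vm_.upper_bound(addr);  // first range starting after addr
        if (it == by_vm_.begin()) return nullptr;
        --it;
        return addr < it->second.vm_end ? &it->second : nullptr;
    }

    // Hash lookup: is this disk page already mapped, and where?
    PageEntry* find_by_disk(std::uint64_t disk_loc) {
        auto it = by_disk_.find(disk_loc);
        return it == by_disk_.end() ? nullptr : &by_vm_.find(it->second)->second;
    }

private:
    std::map<std::uintptr_t, PageEntry> by_vm_;                  // keyed by vm_start
    std::unordered_map<std::uint64_t, std::uintptr_t> by_disk_;  // disk_loc -> vm_start
};
```

The address lookup is what a fault handler would need (faulting address to page), while the disk lookup answers "is this page already mapped?" during swizzling.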
Disk page contains mapping object, and each persistent object has a bitmap
object
- meta-object is a header on the page that points to the mapping object and
  the bitmap object
- mapping object is a table of (virtual addr, disk addr) pairs for all
  pointers from this page to elsewhere
- bitmap object is a table of locations of pointers within the objects on
  the page (obtained via type analysis)
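As a rough illustration of how these three pieces fit together, the sketch below models a page body as an array of pointer-sized words, with one bitmap bit per word. This layout is an assumption made for the example: the names (PageMeta, pointer_slots, target_of) are hypothetical, the mapping and bitmap objects are embedded in the header rather than pointed to, and each mapping entry is assumed to cover one fixed-size frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One row of the mapping object: a page this page points to.
struct MappingEntry {
    std::uintptr_t vm_addr;   // VM frame the target page was last mapped at
    std::uint64_t disk_addr;  // where the target page lives on disk
};

// Stand-in for the meta-object: here it simply owns both objects.
struct PageMeta {
    std::vector<MappingEntry> mapping;  // mapping object
    std::vector<bool> pointer_bitmap;   // bitmap object: bit i set => word i is a pointer
};

struct Page {
    PageMeta meta;
    std::vector<std::uintptr_t> words;  // page body, one entry per machine word
};

// Enumerate the word offsets that hold pointers, as recorded by the bitmap.
std::vector<std::size_t> pointer_slots(const Page& p) {
    std::vector<std::size_t> slots;
    for (std::size_t i = 0; i < p.words.size() && i < p.meta.pointer_bitmap.size(); ++i) {
        if (p.meta.pointer_bitmap[i]) slots.push_back(i);
    }
    return slots;
}

// Find which mapped page a stored pointer refers to, assuming fixed-size frames.
const MappingEntry* target_of(const Page& p, std::size_t slot, std::uintptr_t frame_size) {
    if (slot >= p.words.size() || slot >= p.meta.pointer_bitmap.size()) return nullptr;
    if (!p.meta.pointer_bitmap[slot]) return nullptr;  // not a pointer
    for (const MappingEntry& e : p.meta.mapping) {
        if (p.words[slot] >= e.vm_addr && p.words[slot] - e.vm_addr < frame_size) return &e;
    }
    return nullptr;
}
```

The bitmap tells the system where the pointers are without understanding the C++ types at run time; the mapping object tells it what each pointer value means on disk.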
Pointer Swizzling
- VM page is access-protected before first read
- Upon reading a page, mapping object is fetched
- for each entry in mapping object:
  - check main-mem table to see if the entry's page is already mapped
  - if not, map now:
    - try to allocate the old VM frame if possible
    - if not possible, allocate a new VM frame and swizzle the pointers that
      refer to the old frame (found via bitmap object)
    - access-protect the frame that's allocated
- if any of the pointers in the mapping object were changed (placed elsewhere
  in VM), go to bitmap object and swizzle all pointers on the page.
  Note that swizzling may be necessary even if this read did not itself
  cause a relocation in VM: the target page may already be mapped at a
  different address.
- Subsequent traversals of that pointer work as fast as normal C++ refs;
  database is out of the way!
- Mapping of VM frame to page must remain valid for duration of xact.
- When a buffer pool page is evicted, the VM frame is protected again to
  initiate a fetch on the next ref.
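The fault-time steps above can be sketched as a small simulation. It only does the bookkeeping: no real mmap()/mprotect() calls are made, "access-protected" is just a flag, and the frame size, the names (AddressSpace, swizzle_on_fault), and the policy for picking a fresh frame are all assumptions for illustration rather than QuickStore's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <unordered_map>
#include <vector>

constexpr std::uintptr_t kFrame = 0x1000;  // assumed VM frame size

struct MappingEntry { std::uintptr_t vm_addr; std::uint64_t disk_addr; };

struct Page {
    std::vector<MappingEntry> mapping;  // mapping object
    std::vector<bool> is_pointer;       // bitmap object, one bit per word
    std::vector<std::uintptr_t> words;  // page body
};

struct Frame { std::uint64_t disk_addr; bool access_protected; };

// Stand-in for the main-mem table plus VM frame allocation.
class AddressSpace {
public:
    // Returns the VM frame assigned to `disk`, preferring its old frame.
    std::uintptr_t map_page(std::uint64_t disk, std::uintptr_t preferred) {
        auto known = by_disk_.find(disk);
        if (known != by_disk_.end()) return known->second;  // already mapped
        std::uintptr_t frame = preferred;
        if (frames_.count(frame)) {  // old frame is taken: pick a new one
            frame = next_free_;
            while (frames_.count(frame)) frame += kFrame;
            next_free_ = frame + kFrame;
        }
        frames_[frame] = Frame{disk, true};  // protected until first touched
        by_disk_[disk] = frame;
        return frame;
    }

    bool is_protected(std::uintptr_t frame) const {
        auto it = frames_.find(frame);
        return it != frames_.end() && it->second.access_protected;
    }

private:
    std::map<std::uintptr_t, Frame> frames_;
    std::unordered_map<std::uint64_t, std::uintptr_t> by_disk_;
    std::uintptr_t next_free_ = 0x100000;
};

// Run when `page` is first read. Returns the number of pointers rewritten.
std::size_t swizzle_on_fault(Page& page, AddressSpace& vm) {
    std::map<std::uintptr_t, std::uintptr_t> moved;  // old frame -> new frame
    for (MappingEntry& e : page.mapping) {
        std::uintptr_t now = vm.map_page(e.disk_addr, e.vm_addr);
        if (now != e.vm_addr) {
            moved[e.vm_addr] = now;
            e.vm_addr = now;
        }
    }
    if (moved.empty()) return 0;  // common case: stored pointers are already valid

    std::size_t rewritten = 0;
    for (std::size_t i = 0; i < page.words.size() && i < page.is_pointer.size(); ++i) {
        if (!page.is_pointer[i]) continue;
        std::uintptr_t offset = page.words[i] % kFrame;
        auto it = moved.find(page.words[i] - offset);
        if (it == moved.end()) continue;
        page.words[i] = it->second + offset;  // same offset, new frame
        ++rewritten;
    }
    return rewritten;
}
```

In the common case every target page gets its old frame back, `moved` stays empty, and the page is usable without touching a single pointer. Rewriting happens only on a collision, or when a target was already mapped elsewhere by an earlier fault.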
Example: linked list.
Paper gives a brief overview of OO7 benchmark (the main OODBMS benchmark)
Pros:
- Pointer traversal is as fast as it ever was
- Allows "legacy" C++ code to manipulate persistent data
Cons:
- Places some limits on amount of data one xact can access at a time, or
  from one page.
- Initial faulting cost higher; shows up in access-once workloads.
Transactional fallout:
- Concurrency control via page-level locking
- Logging of page diffs
- Client writes go to log by commit time, then (lazily) to DB
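To make "logging of page diffs" concrete, here is a minimal sketch that compares a before-image and an after-image of a page and records only the changed byte runs. The notes do not describe the actual diff format, so the DiffRun record and the byte-granularity comparison are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One log record fragment: new bytes starting at `offset` within the page.
struct DiffRun {
    std::size_t offset;
    std::vector<std::uint8_t> bytes;
};

// Compare the page as first read (before) with the page at commit (after).
std::vector<DiffRun> diff_page(const std::vector<std::uint8_t>& before,
                               const std::vector<std::uint8_t>& after) {
    assert(before.size() == after.size());
    std::vector<DiffRun> runs;
    std::size_t i = 0;
    while (i < after.size()) {
        if (before[i] == after[i]) { ++i; continue; }
        DiffRun run{i, {}};
        while (i < after.size() && before[i] != after[i]) {
            run.bytes.push_back(after[i++]);  // extend the run of changed bytes
        }
        runs.push_back(std::move(run));
    }
    return runs;
}

// Redo: apply logged runs to an old copy of the page.
void apply_diff(std::vector<std::uint8_t>& page, const std::vector<DiffRun>& runs) {
    for (const DiffRun& r : runs) {
        for (std::size_t k = 0; k < r.bytes.size(); ++k) page[r.offset + k] = r.bytes[k];
    }
}
```

Because updates go through ordinary C++ stores, the system never sees individual writes; diffing the page against a saved copy is one way to recover what changed without logging whole pages.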