diff options
| author | Akinori Ito <aito@eie.yz.yamagata-u.ac.jp> | 2001-11-15 00:32:13 +0000 | 
|---|---|---|
| committer | Akinori Ito <aito@eie.yz.yamagata-u.ac.jp> | 2001-11-15 00:32:13 +0000 | 
| commit | 85da7ee692072c643939e9f4b24fbd1e74e64e70 (patch) | |
| tree | 9fc63298cf968fa560a9e3cf9b6c84516032fca8 /gc/doc/tree.html | |
| parent | Updates from 0.2.1 into 0.2.1-inu-1.5 (diff) | |
| download | w3m-85da7ee692072c643939e9f4b24fbd1e74e64e70.tar.gz w3m-85da7ee692072c643939e9f4b24fbd1e74e64e70.zip | |
Update to w3m-0.2.1-inu-1.6.
Diffstat (limited to '')
| -rw-r--r-- | gc/doc/tree.html | 198 | 
1 files changed, 198 insertions, 0 deletions
| diff --git a/gc/doc/tree.html b/gc/doc/tree.html new file mode 100644 index 0000000..89c515d --- /dev/null +++ b/gc/doc/tree.html @@ -0,0 +1,198 @@ +<HTML> +<HEAD> +    <TITLE>  Two-Level Tree Structure for Fast Pointer Lookup</TITLE> +    <AUTHOR> Hans-J. Boehm, Silicon Graphics</author> +</HEAD> +<BODY> +<H1>Two-Level Tree Structure for Fast Pointer Lookup</h1> +<P> +The conservative garbage collector described +<A HREF="gc.html">here</a> uses a 2-level tree +data structure to aid in fast pointer identification. +This data structure is described in a bit more detail here, since +<OL> +<LI> Variations of the data structure are more generally useful. +<LI> It appears to be hard to understand by reading the code. +<LI> Some other collectors appear to use inferior data structures to +solve the same problem. +<LI> It is central to fast collector operation. +</ol> +A candidate pointer is divided into three sections, the <I>high</i>, +<I>middle</i>, and <I>low</i> bits.  The exact division between these +three groups of bits is dependent on the detailed collector configuration. +<P> +The high and middle bits are used to look up an entry in the table described +here.  The resulting table entry consists of either a block descriptor +(<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>) +identifying the layout of objects in the block, or an indication that this +address range corresponds to the middle of a large block, together with a +hint for locating the actual block descriptor.  Such a hint consist +of a displacement that can be subtracted from the middle bits of the candidate +pointer without leaving the object. +<P> +In either case, the block descriptor (<TT>struct hblkhdr</tt>) +refers to a table of object starting addresses (the <TT>hb_map</tt> field). +The starting address table is indexed by the low bits if the candidate pointer. +The resulting entry contains a displacement to the beginning of the object, +or an indication that this cannot be a valid object pointer. +(If all interior pointer are recognized, pointers into large objects +are handled specially, as appropriate.) + +<H2>The Tree</h2> +<P> +The rest of this discussion focuses on the two level data structure +used to map the high and middle bits to the block descriptor. +<P> +The high bits are used as an index into the <TT>GC_top_index</tt> (really +<TT>GC_arrays._top_index</tt>) array.  Each entry points to a +<TT>bottom_index</tt> data structure.  This structure in turn consists +mostly of an array <TT>index</tt> indexed by the middle bits of +the candidate pointer.  The <TT>index</tt> array contains the actual +<TT>hdr</tt> pointers.  +<P> +Thus a pointer lookup consists primarily of a handful of memory references, +and can be quite fast: +<OL> +<LI> The appropriate <TT>bottom_index</tt> pointer is looked up in +<TT>GC_top_index</tt>, based on the high bits of the candidate pointer. +<LI> The appropriate <TT>hdr</tt> pointer is looked up in the +<TT>bottom_index</tt> structure, based on the middle bits. +<LI> The block layout map pointer is retrieved from the <TT>hdr</tt> +structure.  (This memory reference is necessary since we try to share +block layout maps.) +<LI> The displacement to the beginning of the object is retrieved from the +above map. +</ol> +<P> +In order to conserve space, not all <TT>GC_top_index</tt> entries in fact +point to distinct <TT>bottom_index</tt> structures.  If no address with +the corresponding high bits is part of the heap, then the entry points +to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting +only of NULL <TT>hdr</tt> pointers. +<P> +<TT>Bottom_index</tt> structures contain slightly more information than +just <TT>hdr</tt> pointers.  The <TT>asc_link</tt> field is used to link +all <TT>bottom_index</tt> structures in ascending order for fast traversal. +This list is pointed to be <TT>GC_all_bottom_indices</tt>. +It is maintained with the aid of <TT>key</tt> field that contains the +high bits corresponding to the <TT>bottom_index</tt>. + +<H2>64 bit addresses</h2> +<P> +In the case of 64 bit addresses, this picture is complicated slightly +by the fact that one of the index structures would have to be huge to +cover the entire address space with a two level tree.  We deal with this +by turning <TT>GC_top_index</tt> into a chained hash table, instead of +a simple array.  This adds a <TT>hash_link</tt> field to the +<TT>bottom_index</tt> structure. +<P> +The "hash function" consists of dropping the high bits.  This is cheap to +compute, and guarantees that there will be no collisions if the heap +is contiguous and not excessively large. + +<H2>A picture</h2> +<P> +The following is an ASCII diagram of the data structure. +This was contributed by Dave Barrett several years ago. +<PRE> + +		Data Structure used by GC_base in gc3.7: +			      21-Apr-94 +			  +			 + + +    63                  LOG_TOP_SZ[11]  LOG_BOTTOM_SZ[10]   LOG_HBLKSIZE[13] +   +------------------+----------------+------------------+------------------+ + p:|                  |   TL_HASH(hi)  |                  |   HBLKDISPL(p)   | +   +------------------+----------------+------------------+------------------+ +    \-----------------------HBLKPTR(p)-------------------/ +    \------------hi-------------------/  +                      \______ ________/ \________ _______/ \________ _______/ +                             V                   V                  V +                             |                   |                  | +           GC_top_index[]    |                   |                  |  + ---      +--------------+   |                   |                  |   +  ^       |              |   |                   |                  |    +  |       |              |   |                   |                  |    + TOP      +--------------+<--+                   |                  |       + _SZ   +-<|      []      | *                     |                  |      +(items)|  +--------------+  if 0 < bi< HBLKSIZE  |                  |     +  |    |  |              | then large object     |                  |     +  |    |  |              | starts at the bi'th   |                  |     +  v    |  |              | HBLK before p.        |             i    |     + ---   |  +--------------+                       |          (word-  |     +       v                                         |         aligned) |     +   bi= |GET_BI(p){->hash_link}->key==hi          |                  |    +       v                                         |                  |     +       |   (bottom_index)  \ scratch_alloc'd     |                  |     +       |   ( struct  bi )  / by get_index()      |                  |     + ---   +->+--------------+                       |                  |     +  ^       |              |                       |                  | +  ^       |              |                       |                  | + BOTTOM   |              |   ha=GET_HDR_ADDR(p)  |                  | +_SZ(items)+--------------+<----------------------+          +-------+ +  |   +--<|   index[]    |                                  |          +  |   |   +--------------+                      GC_obj_map: v               +  |   |   |              |              from      / +-+-+-----+-+-+-+-+  ---  +  v   |   |              |              GC_add   < 0| | |     | | | | |   ^   + ---  |   +--------------+             _map_entry \ +-+-+-----+-+-+-+-+   |   +      |   |   asc_link   |                          +-+-+-----+-+-+-+-+ MAXOBJSZ +      |   +--------------+                      +-->| | |  j  | | | | |  +1    +      |   |     key      |                      |   +-+-+-----+-+-+-+-+   |   +      |   +--------------+                      |   +-+-+-----+-+-+-+-+   |  +      |   |  hash_link   |                      |   | | |     | | | | |   v  +      |   +--------------+                      |   +-+-+-----+-+-+-+-+  --- +      |                                         |   |<--MAX_OFFSET--->|    +      |                                         |         (bytes) +HDR(p)| GC_find_header(p)                       |   |<--MAP_ENTRIES-->|  +      |                           \ from        |    =HBLKSIZE/WORDSZ    +      |    (hdr) (struct hblkhdr) / alloc_hdr() |    (1024 on Alpha) +      +-->+----------------------+              |    (8/16 bits each) +GET_HDR(p)| word   hb_sz (words) |              |           +          +----------------------+              |      +          | struct hblk *hb_next |              | +          +----------------------+              |        +          |mark_proc hb_mark_proc|              | +          +----------------------+              | +          | char * hb_map        |>-------------+ +          +----------------------+            +          | ushort hb_obj_kind   |            +          +----------------------+            +          |   hb_last_reclaimed  |            + ---      +----------------------+                 +  ^       |                      | + MARK_BITS|       hb_marks[]     | *if hdr is free, hb_sz + DISCARD_WORDS +_SZ(words)|                      |  is the size of a heap chunk (struct hblk) +  v       |                      |  of at least MININCR*HBLKSIZE bytes (below), + ---      +----------------------+  otherwise, size of each object in chunk. + +Dynamic data structures above are interleaved throughout the heap in blocks of  +size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be +freed; free lists are used (e.g. alloc_hdr).  HBLK's below are collected. + +	      (struct hblk)       + ---      +----------------------+ < HBLKSIZE ---         ---          DISCARD_ +  ^       |garbage[DISCARD_WORDS]|   aligned   ^           ^ HDR_BYTES WORDS +  |       |                      |             |           v (bytes)   (words) +  |       +-----hb_body----------+ < WORDSZ    |          ---   ---    +  |       |                      |   aligned   |           ^     ^ +  |       |      Object 0        |             |           hb_sz | +  |       |                      |           i |(word-    (words)| +  |       |                      |      (bytes)|aligned)   v     | +  |       + - - - - - - - - - - -+ ---         |          ---    | +  |       |                      |  ^          |           ^     | +  n *     |                      |  j (words)  |          hb_sz BODY_SZ  + HBLKSIZE |      Object 1        |  v          v           |   (words) + (bytes)  |                      |---------------          v   MAX_OFFSET +  |       + - - - - - - - - - - -+                        ---  (bytes) +  |       |                      | !All_INTERIOR_PTRS      ^     | +  |       |                      | sets j only for       hb_sz   | +  |       |      Object N        | valid object offsets.   |     | +  v       |                      | All objects WORDSZ      v     v + ---      +----------------------+ aligned.               ---   --- + +DISCARD_WORDS is normally zero.  Indeed the collector has not been tested +with another value in ages. +</pre> +</body> | 
