| Precisely, this is a collector for a group of local heaps,
in which intra-heap pointers are represented simply by their addresses
while inter-heap (remote) pointers by a local pointer to special stub
objects. A stub represents a remote reference to the corresponding
remote object. The collector reclaims memory no longer reachable
from any processor's root by a chain of local/remote pointers.
You can use this library not only with your stand-alone C/C++ programs
but to construct more high-level distributed systems. From another
point of view, this provides a set of basic functions of distributed object
systems with a distributed garbage collector. One can easily construct
a distributed object system using this as a foundation of the system. Higher
abstractions such as remote method invocation or object migration can be
constructed on top of this library. We are working on a sample implementation
of a distributed object system (a variant of C++) using this library
as a major part of its runtime system.
Table of Contents:
Basic Examples
see example.c for now.
Library Interfaces
Clients should include dgc.h, which
includes Boehm GC's gc.h from within.
Codes must be linked against the distributed GC library (libdgc.a)
and some message-passing library. We provide some sample (but hopefully
useful) implementations of message-passing routines on MPI message-passing
standard, SCore message-passing
library, and SMP using shared-memory. The most important protocol
to use this library is that clients must call DGC_polling
periodically for distributed garbage collection.
Below we describe interfaces of our distributed GC library. The interfaces
are split into four groups:
Note: DGC_PE type in the following tables
are defined as an integer type. Balls are:
 |
indicates an important interface. |
 |
indicates a less important interface. |
 |
indicates that you must implement the function by yourself. |
The folowing interfaces are to create remote references
and distributed ojects. A public object, which can be referenced
by remote processors, must be allocated by GC_malloc*.
A remote reference to a public object can be created through the following
steps:
- Prepare a uid of DGC_uid
type.
- Export an object through DGC_export.
DGC_export puts internal information of DGC
into uid.
- Send uid to the remote processor where
you want to create the remote reference to the object.
- On the remote processor, pass received uid
to DGC_import. DGC_import
returns a pointer to a special stub object, which can be distinguished
from local objects by DGC_is_local.
uid becomes unavailable once it is passed
to DGC_import. Dereferencing a remote object
from a stub goes through exactly the same process: prepare a uid,
export the stub, send the uid to the remote
processor where the object exists (which can be obtained by DGC_get_home),
and import the uid on the processor.
Then the local pointer to the object is returned by DGC_import.
Distributed Object Interfaces
 |
void *GC_malloc(int size); |
allocates size bytes of memory.
An object allocated by this will be automatically reclaimed after it becomes
garbage. |
 |
void *GC_malloc_atomic(int size); |
Ditto. But the allocated memory contains no pointers. |
 |
void *GC_malloc_uncollectable(int
size); |
Ditto. But the allocated memory is retained until explicitly
reclaimed by GC_free. |
 |
int DGC_export(void *obj,
DGC_uid *uid); |
sets up uid to pass a reference to obj
to another processor. obj should be
one of NULL, returned value of GC_malloc*,
or returned value of DGC_import. The
returned uid must be passed to DGC_import
only once. Upon successful, it returns 0; otherwise (when the exporting
obj is an obsolete stub) it returns -1. This
is the most generic export function. |
 |
int DGC_export_many(void *obj,
DGC_uid *uid, int n); |
Ditto. But the returned uid must be passed
to DGC_import exactly n
times. |
 |
int DGC_export_stub(DGC_stub *stub,
DGC_uid *uid); |
A faster version of DGC_export. obj
must be a pointer to a stub. |
 |
int DGC_export_stub_many(DGC_stub
*stub, DGC_uid *uid,
int n); |
Ditto. But the returned uid must be passed
to DGC_import exactly n
times. |
 |
int DGC_export_local(void *lobj,
DGC_uid *uid); |
Another variant of DGC_export. It always
treats lobj as a local object, even when lobj
is a pointer to a stub. This is useful when you create netsted or indirect
remote references. Unlike DGC_export, this
does not accept NULL as lobj. |
 |
int DGC_export_local_many(void *lobj,
DGC_uid *uid, int n); |
Ditto. But the returned uid must be passed
to DGC_import exactly n
times. |
 |
void *DGC_import(const DGC_uid *uid); |
returns a pointer to either a local object or
a stub. uid must have been set up by
DGC_export. This returns -1 when uid
points an invalid scion.
Notice: Ojects that have been exported
but not imported yet are guaranteed to be alive (i.e., never be reclaimed). |
 |
int DGC_is_local(void *obj); |
returns 1 if obj
is a local object; otherwise (when obj
is a stub) returns 0. obj must
be a return value of GC_malloc* or DGC_import.
Note: This returns 1 if obj
is NULL. |
 |
DGC_PE DGC_get_home(void *obj); |
returns the ID of PE where obj exists.
This is a generic (but slow) function to get an object's home PE. |
 |
DGC_PE DGC_get_stub_home(DGC_stub *stub); |
returns the ID of remote PE where the object referenced by stub
exists. stub should be a pointer to
a stub, not a pointer to a local object. |
 |
DGC_PE DGC_get_my_ID(void); |
returns the ID number of the PE where this is invoked. |
 |
int DGC_get_num_PE(void); |
returns the current number of PEs in the entire system. |
 |
int DGC_stub_is_valid(DGC_stub *stub); |
checks whether stub is a valid stub or
an obsolete stub. A stub becomes obsolete when the PE where the object
referenced by stub exists is removed.
It returns 1 if stub is valid; otherwise it
returns 0. |
 |
int DGC_delete(DGC_stub *stub); |
forces removing a stub from the exit table. stub
must be a pointer to a stub. stub is
then automatically freed. It returns 0 upon successful; otherwise it returns
-1.
WARNING: This may break collectability of
objects in some algorithms. You should always use this together with DGC_unexport. |
 |
int DGC_unexport(void *obj); |
forces removing an object from the entry table. obj
must be a pointer to a local object. obj will
NOT be freed. It returns 0 upon successful; otherwise it returns -1.
WARNING: Invalid use of DGC_unexport
will cause a runtime error. You must guarantee that no remote stubs no
longer exist when you call this. |
The following interfaces are to perform distributed garbage
collection. Refer to Boehm GC's manual on lobal GC, especially about
GC_gcollect and GC_free_space_divisor.
The most important function is DGC_polling,
which you must periodically call in your programs. It checks and
processes messages for distributed garbage collection. As described
later in this document, when you use our standard message-passing library,
implicit polling of GC messages is done in sending and receiving user-level
messages; thus you can save calling DGC_polling
as long as communicating periodically.
Distributed GC Interfaces
 |
int DGC_GC_init(int *argcp,
char **argv); |
initializes DGC module. You must call this before calling any DGC functions
(calling Boehm GC functions is OK). You may not call this directly since
this does not set ID of PE, # of PEs, and statuses of PEs. Insted, you
can use DGC_init provided by all of our standard
message-passing modules, which calls DGC_GC_init
from within. So you don't need to call this when using our message-passing
modules. |
 |
void DGC_polling(void); |
checks if GC messages have come and processes them if there are some.
All users of this library must call this function periodically at some
intervals from the client code except when they use our providing message-passing
library and periodically perform sending or receiving messages. |
 |
void GC_gcollect(void); |
explicitly triggers a full, world-stop collection. The collection algorithm
will be choosed judiciously. |
 |
void DGC_request_global_GC(void); |
requests a global GC. This can be invoked on any processor, but only
one stroke of global GC will take at a time. |
 |
void DGC_independent_local_GC(void); |
triggers a full, world-stop local GC independently from other processors. |
 |
void DGC_syncronous_local_GC(void); |
triggers a full, world-stop local GC; at the same time it requests
other processors to perform local GC simultaneously (not synchronized)
as well. |
 |
int DGC_dont_GMS; |
global mark-and-sweep will not be triggered if DGC_dont_GMS
is set to a number other than 0. Default is 0. |
 |
int DGC_dont_SLGC; |
processor will not issue local GC requests to other processors upon
starting a local GC if DGC_dont_SLGC is set
to a number other than 0. Default is 0. |
 |
int DGC_dont_LGC; |
local GC will not be triggered if DGC_dont_LGC
is set to a number other than 0. Default is 0. Setting both DGC_dont_GMS
and DGC_dont_LGC to other than 0 has the same
effect as setting GC_dont_GC to other than
0. |
 |
int GC_dont_GC; |
setting this to a number other than 0 will prohibit GC at all. The
library will simply expand the heap when heap is exhausted. Default is
0. |
 |
unsigned int DGC_free_space_divisor; |
controls frequency of global GC. It behaves as much like as GC_free_space_divisor
of Boehm GC. The default value is 3. |
 |
int DGC_is_active(DGC_PE pe); |
returns 1 if pe is active; otherwise it
returns 0. |
We provide a standard message-passing API with our
GC library. Though we recommend you to use them, you do not need to do
so. Sending and receiving messages in this standard API seem to be blocked
until the operation completes, but they correctly handle GC messages while
they are blocking. Receiving a message must always be done by a pair
of DGC_receive_msg & DGC_receive_done.
Standard Message-Passing Interfaces
 |
int DGC_init(int *argcp,
char ***argvp); |
initializes GC module and message-passing module. This calls DGC_GC_init
from within and sets ID of PE, # of PEs, and statuses of PEs as well as
initializing message-passing facility. It returns 0 upon successful; otherwise
it returns -1. |
 |
int DGC_finalize(void); |
synchronizes all active PEs, and then disables communication and distributed-GC
among them. Upon successful, it returns 0; otherwise it returns -1. |
 |
void DGC_abort(void); |
aborts the execution of the PE where it is called; it may stop the
other PEs too. |
 |
int DGC_check_msg(void); |
checks that there are any incoming messages from other processors.
It returns 0 when there are no messages; otherwise it returns non-zero. |
 |
int DGC_send_msg(DGC_PE pe,
void *data, int size); |
sends a block message of size byte length
from data to pe.
data can be NULL only when size
is 0. Upon successful, it returns 0; otherwise it returns -1. |
 |
void *DGC_receive_msg(DGC_status *status); |
receives a message sent by DGC_send_msg
and returns the pointer to it. The returned pointer is aligned to
double. This will block when there are no incoming messages. status
can be NULL; otherwiseDGC_receive_msg
stores information of received message into status. |
 |
void DGC_receive_done(void *msg); |
releases previously received msg. It declares
the end of message retrieval. Accessing msg
beyond this point is not allowed. |
 |
void DGC_barrier (void); |
synchronizes all PEs at the point; must be called by all PEs at the
same time. |
 |
DGC_PE DGC_get_msg_sender(DGC_status *status); |
returns ID of the sender processor of the message that is correspondent
to status. |
 |
unsigned int DGC_get_msg_size(DGC_status
*status); |
returns the size of the message that is correpondent to status. |
You can use communication routines of your own through
the following steps.
- Look into plugin.h header file.
Implement DGC_internal_{send,poll,recv} and
DGC_receive_done for GC communication. Messages
sent by DGC_internal_send (GC message)
must be distinguished from other messages and received by only DGC_internal_poll
or DGC_internal_recv.
- You must call DGC_GC_init before calling
any DGC_* functions.
- Activate all processors in the system by
foreach ID (id0, id1, id2, ...) do DGC_activate_PE
(ID);
DGC_activate_PE works locally, therefore you
must do this on all processors.
- Set ID number of the processor with DGC_set_my_ID.
The ID must be unique for each processor. ID numbers must range from 0
to (DGC_MAX_PE_ID - 1).
- Set ID number of the master processor with DGC_set_master_ID.
The ID must be identical over all processors.
- You cannot call GC & DGC functions or modify GC heap in DGC_internal_*.
Typically you can store non-GC messages in memory allocated by normal malloc
when you find them in DGC_internal_recv or
DGC_internal_poll.
- GC messages should outrun other messages. For example, you must periodically
check arrival of GC messages by DGC_polling
even when waiting and receiving other messages.
Additionally, if you intend to develop routines that conforms to our standard
message-passing interfaces (see above), go through the following steps.
- Look into dgc_stdmsg.h header file.
Implement all functions declared in it. Information about DGC_status
structure should be found in the same file. The library includes some sample
implementations (shmem.c, mpi.c,
score.c) which will give you good examples.
- Call DGC_polling from within DGC_check_msg,
DGC_send_msg, and DGC_receive_msg.
As mentioned above, don't block a processor during receiving message (i.e.,
call DGC_polling repeatedly while waiting
for a user-level message in DGC_receive_msg).
- DGC_init should set ID number of the processor,
the number of processors, and ID number of the master processor in the
system as described above. It should also activate processors by DGC_activate_PE.
To make your message-passing routines be multi-thread safe, a mutex (DGC_msg_lock)
is provided for that purpose. Some notes to use it are:
- Look into dgc_mt.h header file.
MT-safe support is enabled when DGC_MT_SAFE
macro is defined.
- Use DGC_MUTEX_LOCK & DGC_MUTEX_UNLOCK
to acquire & release the mutex.
- You may need to allow sending while receiving a message in some kind
of message-passing environment to avoid deadlocks. score.c
is an example.
- DGC_msg_lock must be released before calling
any DGC & GC functions, e.g. DGC_polling.
Communication Layer Plug-In Interfaces
 |
int DGC_internal_send(DGC_PE pe,
void *msg, int size); |
sends a size byte GC message from msg
to pe. Upon successful,
it returns 0; otherwise it returns -1. |
 |
void *DGC_internal_recv(void); |
waits until a GC message to come, and returns the pointer to the message.
The returned pointer should be aligned to double. You must not call DGC_process_message
within this function. |
 |
void *DGC_internal_poll(void); |
checks whether a GC message has come. If so, it returns the pointer
to the message; otherwise it returns NULL.
The returned pointer should be aligned to double. You must not call DGC_process_message
within this function. |
 |
void DGC_internal_recv_done(void *msg); |
releases previously received msg. It declares
the end of message retrieval. Accessing msg
beyond this point is not allowed. You can safely free or reuse memory for
msg. |
 |
int DGC_set_my_ID(DGC_PE id); |
sets the ID number of the processor to id.
All processors must have a unique ID number. id
will be returned by DGC_get_my_ID. It returns
-1 if id is not appropriate; otherwise it
returns 0. |
 |
int DGC_set_master_ID(DGC_PE id); |
sets the ID number of the master processor of DGC. The ID must be idential
over all processors. It returns -1 if id is
not appropriate; otherwise it returns 0. |
 |
int DGC_activate_PE(DGC_pe pe); |
sets the status of pe as active. It returns
-1 if pe is not appropriate; otherwise it
returns 0. |
 |
void DGC_inactivate_PE(DGC_pe pe); |
prepared for future enhancement. |
 |
DGC_mutex_t *DGC_msg_lock; |
is a mutex intended to be used to make a message-passing module be
multi-thread safe. |
 |
int DGC_MUTEX_LOCK(DGC_mutex_t *lock); |
acquires lock. |
 |
int DGC_MUTEX_UNLOCK(DGC_mutex_t *lock); |
releases lock. |
Algorithms and Implementation Techniques
Under construction
Requirements
| Compiler |
ANSI C compiler. GCC is recommended. |
| OS |
Modern Unix-like OS. Porting to other OS is easy if supported by Boehm
GC. |
| Machine |
int in C must have at least 32 bits. |
| message-passing library |
One of MPI, SCore-S,
SMP machines (with shmem.c module).
Anyway you can port DGC to your environment without hacking the library. |
| Knowledge |
Basics of distributed garbage collection. See this
great survey. |
Availability
|