mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-06-28 05:50:30 +02:00
Update README
parent 7b85284a89
commit 7b60164cac
1 changed files with 74 additions and 0 deletions
74
README.md
@@ -3,6 +3,80 @@
This repository is a workbench for implementing different GCs. It's a
scratch space.

## What's there

The only benchmark is a (modified) GCBench, an old but standard
benchmark that allocates binary trees of different sizes. It needs a
heap of only 25 MB or so and triggers somewhere between 20 and 50
collections, running in 100 to 500 milliseconds on 2022-era machines.

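To make the workload concrete, here is a minimal sketch of the kind of allocation GCBench performs, assuming a node with two child pointers and two integer fields. It uses `malloc` as a stand-in for a collector's allocator, and the names `make_tree`, `count_nodes` and `free_tree` are illustrative rather than taken from the benchmark source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* GCBench-style node (assumed layout): two children, two int fields. */
struct node {
  struct node *left;
  struct node *right;
  int i, j;
};

/* Build a complete binary tree of the given depth, top-down.
   A depth-0 tree is a single leaf.  Aborts if allocation fails. */
static struct node *make_tree(int depth) {
  assert(depth >= 0);
  struct node *n = calloc(1, sizeof *n);
  if (!n) abort();
  if (depth > 0) {
    n->left = make_tree(depth - 1);
    n->right = make_tree(depth - 1);
  }
  return n;
}

/* Count the nodes reachable from n. */
static size_t count_nodes(const struct node *n) {
  return n ? 1 + count_nodes(n->left) + count_nodes(n->right) : 0;
}

/* With a real collector, dropping the root would be enough; with
   malloc we have to free explicitly. */
static void free_tree(struct node *n) {
  if (!n) return;
  free_tree(n->left);
  free_tree(n->right);
  free(n);
}
```

A complete tree of depth `d` has `2^(d+1) - 1` nodes, so the benchmark can produce a lot of short-lived garbage simply by building trees of varying depth and dropping them.
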
Then there are currently three collectors:

- `bdw.h`: The external BDW-GC conservative parallel stop-the-world
  mark-sweep segregated-fits collector with lazy sweeping.
- `semi.h`: Semispace copying collector.
- `mark-sweep.h`: Stop-the-world mark-sweep segregated-fits collector
  with lazy sweeping.

The latter two collectors reserve one word per object for a header,
which makes them collect more frequently than `bdw` because the `Node`
data type takes 32 bytes instead of 24 bytes.

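The size difference can be checked with a small sketch. It assumes a 64-bit target with 4-byte `int` and a node made of two pointers and two `int` fields; `bare_node`, `tagged_node` and `nodes_per_heap` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Node as a headerless collector such as BDW-GC would see it. */
struct bare_node {
  struct bare_node *left;
  struct bare_node *right;
  int i, j;
};

/* The same node with one header word reserved in front
   (hypothetical layout, for illustration). */
struct tagged_node {
  uintptr_t header;
  struct tagged_node *left;
  struct tagged_node *right;
  int i, j;
};

/* Number of nodes that fit in a heap of the given size. */
static size_t nodes_per_heap(size_t heap_bytes, size_t node_bytes) {
  assert(node_bytes > 0);
  return heap_bytes / node_bytes;
}
```

Under those assumptions the sizes come out as 24 and 32 bytes, so a heap of a given size holds a quarter fewer nodes with the header and fills up correspondingly sooner.
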
These collectors are sketches and exercises for improving Guile's
garbage collector. Guile currently uses BDW-GC. In Guile if we have an
object reference we generally have to be able to know what kind of
object it is, because there are few global invariants enforced by
typing. Therefore it is reasonable to consider allowing the GC and the
application to share the first word of an object, for example to store a
mark bit, to allow the application to know what kind of object it is, to
allow the GC to find references within the object, to allow the GC to
compute the object's size, and so on.

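One way such a shared first word could be laid out is sketched below. The bit assignment (mark bit in bit 0, a 7-bit kind tag, the size in words above that) and all of the names are assumptions made for this example, not the encoding used by Guile or by the collectors in this repository.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical header layout, for illustration only:
   bit 0      mark bit, owned by the GC
   bits 1-7   kind tag, owned by the application
   bits 8+    object size in words, read by both */
#define MARK_BIT   ((uintptr_t)1)
#define KIND_SHIFT 1
#define KIND_MASK  ((uintptr_t)0x7f)
#define SIZE_SHIFT 8

static uintptr_t make_header(unsigned kind, uintptr_t size_words) {
  assert(kind <= KIND_MASK);
  return (size_words << SIZE_SHIFT) | ((uintptr_t)kind << KIND_SHIFT);
}

static unsigned header_kind(uintptr_t h) {
  return (unsigned)((h >> KIND_SHIFT) & KIND_MASK);
}

static uintptr_t header_size_words(uintptr_t h) { return h >> SIZE_SHIFT; }

static int header_is_marked(uintptr_t h) { return (h & MARK_BIT) != 0; }

/* Marking and unmarking leave the kind and size fields untouched. */
static uintptr_t header_mark(uintptr_t h)   { return h | MARK_BIT; }
static uintptr_t header_unmark(uintptr_t h) { return h & ~MARK_BIT; }
```

The point of the sketch is only that one word can serve both sides: the application dispatches on the kind, while the GC flips the mark bit and reads the size without disturbing the rest.
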
The GCBench benchmark is small, but then again many Guile processes are
also quite short-lived, so perhaps it is useful to ensure that small
heaps remain lightweight.

Guile has a widely used C API and implements part of its run-time in C.
For this reason it may be infeasible to require precise enumeration of
GC roots -- we may need to allow GC roots to be conservatively
identified from data sections and from stacks. Such conservative roots
would be pinned, but other objects can be moved by the collector if it
chooses to do so. We assume that object references within a heap object
can be precisely identified. (The current BDW-GC scans for references
conservatively even on the heap.)

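Conservative root identification can be sketched as a scan over a range of words, treating anything that looks like a heap address as a possible reference. The sketch assumes a single contiguous heap of word-aligned objects; `heap_range`, `looks_like_heap_pointer` and `scan_conservative_roots` are hypothetical names, and a real collector would also have to check that a candidate address corresponds to an actual object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap description: one contiguous range of
   word-aligned objects. */
struct heap_range {
  uintptr_t start;  /* inclusive */
  uintptr_t end;    /* exclusive */
};

/* A word is a candidate reference if it is word-aligned and falls
   inside the heap.  It may still be an integer that merely looks like
   a pointer, which is why its referent is pinned rather than moved. */
static int looks_like_heap_pointer(const struct heap_range *heap,
                                   uintptr_t word) {
  return word >= heap->start && word < heap->end
         && word % sizeof(uintptr_t) == 0;
}

/* Scan the words in [lo, hi) and report each candidate to `pin`
   (which may be NULL).  Returns the number of candidates found. */
static size_t scan_conservative_roots(const struct heap_range *heap,
                                      const uintptr_t *lo,
                                      const uintptr_t *hi,
                                      void (*pin)(uintptr_t addr, void *data),
                                      void *data) {
  assert(lo <= hi);
  size_t found = 0;
  for (const uintptr_t *p = lo; p < hi; p++) {
    if (looks_like_heap_pointer(heap, *p)) {
      if (pin) pin(*p, data);
      found++;
    }
  }
  return found;
}
```
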
A likely good solution for Guile would be an [Immix
collector](https://www.cs.utexas.edu/users/speedway/DaCapo/papers/immix-pldi-2008.pdf)
with conservative roots, and a parallel stop-the-world mark/evacuate
phase. We would probably follow the [Rust
implementation](http://users.cecs.anu.edu.au/~steveb/pubs/papers/rust-ismm-2016.pdf),
more or less, with support for per-line pinning. In an ideal world we
would work out some kind of generational solution as well, either via a
semispace nursery or via sticky mark bits, but this requires Guile to
use a write barrier -- something that's possible to do within Guile
itself, though it's unclear whether we can extend this obligation to
users of Guile's C API.

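As a rough illustration of what per-line pinning might involve, the sketch below keeps one state byte per line of a block and refuses to evacuate any object that overlaps a pinned line. This is one possible reading of the idea, not the design of the cited implementations; the 32 kB block and 128-byte line sizes follow the defaults described in the Immix paper, and `block_meta`, `pin_object_lines` and `can_evacuate` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sizes, following the Immix paper's defaults. */
#define BLOCK_SIZE 32768u
#define LINE_SIZE  128u
#define LINES_PER_BLOCK (BLOCK_SIZE / LINE_SIZE)

enum line_state { LINE_FREE = 0, LINE_PINNED = 1 };

/* Hypothetical per-block metadata: one state byte per line. */
struct block_meta {
  uint8_t lines[LINES_PER_BLOCK];
};

/* Pin every line overlapped by an object at byte `offset` within the
   block, e.g. because a conservative root may point to it. */
static void pin_object_lines(struct block_meta *b, size_t offset, size_t size) {
  assert(size > 0 && offset + size <= BLOCK_SIZE);
  size_t first = offset / LINE_SIZE;
  size_t last = (offset + size - 1) / LINE_SIZE;
  for (size_t i = first; i <= last; i++)
    b->lines[i] = LINE_PINNED;
}

/* An object may be evacuated only if none of its lines is pinned. */
static int can_evacuate(const struct block_meta *b, size_t offset, size_t size) {
  assert(size > 0 && offset + size <= BLOCK_SIZE);
  size_t first = offset / LINE_SIZE;
  size_t last = (offset + size - 1) / LINE_SIZE;
  for (size_t i = first; i <= last; i++)
    if (b->lines[i] == LINE_PINNED)
      return 0;
  return 1;
}
```

Tracking pins per line rather than per object is coarse: an unpinned object that shares a line with a pinned one also stays put, in exchange for very small metadata.
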
In any case, these experiments also have the goal of identifying a
smallish GC abstraction in Guile, so that we might consider evolving the
GC implementation in the future without too much pain. If we switch away
from BDW-GC, we should be able to evaluate whether it's a win for a
large majority of use cases.

## To do

- [ ] Implement a parallel marker for the mark-sweep collector.
- [ ] Adapt GCBench for multiple mutator threads.
- [ ] Implement precise non-moving Immix whole-heap collector.
- [ ] Add evacuation to Immix whole-heap collector.
- [ ] Add parallelism to Immix stop-the-world phase.
- [ ] Implement conservative root-finding for the mark-sweep collector.
- [ ] Implement conservative root-finding and pinning for Immix.
- [ ] Implement generational GC with semispace nursery and mark-sweep
  old generation.
- [ ] Implement generational GC with semispace nursery and Immix
  old generation.

## License

GCBench.c, MT_GCBench.c, and MT_GCBench2.c are from