mirror of
https://git.savannah.gnu.org/git/guile.git
synced 2025-06-28 14:00:31 +02:00
Update README
This commit is contained in:
parent
7b85284a89
commit
7b60164cac
1 changed files with 74 additions and 0 deletions
74
README.md
74
README.md
|
@ -3,6 +3,80 @@
|
||||||
This repository is a workbench for implementing different GCs. It's a
|
This repository is a workbench for implementing different GCs. It's a
|
||||||
scratch space.
|
scratch space.
|
||||||
|
|
||||||
|
## What's there
|
||||||
|
|
||||||
|
There's just the (modified) GCBench, which is an old but standard
|
||||||
|
benchmark that allocates different sizes of binary trees. It takes a
|
||||||
|
heap of 25 MB or so, not very large, and causes somewhere between 20 and
|
||||||
|
50 collections, running in 100 to 500 milliseconds on 2022 machines.
|
||||||
|
|
||||||
|
Then there are currently three collectors:
|
||||||
|
|
||||||
|
- `bdw.h`: The external BDW-GC conservative parallel stop-the-world
|
||||||
|
mark-sweep segregated-fits collector with lazy sweeping.
|
||||||
|
- `semi.h`: Semispace copying collector.
|
||||||
|
- `mark-sweep.h`: Stop-the-world mark-sweep segregated-fits collector
|
||||||
|
with lazy sweeping.
|
||||||
|
|
||||||
|
The two latter collectors reserve one word per object on the header,
|
||||||
|
which makes them collect more frequently than `bdw` because the `Node`
|
||||||
|
data type takes 32 bytes instead of 24 bytes.
|
||||||
|
|
||||||
|
These collectors are sketches and exercises for improving Guile's
|
||||||
|
garbage collector. Guile currently uses BDW-GC. In Guile if we have an
|
||||||
|
object reference we generally have to be able to know what kind of
|
||||||
|
object it is, because there are few global invariants enforced by
|
||||||
|
typing. Therefore it is reasonable to consider allowing the GC and the
|
||||||
|
application to share the first word of an object, for example to store a
|
||||||
|
mark bit, to allow the application to know what kind an object is, to
|
||||||
|
allow the GC to find references within the object, to allow the GC to
|
||||||
|
compute the object's size, and so on.
|
||||||
|
|
||||||
|
The GCBench benchmark is small but then again many Guile processes also
|
||||||
|
are quite short-lived, so perhaps it is useful to ensure that small
|
||||||
|
heaps remain lightweight.
|
||||||
|
|
||||||
|
Guile has a widely used C API and implements part of its run-time in C.
|
||||||
|
For this reason it may be infeasible to require precise enumeration of
|
||||||
|
GC roots -- we may need to allow GC roots to be conservatively
|
||||||
|
identified from data sections and from stacks. Such conservative roots
|
||||||
|
would be pinned, but other objects can be moved by the collector if it
|
||||||
|
chooses to do so. We assume that object references within a heap object
|
||||||
|
can be precisely identified. (The current BDW-GC scans for references
|
||||||
|
conservatively even on the heap.)
|
||||||
|
|
||||||
|
A likely good solution for Guile would be an [Immix
|
||||||
|
collector](https://www.cs.utexas.edu/users/speedway/DaCapo/papers/immix-pldi-2008.pdf)
|
||||||
|
with conservative roots, and a parallel stop-the-world mark/evacuate
|
||||||
|
phase. We would probably follow the [Rust
|
||||||
|
implementation](http://users.cecs.anu.edu.au/~steveb/pubs/papers/rust-ismm-2016.pdf),
|
||||||
|
more or less, with support for per-line pinning. In an ideal world we
|
||||||
|
would work out some kind of generational solution as well, either via a
|
||||||
|
semispace nursery or via sticky mark bits, but this requires Guile to
|
||||||
|
use a write barrier -- something that's possible to do within Guile
|
||||||
|
itself but it's unclear if we can extend this obligation to users of
|
||||||
|
Guile's C API.
|
||||||
|
|
||||||
|
In any case, these experiments also have the goal of identifying a
|
||||||
|
smallish GC abstraction in Guile, so that we might consider evolving GC
|
||||||
|
implementation in the future without too much pain. If we switch away
|
||||||
|
from BDW-GC, we should be able to evaluate that it's a win for a large
|
||||||
|
majority of use cases.
|
||||||
|
|
||||||
|
## To do
|
||||||
|
|
||||||
|
- [ ] Implement a parallel marker for the mark-sweep collector.
|
||||||
|
- [ ] Adapt GCBench for multiple mutator threads.
|
||||||
|
- [ ] Implement precise non-moving Immix whole-heap collector.
|
||||||
|
- [ ] Add evacuation to Immix whole-heap collector.
|
||||||
|
- [ ] Add parallelism to Immix stop-the-world phase.
|
||||||
|
- [ ] Implement conservative root-finding for the mark-sweep collector.
|
||||||
|
- [ ] Implement conservative root-finding and pinning for Immix.
|
||||||
|
- [ ] Implement generational GC with semispace nursery and mark-sweep
|
||||||
|
old generation.
|
||||||
|
- [ ] Implement generational GC with semispace nursery and Immix
|
||||||
|
old generation.
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|
||||||
GCBench.c, MT_GCBench.c, and MT_GCBench2.c are from
|
GCBench.c, MT_GCBench.c, and MT_GCBench2.c are from
|
||||||
|
|
Loading…
Add table
Add a link
Reference in a new issue