BitKeeper Nested Programmers Manual
Overview
This is an overview of the nested data structures and its usage on various BitKeeper operations. It is intended for the programmer and as such it delves into details about the usage of the various fields and the algorithms implemented with them. This description is intended to be more of a guide than an accurate description of how it all works. Remember, the ultimate documentation is the code itself.
Data Structures
We’ll start with a brief overview of nested, and the data structures that support it.
A nested collection is a group of repositories, with the 'product'
serving as a container for zero or more 'components'. This is
reflected in the code in two different data structures in
nested.h
. The nested
structure, which represents the product’s
knowledge of the components, and the comp
structure which represents
knowledge about each individual component.
In addition to the above, BitKeeper has the concept of 'aliases' or
named sets of components. These don’t have a data structure associated
with them, they behave more like 'tags'. The idea is that there is
an API for tagging the individual comp
structures with information
about whether they are selected by the given alias (or set of aliases)
or not.
Each product has a HERE
file, which specifies the list of aliases
that are supposed to be here. BitKeeper tries very hard to maintain
the synchronicity between what is actually here and the contents of
the HERE
files. You can see the HERE
file by running the bk here
command or by just running cat BitKeeper/log/HERE
.
We’ll take a look first at the nested data structure and give a brief definition of each of the fields and how they are used.
Product
A BitKeeper product is a repository marked with an empty file
BitKeeper/log/PRODUCT
. The command bk product
tells you whether
the repository you’re at is a product or not.
Most operations in a nested collection start by getting a nested
structure through a call to nested_init()
. This function has many
modes of operation, some of which are described at the end of this
section. But first, let’s take a look at the nested
structure, and a
brief description of what the fields are for:
{{{#!cplusplus numbers=off struct nested { char here; // the contents of the here file char comps; // addlines of pointers to components char *oldtip; // tip before revs (new tip for undo) char *tip; // newest cset in revs sccs *cset; // cache of cset file project *proj; hash *aliasdb; // lazy init’d aliasdb hash *compdb; // lazy init rk lookup of &n→comp[i] comp *product; // pointer into comps to the product // bits u32 alias:1; // components tagged with c→alias u32 pending:1; // include pending component state u32 freecset:1; // do a sccs_free(cset) in nested_free() u32 fix_idDB:1; // don’t complain about idDB problems }; }}}
- here
-
This is just a
lines
array of the contents of the HERE file. As a side effect of loading theHERE
file, duplicates in it are removed, thePRODUCT
special alias is inserted if it was missing, and the capitalization of thePRODUCT
alias is fixed if it was wrong (it has to be all caps). - comps
-
This is a
lines
array of pointers to each of the components that comprise the nested collection. It is ''all'' of the components, not just the components that are present in this repository. The information comes straight from the product’s ChangeSet file. The product’s owncomp
structure is included either first or last, depending on whetherNESTED_PRODUCTFIRST
was passed tonested_init()
, but it can always be accessed using theproduct
field as described below. - oldtip
-
Only set if
nested_init()
is called with therevs
argument. WithoutNESTED_PULL
, this becomes the latest product cset key not inrevs
. WithNESTED_PULL
oldtip is the tip of the 'local' changes. Mostly forundo
, but used also bypush
to assert that we are an update-only push. Note in undo, the oldtip is really going to be the new tip. See the section onnested_init()
. - tip
-
The revision at the tip of the coloring by
revs
, or justrev
if that is hownested_init()
was called. See the section onnested_init()
- cset
-
This is just the
sccs *
to the product’s ChangeSet file. It is there in order to pass it around to other APIs as to avoid multiplesccs_init()
's. Inpull
this will be the RESYNC ChangeSet file aftertakepatch
has added in the new remote csets. - proj
-
A pointer to the product’s
proj
structure. - aliasdb
-
The
BitKeeper/etc/aliases
file. Lazily initialized when needed. E.g. by the alias code. - compdb
-
Used internally by
nested_findKey()
to map component rootkey’s to the matchingcomp
structure. - product
-
Since for many operations we can treat the product as just another component, we keep a
comp
structure for the product here for easy access. The same struct will be included in then->comps
array. - alias
-
When we need to see which components are included in an alias, we call the function
aliasdb_tag()
which will mark each component included in the alias withcp->alias
, it will also tag thenested
structure withalias
to indicate that some components are tagged. - freecset
-
When we initialize a
nested
structure by callingnested_init()
, we can optionally pass ansccs *
for the product’s ChangeSet file. If we do not,nested_init()
willsccs_init()
the product’s ChangeSet file and keep a pointer to it in thecset
field of thenested
structure. Ifnested_init()
did thesccs_init()
it must free it (sccs_free()
) when thenested
structure is freed. - fix_idDB
-
Used internally in
nested_init()
. Ifnested_init()
was told that it’s okay to fix theidDB
, this fact is remembered here so that we can fix it later.
The main mechanism through which we obtain a nested
structure is by
calling the nested_init()
function. This function is described next.
{{{#!cplusplus numbers=off nested *nested_init(sccs *cset, char *rev, char **revs, u32 flags); }}}
- cset
-
Optional
sccs *
of the product’s ChangeSet file. If omitted (NULL)nested_init()
willsccs_init()
the product’s ChangeSet file and cache thesccs *
. If provided, it will not be freed bynested_free()
and ''must'' be freed by the caller. - rev
-
If present,
nested_init()
will return anested
structure as it existed atrev
time. This parameter is mutually exclusive withrevs
, which is described below. It will be passed tosccs_findrev()
so it can take any valuesccs_findrev()
can handle (e.g. "+" for the tip revision, etc). - revs
-
If present, a list of revisions (in a
lines
array) which will be colored and tagged (with theincluded
field in thecomp
structure, see below). Mutually exclusive withrev
. This field is mainly used bybk pull
andbk undo
which operate on regions of the graph. The region of the graph must have a single tip and when the region and it’s descendents are removed from the graph a single tip must remain. Thenested_init()
will fail if these properties are not true. - flags
-
Many tweaks to how
nested_init()
operates, these are described below.
{{{#!cplusplus numbers=off #define NESTED_PENDING 0x10000000 /* included pending comps / #define NESTED_PULL 0x20000000 / This is a pull / #define NESTED_PRODUCTFIRST 0x40000000 / c→p first in c→comps / #define NESTED_MARKPENDING 0x01000000 / set c→pending / #define NESTED_FIXIDCACHE 0x02000000 / no error for bad idDB */ }}}
- NESTED_PENDING
-
When initializing a
nested
structure, include any new 'pending' components that have not yet been added to the product’s ChangeSet file. Also for existing components that have pending csets not included in the product, setc->pending
. Mutually exclusive with eitherrev
orrevs
since we are not looking at the nested collection as of any revision. - NESTED_PULL
-
Used by
bk pull
when processing theRESYNC/ChangeSet
file to indicate that we have three regions that we care about, the 'common region', the 'remote region', and the 'local region'. In this case,oldtip
will be set as the tip of the local side,tip
as the tip of the remote side, and each of the components will be tagged according to which region they belong to. For instance, components that appear in the GCA region, will be tagged as 'not new', components that appear in the REMOTE region will be tagged as 'included' and components that appear in the LOCAL region will be tagged as 'with local changes'. These three tags arec->new
,c->included
, andc->localchanges
correspondingly in thecomp
structure. This also changes the meaning ofc->lowerkey
in thecomp
structure to be the tipkey of the local repository for this component, whilec->deltakey
is the tip of the remote component. - NESTED_PRODUCTFIRST
-
Put the product at the beginning of the
comps
list. The default is to put it last. - NESTED_MARKPENDING
-
Whether to tag components that have pending changes with
c->pending
or not. TheNESTED_PENDING
mode above does this by default. Thenested_populate()
function requires that pending components be marked. - NESTED_FIXIDCACHE
-
Whether it is okay to fix the idDB cache for moved components or we should error out.
Caching of the nested
struct
Whenever nested_init()
is called with both the rev
and revs
parameters set to zero (meaning the tip of the product’s ChangeSet
file), the information will be loaded from and written to a cache
kept in BitKeeper/log/nested_init.cache
.
Component
Information about components is kept in the comp
structure. A list
of the components is kept in the nested
structure’s comps
field.
{{{#!cplusplus numbers=off typedef struct { nested *n; // backpointer char *rootkey; // rootkey of the repo char *deltakey; // deltakey of repo as of rev char *lowerkey; // in pull, local tip // otherwise, gca tip char *path; // actual path: like GFILE, not DPN
void *data; // scratchpad for applications
// bits u32 alias:1; // in the latest alias u32 included:1; // component modified in 'revs' u32 localchanges:1; // component modified outside 'revs' u32 new:1; // if set, the undo will remove this u32 present:1; // if set, the repo is actually here u32 product:1; // this is the product u32 remotePresent:1; // scratch for remote present bit u32 pending:1; // has pending csets not in product } comp; }}}
- n
-
Back-pointer to the
nested
structure we belong t. - rootkey
-
Rootkey of this component.
- deltakey
-
Tip key of this component, if
revs
were passed tonested_init()
this is the tip of the region implied byrevs
. In the case ofbk pull
, it is the tip of this component in the remote side. - lowerkey
-
When pulling, it is the tip of the local region.