Revision marienz@gentoo.org-20061108064656-d87e26fcb51778da, 7.9 kB (checked in by Marien Zwart <marienz@…>, 2 years ago)

=======================
 Architecture overview
=======================

If you want to add a check or a frontend to pkgcore-checks, you should
probably read this document first. If you only want to add a check,
you only need to read the first section ("For everyone"). The sections
after that document adding feed types and some of the internals.

For everyone
============

Addons
------

Most interesting objects are addons. The interface is defined in
base.py. Addons can have "dependencies" on other addons (by class): all
addons referenced by the required_addons class attribute of an active
addon are also active (and this applies recursively, of course).
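
The recursive activation rule can be sketched as follows. This is only
an illustration: the Addon class here is a stand-in for the interface
in base.py, and the addon names and the collect_active helper are made
up for the example.

```python
# Stand-in for the real Addon interface in base.py (illustration only).
class Addon:
    required_addons = ()  # addon classes this addon depends on


class ProfileAddon(Addon):  # hypothetical addon holding shared state
    pass


class LicenseAddon(Addon):  # hypothetical, depends on ProfileAddon
    required_addons = (ProfileAddon,)


class VisibilityCheck(Addon):  # hypothetical check
    required_addons = (LicenseAddon,)


def collect_active(selected):
    """Return the selected addon classes plus all transitive dependencies."""
    active = []
    stack = list(selected)
    while stack:
        cls = stack.pop()
        if cls in active:
            continue  # already active, do not follow its dependencies again
        active.append(cls)
        stack.extend(cls.required_addons)
    return active


active = collect_active([VisibilityCheck])
print([cls.__name__ for cls in active])
```

Selecting only VisibilityCheck also activates LicenseAddon and, through
it, ProfileAddon.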

Before addons are instantiated, their class (or static) methods
mangle_option_parser and check_values are used to modify the optparse
process. At this point all available checks (and their dependencies)
are active addons. After the commandline is parsed, only the checks
selected by the commandline settings (and their dependencies) remain
active, and those are instantiated. They receive the optparse values
instance followed by the (instantiated) addons they depend on as
positional arguments to __init__.
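
A minimal sketch of these two phases, using plain optparse. The Addon
base class below is a stand-in and not the real base.py; the
ArchesAddon and KeywordCheck classes, the --arches option and the
instantiate helper are all assumptions made for the example.

```python
import optparse


class Addon:
    """Stand-in for the Addon interface (not the real base.py)."""
    required_addons = ()

    @staticmethod
    def mangle_option_parser(parser):
        pass

    @staticmethod
    def check_values(values):
        pass

    def __init__(self, options, *deps):
        self.options = options


class ArchesAddon(Addon):  # hypothetical addon owning a shared option
    @staticmethod
    def mangle_option_parser(parser):
        parser.add_option("--arches", default="x86,amd64")

    @staticmethod
    def check_values(values):
        values.arches = tuple(values.arches.split(","))


class KeywordCheck(Addon):  # hypothetical check using the shared addon
    required_addons = (ArchesAddon,)

    def __init__(self, options, arches_addon):
        Addon.__init__(self, options)
        self.arches = arches_addon.options.arches


def instantiate(cls, options, cache):
    """Phase two: build an addon after the addons it requires."""
    if cls not in cache:
        deps = [instantiate(dep, options, cache) for dep in cls.required_addons]
        cache[cls] = cls(options, *deps)
    return cache[cls]


# Phase one: the classes adjust the parser and post-process the values.
parser = optparse.OptionParser()
classes = [ArchesAddon, KeywordCheck]
for cls in classes:
    cls.mangle_option_parser(parser)
values, args = parser.parse_args(["--arches", "x86,ppc"])
for cls in classes:
    cls.check_values(values)

check = instantiate(KeywordCheck, values, {})
print(check.arches)
```

Note that KeywordCheck never touches the parser itself: the option and
its parsed value live on the addon it depends on.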

(The reason for the two "phases" here (optparse mangling before
instantiation) is that we want the addons to influence the way options
are parsed, but we cannot instantiate them before options are parsed
(since we want to pass them the values object, and since we only want
to instantiate the ones that end up "selected" according to commandline
settings). The reason for the dependency system used here is to
provide a place to put options and state shared between multiple
checkers (without putting everything on a single global object).)

Checkers, feeds
---------------

What a checker actually *is* is not well defined at the moment, but
everything subclassing base.Template and setting a feed_type attribute
is definitely a checker. Template subclasses Addon, so all checkers
are also addons. Their feed_type attribute should be set to one of the
feed types defined in base.py. Their feed method will be called with
an iterator producing values of a type depending on their feed_type.
Currently available are versioned_feed, which feeds single package
objects, and package_feed, category_feed and repository_feed, which
produce sequences of packages.

Whatever your feed type is, the first thing you should do with
everything you get out of the feed is "yield" it. The "feeder" and all
the checks are chained together, each yielding objects to the next. So
none of them should modify the data passed to them, and all of them
should yield *everything* to the next checker.

The second argument to the feed method is a Reporter instance (again
defined in base.py) to pass Report instances on to.
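
A rough sketch of what such a chain looks like. Everything here is a
stand-in: packages are plain dicts, the Reporter class and its
add_report method are hypothetical, and the feed types are strings
instead of the constants from base.py.

```python
class Reporter:
    """Stand-in reporter; it only collects what checks hand to it."""

    def __init__(self):
        self.reports = []

    def add_report(self, report):  # hypothetical method name
        self.reports.append(report)


class EmptyDescriptionCheck:  # hypothetical check
    feed_type = "versioned_feed"  # stand-in for the feed type in base.py

    def feed(self, pkgs, reporter):
        for pkg in pkgs:
            yield pkg  # first pass the package on, untouched
            if not pkg["description"]:
                reporter.add_report("%s: empty description" % pkg["name"])


class MissingHomepageCheck:  # hypothetical check
    feed_type = "versioned_feed"

    def feed(self, pkgs, reporter):
        for pkg in pkgs:
            yield pkg
            if not pkg["homepage"]:
                reporter.add_report("%s: no homepage" % pkg["name"])


packages = [
    {"name": "foo-1.0", "description": "A tool", "homepage": ""},
    {"name": "bar-2.0", "description": "", "homepage": "http://example.org/"},
]
reporter = Reporter()

# Chain the checks: each one consumes the iterator produced by the previous.
pipeline = iter(packages)
for check in (EmptyDescriptionCheck(), MissingHomepageCheck()):
    pipeline = check.feed(pipeline, reporter)

seen = list(pipeline)  # draining the last generator drives the whole chain
print(reporter.reports)
```

If one of the checks forgot to yield a package, every check after it in
the chain would silently never see that package.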

Scope
-----

An extra feature available to the feeders is their "scope" attribute.
This is somewhat similar to their feed_type. The difference is roughly
that feed_type must match *exactly*, while scope indicates a *minimum*
requirement. This is mainly used by the transforms, but for certain
checks it is also useful to get "fed" single versions per iteration,
but only if the check is run on the entire repository. Using a
repository feed would have the same effect of only running your check
if the entire repository is being checked, but it requires building a
huge sequence containing all packages in memory before your check
runs. So if you can operate on fewer packages at a time, use a
"smaller" feed and scope.


Checker discovery
-----------------

Checkers are picked up through pkgcore's plugin mechanism. For
writing a simple custom checker, the easiest thing to do is to put the
entire thing in a single file, including the pkgcore_plugins
registration dictionary. See core_checks.py for what such a dictionary
looks like. Then put the file in a directory called
pkgcore_checks/plugins/ on your PYTHONPATH. So if you have a
system-wide pkgcore-checks installation, you can put your own plugin in
~/lib/python/pkgcore_checks/plugins/mycheck.py and run
PYTHONPATH=$HOME/lib/python pcheck ... to use the check (without
having to install a local copy of pkgcore-checks and put the check
inside it).

For those who need more feed types
==================================

Transforms
----------

Usually the checks should run in a single "pipeline": looping over the
packages once uses various caches (not just inside pkgcore-checks but
also the OS disk cache) more efficiently. This is accomplished by
applying "transforms" to the package iterator, which change the feed
type. They are very similar to checks, but they do not yield the same
thing they receive. Most (currently all) of them are defined in
feeds.py.
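
To give an idea of the shape of a transform, here is a sketch of one
that turns a versioned feed into a package feed. It is not the code
from feeds.py: versions are represented as (package name, version)
tuples, the function name is made up, and it assumes all versions of a
package arrive consecutively.

```python
from itertools import groupby


def versions_to_packages(versions):
    """Hypothetical transform: group consecutive versions per package."""
    for _, group in groupby(versions, key=lambda version: version[0]):
        # Unlike a check, a transform yields something new: one sequence
        # of versions per package instead of the single versions it got.
        yield tuple(group)


versioned = [
    ("dev-util/foo", "1.0"),
    ("dev-util/foo", "1.1"),
    ("sys-apps/bar", "2.0"),
]
packages = list(versions_to_packages(versioned))
print(packages)
```

Only one package worth of versions is held in memory at a time, which
is what keeps this usable inside a single pipeline.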

As you can see in feeds.py, the way they declare their input is a bit
different from a check. Because some simple operations like "yield the
first thing of every value handed in" can be used for more than one
"transformation", they can have more than one source and target feed
type. For each of these they can set their required scope and an
integer indicating the "cost" of the operation. The costs are currently
set to mostly random values, but the idea is that they will allow the
plugger to do a better job once the number of feed types, sources and
transforms grows.

There is no way to change the scope. The scope is assumed to be
constant for the entire pipeline.

Performance pitfalls
--------------------

As indicated earlier, the checks should run in a single pipeline. This
pipeline is really a *line*: it is not possible to "fork" the iterator
without using potentially unlimited temporary storage. This is a
deficiency of the way iterators are used here. It makes it very
important that there are transforms available from all possible feeds
*and* back.

An example: suppose the only available source generates a
versioned_feed (single package objects), there are transforms from
that to an ebuild_feed producing ebuild source lines and to a
package_feed producing sequences of package objects for all versions
in a package, and checks of all those feed types are active. Then the
entire repository will be looped over twice: once for the
versioned_feed checkers and either the package_feed or ebuild_feed
checkers, and once for the remaining feed type. To avoid the second
loop, register a transform back from package_feed and ebuild_feed to
versioned_feed.
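
The effect of such a "back" transform can be sketched like this. None
of these names come from pkgcore-checks: the source, the three
transforms and the check factory are simplified stand-ins, versions are
(name, version) tuples and the "ebuild lines" are faked.

```python
from itertools import groupby


class CountingSource:
    """Stand-in source that counts how often it is looped over."""

    def __init__(self, versions):
        self.versions = versions
        self.passes = 0

    def __iter__(self):
        self.passes += 1
        return iter(self.versions)


def to_package_feed(versions):
    for _, group in groupby(versions, key=lambda version: version[0]):
        yield tuple(group)


def back_to_versioned_feed(packages):
    for package in packages:
        for version in package:
            yield version


def to_ebuild_feed(versions):
    # Fake "ebuild source lines"; a real transform would read the ebuild.
    for name, version in versions:
        yield (name, version, ["# %s-%s" % (name, version)])


def counting_check(counts, label):
    """Build a check-like generator that passes everything on."""
    def feed(items):
        for item in items:
            yield item
            counts[label] = counts.get(label, 0) + 1
    return feed


source = CountingSource([("foo", "1.0"), ("foo", "1.1"), ("bar", "2.0")])
counts = {}

# One line: versioned -> package -> back to versioned -> ebuild.
pipeline = counting_check(counts, "versioned")(iter(source))
pipeline = counting_check(counts, "package")(to_package_feed(pipeline))
pipeline = to_ebuild_feed(back_to_versioned_feed(pipeline))
pipeline = counting_check(counts, "ebuild")(pipeline)
for _ in pipeline:
    pass

print(source.passes, counts)
```

Without back_to_versioned_feed there would be no way to get from the
package feed to the ebuild feed, and the source would have to be
iterated a second time for the remaining checks.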

Sources
-------

Currently there is only one source, defined in feeds.py.

Registration
------------

Transforms are discovered the same way as checks are: through the
pkgcore plugin system. See core_checks.py. The single available feed is
currently hardcoded. This will probably change in the future, but
exactly how remains to be seen.

For pkgcore-checks internals hackers
====================================

Commandline frontend
--------------------

The frontend code uses pkgcore's commandline utilities module and
lives in pcheck.py. It is pretty straightforward, although how control
flows through this module is not obvious without knowing how pkgcore's
commandline utils are used:

- pkgcore's commandline glue instantiates the OptionParser
- it pulls up all available checks and transforms through the plugin
  system
- it grabs all addon dependencies too
- it gives them a chance to mangle the parser
- the commandline glue parses options, triggering various optparse
  callbacks (options with a callback action and check_values, which
  calls check_values on all addon classes)
- if option parsing succeeded, the commandline glue calls main
- main instantiates all active addons and sources
- the autoplugger builds one or more pipelines
- main runs the pipelines

Autoplugger
-----------

The autoplugger gets handed a bunch of "sink", transform and source
instances and builds pipelines from them. It is a hack that relies on
a fair amount of brute force to do its job, but so far it has been
sufficient. It is still a moving target, so its design (if it has one)
is not documented here. Use the source, and do not forget about the
tests (there are not as many as there should be, but there are a bunch,
and running the tests with debug mode forced (hacked) on should give
some idea of what's what).