Revision marienz%2540gentoo.org-20070218170719-tdrfk0mqanfsemxa, 15.8 KB (checked in by Marien Zwart <marienz@…>, 23 months ago)

Fix rst formatting.

=====================================
 Config use and implementation notes
=====================================

Using the manager
=================

Normal use
----------

To get at the user's configuration::

 from pkgcore.config import load_config
 config = load_config()
 main_repo = config.get_default('repo')
 spork_repo = config.repo['spork']

Usually this is everything you need to know about the manager. Some
things to be aware of:

- Some of the managed sources of configuration data may be slow, so
  accessing a source is delayed for as long as possible. Some
  operations require accessing all sources, though, and should
  therefore be avoided. The easiest one to trigger is
  config.repo.keys() or the equivalent list(config.sections('repo')).
  This has to get the "class" setting for every available config
  section, which might be slow.
- For the same reason the manager does not know what type names exist
  (there is no hardcoded list of them, so the only way to get that
  information would be loading all config sections). This is why you
  can get this::

   >>> load_config().section('repo') # typo, should be "sections"
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: '_ConfigMapping' object is not callable

  This constructed a dictlike object for accessing all config sections
  of the type "section", then tried to call it.

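To illustrate why looking up one section is cheap while listing all
sections of a type is not, here is a minimal sketch. It does not use
pkgcore: LazySections, load and names_of_type are made-up names, and the
"class" value is simplified to a plain type name instead of a callable.

```python
class LazySections:
    """Toy stand-in for a lazily read config source (hypothetical)."""

    def __init__(self, loaders):
        # Maps a section name to a function returning its settings.
        # Calling that function stands in for the slow part.
        self._loaders = loaders
        self.loaded = []  # records which sections were actually read

    def load(self, name):
        self.loaded.append(name)
        return self._loaders[name]()

    def get(self, name, typename):
        """Load only the requested section and check its type."""
        section = self.load(name)
        if section['class'] != typename:
            raise KeyError(name)
        return section

    def names_of_type(self, typename):
        """Listing by type must read "class" from every section."""
        return [name for name in self._loaders
                if self.load(name)['class'] == typename]


source = LazySections({
    'spork': lambda: {'class': 'repo'},
    'foon': lambda: {'class': 'cache'},
    'other': lambda: {'class': 'repo'},
})

source.get('spork', 'repo')
touched_by_get = list(source.loaded)

source.loaded.clear()
repo_names = source.names_of_type('repo')
touched_by_listing = list(source.loaded)
```

In this toy model the single lookup reads one section, while the listing
reads all three, including the one that turns out not to be a repo.
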
Testcase use
------------

For testing of high-level scripts it can be useful to construct a
config manager containing hardcoded values::

 from pkgcore.config import basics, central

 # my_repo is whatever configurable callable the test wants to use.
 config = central.ConfigManager([{
     'repo': basics.HardCodedConfigSection({'class': my_repo,
                                            'data': ['1', '2']}),
     'cont': basics.ConfigSectionFromStringDict({'class': 'pkgcore.my.cont',
                                                 'ref': 'repo'}),
     }])

This builds a manager with two sections: "repo", whose values are
given as Python objects, and "cont", whose values are given as
strings. Be careful not to use the same ConfigSection object in more
than one place: caching will not behave the way you want. See
`Adding a config source`_ for details.

Adding a configurable
=====================

You often do not really *have* to do anything to make something a
valid "class" value, but it is clearer and it is necessary in certain
cases.

Adding a class
--------------

To make a class available, do this::

 from pkgcore.config import ConfigHint, errors

 class MyRepo(object):

     # The keys name arguments of __init__; the values are their types.
     pkgcore_config_type = ConfigHint({'cache': 'section_ref'},
                                      typename='repo')

     def __init__(self, cache):
         try:
             self.initialize(cache)
         except SomeRandomException:
             raise errors.InstantiationError('eep!')

The first ConfigHint arg tells the config system what kind of
arguments you take. Without it, the system assumes arguments with no
default are strings and guesses the types of the other args from the
type of their default value. So if you have no default values or they
are just None you should tell the system about your args.

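The guessing described above can be sketched with the standard inspect
module. This is only an illustration of the idea, not the code pkgcore
uses: guess_arg_types is a made-up helper, and the type names 'str' and
'bool' are assumptions chosen for the example.

```python
import inspect


def guess_arg_types(func):
    """Guess a config type name for each argument of func (toy rules)."""
    type_names = {str: 'str', bool: 'bool', list: 'list', tuple: 'list'}
    guessed = {}
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            # No default value: assume the argument is a string.
            guessed[name] = 'str'
        elif type(param.default) in type_names:
            # Guess from the type of the default value.
            guessed[name] = type_names[type(param.default)]
        else:
            # A default such as None says nothing about the type.
            guessed[name] = None
    return guessed


def my_repo(location, eclasses=(), verbose=False, cache=None):
    return location, eclasses, verbose, cache


guessed = guess_arg_types(my_repo)
```

The "cache" argument ends up without a usable guess, which is the
situation where an explicit hint is needed.
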
The second one tells it you fulfill the repo "protocol", meaning your
instances will show up in load_config().repo.

ConfigHint takes some more arguments; see the API docs for details.

Adding a callable
-----------------

To make a callable available you can do this::

 from pkgcore.config import configurable, errors

 @configurable({'cache': 'section_ref'}, typename='repo')
 def my_repo(cache):
     # do stuff
     pass

configurable is just a convenience function that applies a ConfigHint.

Exception handling
------------------

If you raise an exception when the config system calls you, it will
catch the exception and wrap it in an InstantiationError. This is good
for calling code, since catching and printing those provides the user
with a readable description of what happened. It is less good for
developers, since raising a new exception kills the traceback printed
in debug mode: you will have a traceback that "ends" in the config
code handling instantiation.

You can improve this by raising an InstantiationError yourself. If you
do this the config system can add the extra information needed for a
user-friendly error message to it without raising a new exception,
meaning debug mode will give a traceback leading right back to your
code raising the InstantiationError.

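The two cases can be sketched as follows. Everything here is a
stand-in: the InstantiationError class, its "section" attribute and the
instantiate helper are invented for the example and are not pkgcore's
real error handling. Note also that Python 3 chains exceptions, so the
original error is less easily lost there than this text suggests.

```python
class InstantiationError(Exception):
    """Toy stand-in for the config system's error type."""

    def __init__(self, message):
        super().__init__(message)
        self.section = None  # filled in by the config code below


def instantiate(section_name, target):
    """Call target, tagging any failure with the section name."""
    try:
        return target()
    except InstantiationError as error:
        # Already the right type: annotate and re-raise the same object.
        error.section = section_name
        raise
    except Exception as error:
        # Anything else gets wrapped in a brand new exception.
        wrapped = InstantiationError('%s: %s' % (type(error).__name__, error))
        wrapped.section = section_name
        raise wrapped


def careless():
    raise ValueError('eep!')


original = InstantiationError('eep!')


def careful():
    raise original


try:
    instantiate('spork', careless)
except InstantiationError as error:
    wrapped_error = error

try:
    instantiate('spork', careful)
except InstantiationError as error:
    passed_error = error
```

In the second case the caller sees the very exception object that was
raised in careful(), with the section name added to it.
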
Adding a config source
======================

Config sources are pretty straightforward: they are mappings from a
section name to an instance of a ConfigSection subclass. The only
tricky thing is the combination of section references and caching. The
general rule is "do not expose the same ConfigSection in more than one
way". If you do, it will be collapsed and instantiated once for every
way it is exposed, which is usually not what you want. An example::

 from pkgcore.config import basics, configurable

 def example():
     return object()

 @configurable({'ref': 'section_ref'})
 def nested(ref):
     return ref

 multi = basics.HardCodedConfigSection({'class': example})

 myconf = {
     'multi': multi,
     # Bad: hands over the same ConfigSection object a second time.
     'bad': basics.HardCodedConfigSection({'class': nested, 'ref': multi}),
     # Good: refers to the section by its name.
     'good': basics.ConfigSectionFromStringDict({'class': 'nested',
                                                 'ref': 'multi'}),
     }

If you feed this to the ConfigManager and instantiate everything,
"multi" and "good" will be identical but "bad" will be a different
object. For an explanation of why this happens see the implementation
notes in the next section.

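The difference between "good" and "bad" can be reproduced with a toy
model of caching by name. This sketch does not use pkgcore: ToyManager
and ToyCollapsed are made-up classes, sections are plain dicts, a string
value stands for a named reference and a dict value stands for a section
object handed over directly.

```python
class ToyCollapsed:
    def __init__(self, manager, section):
        self.manager = manager
        self.section = section
        self._instance = None

    def instantiate(self):
        # Instantiate once and remember the result.
        if self._instance is None:
            kwargs = {}
            for key, value in self.section.items():
                if key == 'class':
                    continue
                if isinstance(value, str):
                    # Named reference: goes through the cache.
                    ref = self.manager.collapse_named_section(value)
                else:
                    # Section object: collapsed again, cache bypassed.
                    ref = self.manager.collapse_section(value)
                kwargs[key] = ref.instantiate()
            self._instance = self.section['class'](**kwargs)
        return self._instance


class ToyManager:
    def __init__(self, sections):
        self.sections = sections
        self._by_name = {}

    def collapse_named_section(self, name):
        if name not in self._by_name:
            self._by_name[name] = self.collapse_section(self.sections[name])
        return self._by_name[name]

    def collapse_section(self, section):
        return ToyCollapsed(self, section)


def example():
    return object()


def nested(ref):
    return ref


multi = {'class': example}
manager = ToyManager({
    'multi': multi,
    'bad': {'class': nested, 'ref': multi},
    'good': {'class': nested, 'ref': 'multi'},
})

multi_obj = manager.collapse_named_section('multi').instantiate()
good_obj = manager.collapse_named_section('good').instantiate()
bad_obj = manager.collapse_named_section('bad').instantiate()
```

Here good_obj is the same object as multi_obj, while bad_obj is a
second, separately created object.
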
You trigger a similar problem if you create a custom ConfigSection
subclass that bypasses central's collapse_named_section for named
section refs. If you somehow get at the referenced ConfigSection and
hand it to collapse_section you will most likely circumvent caching.
Only use collapse_section for unnamed sections.

ConfigManager tries not to extract more things from this mapping than
it has to. Specifically, it will not call __getitem__ before it needs
to instantiate the section or needs to know its type. However, it
*will* iterate over the keys (section names) immediately to find
autoloads. If this is a problem (getting those names is slow) then
make sure the manager knows your config is "remote".

Implementation notes
====================

This code has evolved quite a bit over time. The current code/design
tries among other things to:

- Allow sections to contain both named and nameless/inline references
  to other sections.
- Allow serialization of the loaded config.
- Not do unnecessary work (if possible do not recollapse configs, and
  definitely do not trigger unnecessary imports, access configs
  unnecessarily or reinstantiate configs).
- Provide both end-user error messages that are complete enough to
  track down a problem in a complex nested config and tracebacks that
  reach back to actual buggy code for developers.

Overview from load_config() to instantiated repo
------------------------------------------------

When you call load_config() it looks up what config files are
available (/etc/pkgcore.conf, ~/.pkgcore.conf, /etc/make.conf) and
loads them. This produces a dict mapping section names to
ConfigSection instances. For the ini-format pkgcore.conf files this is
straightforward; for make.conf this is a lot of work done in
pkgcore.config.portage_conf. I'm not going to describe that module
here, read the source for details.

The ConfigSections have a pretty straightforward api: they work like
dicts but get passed a string describing what "type" the value should
be and a central.ConfigManager instance for reasons described later.
Passing in this "type" string when getting the value is necessary
because the way things like lists of strings are stored depends on the
format of the configuration file, but the parser does not have enough
information to know it should parse as a list instead of a string. For
example, an ini-format pkgcore.conf could contain::

  [my-overlay-cache]
  class=pkgcore.cache.flat_hash.database
  auxdbkeys=DEPEND RDEPEND

We want to turn that auxdbkeys value into a list of strings in the ini
file parser code instead of in the central.ConfigManager or even
higher up, because more exotic config sections may want to store this
in a different way (perhaps as a comma-separated list, or even as
"<el>DEPEND</el><el>RDEPEND</el>"). But there is obviously not enough
information in the ini file for the parser to know this is meant as a
list instead of a string with a space in it.

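A minimal sketch of this division of labour: the caller supplies the
type, and each section class decides how its own storage format maps to
that type. ToyIniSection and ToyCommaSection are invented for the
example, and the argument order of get_value (central, name, type) is an
assumption based on the calls shown later in these notes.

```python
class ToyIniSection:
    """Stores everything as strings; lists are whitespace separated."""

    def __init__(self, raw):
        self.raw = raw

    def get_value(self, central, name, arg_type):
        value = self.raw[name]
        if arg_type == 'list':
            return value.split()
        if arg_type == 'str':
            return value
        raise ValueError('unsupported type %r' % (arg_type,))


class ToyCommaSection(ToyIniSection):
    """A more exotic format that stores lists comma separated."""

    def get_value(self, central, name, arg_type):
        if arg_type == 'list':
            return [item.strip() for item in self.raw[name].split(',')]
        return super().get_value(central, name, arg_type)


ini = ToyIniSection({'auxdbkeys': 'DEPEND RDEPEND'})
comma = ToyCommaSection({'auxdbkeys': 'DEPEND, RDEPEND'})

# The central argument is unused in this toy, so None is passed.
as_list = ini.get_value(None, 'auxdbkeys', 'list')
as_string = ini.get_value(None, 'auxdbkeys', 'str')
comma_list = comma.get_value(None, 'auxdbkeys', 'list')
```

The same stored text comes back as a list or as a single string
depending only on the type the caller asks for.
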
central.ConfigManager gets instantiated with one or more of those
dicts mapping section names to ConfigSections. They're split up into
normal and "remote" configs which I'll describe later; let's assume
they're all "remote" for now. In that case no work is done when the
ConfigManager is instantiated.

Getting an actual configured object out of the ConfigManager is split
in two phases. First the involved config sections are "collapsed":
inherits are processed, values are converted to the right type,
presence of required arguments is checked, etc. This covers everything
short of actually instantiating the target class and any section
references it needs. The result of this work is bundled in a
CollapsedConfig instance. Actual instantiation is handled by the
CollapsedConfig instance.

The ConfigManager manages CollapsedConfig instances. It creates new
ones if required and makes sure that if a cached instance is available
it is used.

For the remainder of the example let's assume our config looks like
this::

  [spork]
  inherit=cache
  auxdbkeys=DEPEND RDEPEND

  [cache]
  class=pkgcore.cache.flat_hash.database

Running config.repo['spork'] runs
config.collapse_named_section('spork'). This first checks if this
section was already collapsed and returns the CollapsedConfig if it is
available. If it is not in the cache it looks up the ConfigSection
with that name in the dicts handed to the ConfigManager on
instantiation and calls collapse_section on it.

collapse_section first recursively finds any inherited sections (just
the "cache" section in this case). It then grabs the 'class' setting
(which is always of type 'callable'). In this case that's
"pkgcore.cache.flat_hash.database", which the ConfigSection imports
and returns. This is then wrapped in a config.basics.ConfigType. A
ConfigType contains the information necessary to validate arguments
passed to the callable. It uses the magic pkgcore_config_type
attribute if the callable has it and introspection for everything
else. In this case
pkgcore.cache.flat_hash.database.pkgcore_config_type is a ConfigHint
stating the "auxdbkeys" argument is of type "list".

Now that collapse_section has a ConfigType it uses it to retrieve the
arguments from the ConfigSections and passes the ConfigType and
arguments to CollapsedConfig's __init__. Then it returns the
CollapsedConfig instance to collapse_named_section.
collapse_named_section caches it and returns it.

Now we're back in the lookup triggered by config.repo['spork']. This
checks if the ConfigType on the CollapsedConfig is actually 'repo',
and returns the result of calling instantiate() on the
CollapsedConfig if this matches.

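The walk-through above can be condensed into a toy model of the two
phases. None of this is pkgcore code: ToyConfigManager and
ToyCollapsedConfig are simplified stand-ins, the "class" value is a
callable instead of a dotted name to import, the toy_arg_types attribute
plays the role of the ConfigHint, and the type check against 'repo' is
left out.

```python
class ToyCollapsedConfig:
    """Result of collapsing: a callable plus converted arguments."""

    def __init__(self, target, kwargs):
        self.target = target
        self.kwargs = kwargs
        self._instance = None

    def instantiate(self):
        # Second phase: build the object once and remember it.
        if self._instance is None:
            self._instance = self.target(**self.kwargs)
        return self._instance


class ToyConfigManager:
    def __init__(self, sections):
        self.sections = sections
        self._collapsed = {}

    def collapse_named_section(self, name):
        if name not in self._collapsed:
            self._collapsed[name] = self.collapse_section(self.sections[name])
        return self._collapsed[name]

    def collapse_section(self, section):
        # First phase: follow "inherit", then convert values by type.
        chain = [section]
        while 'inherit' in chain[-1]:
            chain.append(self.sections[chain[-1]['inherit']])
        merged = {}
        for part in reversed(chain):  # parents first, children override
            merged.update(part)
        merged.pop('inherit', None)
        target = merged.pop('class')
        kwargs = {}
        for key, raw in merged.items():
            if target.toy_arg_types.get(key) == 'list':
                kwargs[key] = raw.split()
            else:
                kwargs[key] = raw
        return ToyCollapsedConfig(target, kwargs)


def database(auxdbkeys=()):
    return {'auxdbkeys': auxdbkeys}


database.toy_arg_types = {'auxdbkeys': 'list'}

manager = ToyConfigManager({
    'spork': {'inherit': 'cache', 'auxdbkeys': 'DEPEND RDEPEND'},
    'cache': {'class': database},
})
collapsed = manager.collapse_named_section('spork')
spork = collapsed.instantiate()
```

Asking for "spork" a second time returns the cached ToyCollapsedConfig,
and instantiating that again returns the object built the first time.
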
Lazy section references
-----------------------

The main reason the above is so complicated is to support various
kinds of references to other sections. Example::

  [spork]
  class=pkgcore.Spork
  ref=foon

  [foon]
  class=pkgcore.Foon

Let's say pkgcore.Spork has a ConfigHint stating the type of the "ref"
argument is "lazy_ref:foon" (lazy reference to a foon) and its typename is
"repo", and pkgcore.Foon has a ConfigHint stating its typename is
"foon". A "lazy reference" is an instance of basics.LazySectionRef,
which is an object containing just enough information to produce a
CollapsedConfig instance. This is not the most common kind of
reference, but it is simpler from the config point of view so I'm
describing this one first.

When collapse_section runs on the "spork" section it calls
section.get_value(self, 'ref:repo', 'section_ref'). "lazy_ref" in the
type hint is converted to just "ref" here because the ConfigSections
do not have to distinguish between lazy and "normal" references.
Because this particular ConfigSection only supports named
references it returns a LazyNamedSectionRef(central, 'ref:repo',
'foon'). This just gets handed to Spork's __init__. If the Spork
decides to call instantiate() on the LazyNamedSectionRef, that calls
central.collapse_named_section('foon'), checks if the result is of
type foon, instantiates it and returns it.

The same thing using a dhcp-style config::

  spork {
      class pkgcore.Spork;
      ref {
          class pkgcore.Foon;
      };
  }

In this format the reference is an inline unnamed section. When
get_value(central, 'ref:repo', 'foon') is called it returns a
LazyUnnamedSectionRef(central, 'ref:repo', section) where section is a
ConfigSection instance for the nested section (knowing just that
"class" is "pkgcore.Foon" in this case). This is handed to
Spork.__init__. If Spork calls instantiate() on it, that calls
central.collapse_section(self.section) and does the same type checking
and instantiating LazyNamedSectionRef did.

Notice neither Spork nor ConfigManager cares if the reference is inline
or named. get_value just has to return a LazySectionRef instance
(LazyUnnamedSectionRef and LazyNamedSectionRef are subclasses of
this). How this actually gets a referenced config section is up to the
ConfigSection whose get_value gets called.

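The relationship between the two kinds of lazy reference can be
sketched like this. The Toy* classes below only mimic the behaviour
described above and are not the real basics.LazySectionRef classes; the
constructor arguments, the toy_typename attribute and the collapse_count
counter are all assumptions made for the example.

```python
class ToyCollapsed:
    def __init__(self, target):
        self.target = target
        self.typename = target.toy_typename
        self._instance = None

    def instantiate(self):
        if self._instance is None:
            self._instance = self.target()
        return self._instance


class ToyCentral:
    def __init__(self, sections):
        self.sections = sections
        self._by_name = {}
        self.collapse_count = 0

    def collapse_named_section(self, name):
        if name not in self._by_name:
            self._by_name[name] = self.collapse_section(self.sections[name])
        return self._by_name[name]

    def collapse_section(self, section):
        self.collapse_count += 1
        return ToyCollapsed(section['class'])


class ToyLazySectionRef:
    def __init__(self, central, typename):
        self.central = central
        self.typename = typename

    def collapse(self):
        raise NotImplementedError

    def instantiate(self):
        # Nothing is collapsed until somebody asks for the object.
        collapsed = self.collapse()
        if collapsed.typename != self.typename:
            raise TypeError('expected %r, got %r'
                            % (self.typename, collapsed.typename))
        return collapsed.instantiate()


class ToyLazyNamedSectionRef(ToyLazySectionRef):
    def __init__(self, central, typename, name):
        super().__init__(central, typename)
        self.name = name

    def collapse(self):
        return self.central.collapse_named_section(self.name)


class ToyLazyUnnamedSectionRef(ToyLazySectionRef):
    def __init__(self, central, typename, section):
        super().__init__(central, typename)
        self.section = section

    def collapse(self):
        return self.central.collapse_section(self.section)


class Foon:
    toy_typename = 'foon'


central = ToyCentral({'foon': {'class': Foon}})
named = ToyLazyNamedSectionRef(central, 'foon', 'foon')
unnamed = ToyLazyUnnamedSectionRef(central, 'foon', {'class': Foon})
collapsed_before_use = central.collapse_count

named_foon = named.instantiate()
unnamed_foon = unnamed.instantiate()
```

Code holding either reference only ever calls instantiate(); whether a
name lookup or an inline section sits behind it is hidden in collapse().
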
Normal section references
-------------------------

If Spork's ConfigHint defines the type of its "ref" argument as
"ref:foon" instead of "lazy_ref:foon" it gets handed an actual Foon
instance instead of a LazySectionRef to one. This is built on top of
the lazy reference code. For the ConfigSections nothing changes (the
same get_value call is made). But the ConfigManager now immediately
calls collapse() on the LazySectionRef, retrieving a CollapsedConfig
instance (for "foon"). This is handed to the CollapsedConfig for
"spork", and when this one is instantiated the referenced
CollapsedConfig is also instantiated.

Miscellaneous details
---------------------

The support for nameless sections means neither ConfigSection nor
CollapsedConfig has a name attribute. This makes the error handling
code a bit tricky as it has to tag in the name at various points, but
this works better than enforcing names where it does not make sense
(which would mean lots of unnecessary duplication of names when
dealing with dicts of HardCoded/StringBasedConfigSections).

The support for serialization of the loaded config means section_refs
cannot be instantiated straight away. The object used for
serialization is the CollapsedConfig, which gives you both the actual
values and the type they have. If the CollapsedConfig contained
arbitrary instantiated objects, serializing them would be impossible.
So it contains nested CollapsedConfigs instead.

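As a rough illustration of why nesting collapsed configs keeps things
serializable, consider the following sketch. It is not pkgcore's
serialization code: ToyCollapsedConfig, its dump method and the choice
of JSON as output format are all assumptions made for the example.

```python
import json


class ToyCollapsedConfig:
    def __init__(self, class_path, values):
        self.class_path = class_path  # dotted name of the target class
        self.values = values          # name -> (type name, value)

    def dump(self):
        """Turn this config and any nested ones into plain data."""
        dumped = {'class': self.class_path}
        for name, (type_name, value) in self.values.items():
            if isinstance(value, ToyCollapsedConfig):
                # A section reference: recurse instead of instantiating.
                value = value.dump()
            dumped[name] = {'type': type_name, 'value': value}
        return dumped


foon = ToyCollapsedConfig('pkgcore.Foon', {})
spork = ToyCollapsedConfig('pkgcore.Spork', {
    'ref': ('ref:foon', foon),
    'auxdbkeys': ('list', ['DEPEND', 'RDEPEND']),
})
serialized = json.dumps(spork.dump(), sort_keys=True)
```

Had "ref" held an instantiated Foon object instead, there would be no
general way to write it out again.
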
Unnecessary work is avoided by caching in two places. The simple one
is CollapsedConfig caching its instantiated value. This is pretty
straightforward. The more subtle one is ConfigManager caching
CollapsedConfigs by name. It is obviously a good idea to cache these
(if we didn't we would have to cache the instantiated value in the
ConfigManager). An alternative would be caching them by their
ConfigSection. This has the minor disadvantage of keeping the
ConfigSection in memory, and the larger one that it may break caching
for weird config sources that generate ConfigSections on demand. The
downside of caching by name is we have to make sure nothing generates
a CollapsedConfig for a named section in a way other than
collapse_named_section (handing the ConfigSection to collapse_section
bypasses caching).

This means a ConfigSection cannot return a raw ConfigSection from a
section_ref get_value call. If it did, central would collapse that
ConfigSection itself, and if the reference was actually to a named
section caching would be bypassed.

The need for a section name starting with "autoload" is also there to
avoid unnecessary work. Without this we would have to figure out the
typename of every section. While we can do that without entirely
collapsing the config we cannot avoid importing the "class", which
means load_config() would import most of pkgcore. That should
definitely be avoided.
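
The benefit of selecting autoloads by name can be sketched with a
mapping that records when a section is actually fetched. RecordingSource
and autoload_names are hypothetical names for this illustration; the
real manager's bookkeeping is not shown here.

```python
from collections.abc import Mapping


class RecordingSource(Mapping):
    """Toy config source that notes every section actually fetched."""

    def __init__(self, data):
        self._data = data
        self.fetched = []

    def __getitem__(self, name):
        self.fetched.append(name)
        return self._data[name]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def autoload_names(source):
    # Only the section names are inspected; no section is fetched.
    return [name for name in source if name.startswith('autoload')]


source = RecordingSource({
    'autoload-extra': {'class': 'some.module.loader'},
    'spork': {'class': 'pkgcore.Spork'},
    'foon': {'class': 'pkgcore.Foon'},
})

names = autoload_names(source)
fetched_during_scan = list(source.fetched)

# Only the sections selected by name are touched afterwards.
autoloads = [source[name] for name in names]
```

Scanning the names fetches nothing, and afterwards only the one
autoload section is read, so the classes of "spork" and "foon" never
have to be imported just to find out what they are.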