|
Mondrian Components
Introduction
See OLAP and architecture.
Components
to be written...
Caching
The various subsystems of mondrian have different memory requirements. Some
of them require a fixed amount of memory to do their work, whereas others can
exploit extra memory to increase their performance. This is an overview of how
the various subsystems use memory.
Caching is a scheme whereby a
component uses extra memory when it is available in order to boost its
performance, and when times are hard, it releases memory with loss of
performance but with no loss of correctness. A cache therefore uses a varying
amount of memory: more when times are good, less when times are hard.
Garbage collection is carried out by
the Java VM to reclaim objects which are unreachable from 'live' objects. A
special construct called a soft reference allows objects to be
garbage-collected in hard times.
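As a rough illustration of how a soft reference behaves, here is a minimal sketch using the standard java.lang.ref.SoftReference class. The class SoftlyCached and its methods are hypothetical names invented for this example and are not part of mondrian; the simulateCollection() method merely stands in for the collector clearing the reference under memory pressure.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

/** Hypothetical helper, not a mondrian class: a value that is recomputed if collected. */
final class SoftlyCached<T> {
    private final Supplier<T> loader;                        // recomputes the value on demand
    private SoftReference<T> ref = new SoftReference<>(null);
    int loads = 0;                                           // how many times the loader ran

    SoftlyCached(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, recomputing it if the reference is empty or cleared. */
    synchronized T get() {
        T value = ref.get();               // null if never loaded or already cleared
        if (value == null) {
            value = loader.get();
            loads++;
            ref = new SoftReference<>(value);
        }
        return value;
    }

    /** Clears the reference by hand, standing in for the collector in hard times. */
    synchronized void simulateCollection() {
        ref.clear();
    }
}
```

The caller never sees a wrong answer: if the value has gone, it is simply computed again, which is the "loss of performance but no loss of correctness" property described above.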
The garbage collector is not very discriminating in what it chooses to throw
out, so mondrian has its own caching strategy. There are several caches in the
system (described below), but all of the objects in these caches are
registered in the singleton instance of
class mondrian.rolap.CachePool (currently there is just a single instance).
The cache pool doesn't actually store the objects, but handles all of the events
related to their life cycle in a cache. It weighs objects' cost (some function
involving their size in bytes and their usefulness, which is based upon how
recently they were used) and their benefit (the effort it would take to
re-compute them).
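The exact weighting function is not given here, so the following is only a sketch of the general idea, under assumptions. The classes ScoredEntry and EvictionChooser, the "tick" clock, and the formula (re-compute effort divided by size times age) are all invented for illustration; they are not the formula CachePool uses.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical cache entry carrying the inputs to a cost/benefit score. */
final class ScoredEntry {
    final String name;
    final long sizeBytes;          // part of the cost
    final double recomputeEffort;  // the benefit of keeping the entry
    final long lastUsedTick;       // recency, which also feeds the cost

    ScoredEntry(String name, long sizeBytes, double recomputeEffort, long lastUsedTick) {
        this.name = name;
        this.sizeBytes = sizeBytes;
        this.recomputeEffort = recomputeEffort;
        this.lastUsedTick = lastUsedTick;
    }

    /** Assumed formula: benefit per unit of cost; a low score means a good eviction candidate. */
    double score(long nowTick) {
        long age = Math.max(1, nowTick - lastUsedTick);
        double cost = (double) Math.max(1, sizeBytes) * age;
        return recomputeEffort / cost;
    }
}

final class EvictionChooser {
    /** Returns the entry with the lowest score, or empty if there are no entries. */
    static Optional<ScoredEntry> victim(List<ScoredEntry> entries, long nowTick) {
        return entries.stream()
            .min(Comparator.comparingDouble((ScoredEntry e) -> e.score(nowTick)));
    }
}
```

With two entries of equal size and equal re-compute effort, the one used longer ago gets the lower score and is chosen first.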
The cache pool is not infallible (in particular, it cannot adapt to conditions
where memory is in short supply), so it uses soft references, so that
the garbage collector can overrule its wisdom.
Cached objects must obey the following contract:
- They must implement
interface mondrian.rolap.CachePool.Cacheable, which includes methods to
measure their cost and benefit, to record each time they are used, and to tell
them to remove themselves from their cache.
- They must call
CachePool.register(Cacheable)
either in their constructor or, in any case, before they are made visible in
their cache.
- They must call
CachePool.unregister(Cacheable) when they are removed from their cache and
in their
finalize() method.
- They must be dispensable: if they disappear, their subsystem will continue
to work correctly, albeit slower. A subsystem can declare an object to be
temporarily indispensable by calling
CachePool.pin(Cacheable, Collection) and then unpin it a short time later.
- Their cache must reference them via soft references, so that they
are available for garbage collection.
- Thread safety. Their cache must be thread-safe.
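The contract above can be sketched with small stand-in types. Everything in this example is hypothetical: SketchCacheable, SketchPool and SketchCache are simplified substitutes invented for illustration, and their method names are assumptions, not the actual signatures of mondrian.rolap.CachePool or its Cacheable interface. Pinning is left out, and so is the finalize() call, since finalize() is deprecated in current Java.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the Cacheable contract; method names are assumptions. */
interface SketchCacheable {
    double cost();
    double benefit();
    void markUsed();
    void removeFromCache();
}

/** Hypothetical stand-in for the pool: it counts registrations but stores no objects. */
final class SketchPool {
    private int registeredCount;
    synchronized void register(SketchCacheable c) { registeredCount++; }
    synchronized void unregister(SketchCacheable c) { registeredCount--; }
    synchronized int registeredCount() { return registeredCount; }
}

/** A thread-safe cache that holds its entries only through soft references. */
final class SketchCache {
    private final SketchPool pool;
    private final Map<String, SoftReference<Entry>> entries = new HashMap<>();

    SketchCache(SketchPool pool) { this.pool = pool; }

    final class Entry implements SketchCacheable {
        final String key;
        final String value;
        int uses;
        Entry(String key, String value) { this.key = key; this.value = value; }
        public double cost() { return value.length(); }   // placeholder measure
        public double benefit() { return 1.0; }           // placeholder measure
        public void markUsed() { uses++; }
        public void removeFromCache() { SketchCache.this.remove(key); }
    }

    synchronized Entry put(String key, String value) {
        remove(key);                                // unregister any entry being replaced
        Entry e = new Entry(key, value);
        pool.register(e);                           // register before the entry is visible
        entries.put(key, new SoftReference<>(e));   // the cache keeps only a soft reference
        return e;
    }

    synchronized Entry get(String key) {
        SoftReference<Entry> ref = entries.get(key);
        Entry e = (ref == null) ? null : ref.get(); // null if absent or collected
        if (e != null) { e.markUsed(); }
        return e;
    }

    synchronized void remove(String key) {
        SoftReference<Entry> ref = entries.remove(key);
        Entry e = (ref == null) ? null : ref.get();
        if (e != null) { pool.unregister(e); }      // unregister on removal from the cache
    }
}
```

Because get() may return null at any time, callers have to be prepared to rebuild the entry, which is what makes the cached objects dispensable.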
If a cached object takes a significant time to initialize, it may not be
possible to construct it, register it, and initialize it within the same
synchronized section without unacceptably reducing concurrency. If this is the
case, you should use phased construction. First construct and register the
object, but mark it 'under construction'. Then release the lock on the CachePool
and the object's cache, and continue initializing the object. Other threads will
be able to see the object, and should be able to wait until the object is
constructed. The method
Segment.waitUntilLoaded() is an example of this.
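Phased construction might look roughly like the following sketch. The classes LoadableEntry and PhasedCache are hypothetical and are not the mondrian implementation; the method name waitUntilLoaded() is borrowed from Segment only to show the shape of the idea. Registration with the cache pool and handling of a failed load are omitted for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical entry that is visible to other threads before it is fully loaded. */
final class LoadableEntry {
    private boolean loaded;   // false means "under construction"
    private String data;

    synchronized void finishLoading(String data) {
        this.data = data;
        this.loaded = true;
        notifyAll();          // wake every thread blocked in waitUntilLoaded()
    }

    synchronized String waitUntilLoaded() {
        try {
            while (!loaded) {
                wait();       // releases this entry's monitor while waiting
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for load", e);
        }
        return data;
    }
}

final class PhasedCache {
    private final Map<String, LoadableEntry> entries = new HashMap<>();
    final AtomicInteger loads = new AtomicInteger();   // counts slow loads, for the demo

    String get(String key) {
        LoadableEntry entry;
        boolean mustLoad = false;
        synchronized (this) {              // phase 1: construct and publish under the lock
            entry = entries.get(key);
            if (entry == null) {
                entry = new LoadableEntry();
                entries.put(key, entry);
                mustLoad = true;
            }
        }
        if (mustLoad) {                    // phase 2: initialize with the cache lock released
            entry.finishLoading(slowLoad(key));
        }
        return entry.waitUntilLoaded();    // other threads block here until phase 2 ends
    }

    private String slowLoad(String key) {  // stands in for an expensive initialization
        loads.incrementAndGet();
        return "loaded:" + key;
    }
}
```

Only the short publish step holds the cache lock, so threads asking for other keys are not held up by a slow load.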
The following objects are cached.
1. Segment
A Segment (class
mondrian.rolap.agg.Segment) is a collection of cell values parameterized by
a measure, and a set of (column, value) pairs. An example of a segment is
(Unit sales, Gender = 'F', State in {'CA','OR'}, Marital Status =
anything)
All segments over the same set of columns belong to an Aggregation, in this
case
('Sales' Star, Gender, State, Marital Status)
Note that different measures (in the same Star) occupy the same Aggregation.
Aggregations belong to the AggregationManager, a singleton.
Segments are pinned during the evaluation of a single MDX query. The query
evaluates the expressions twice. On the first pass, it finds which cell values it
needs, pins the segments containing the ones which are already present (one
pin-count for each cell value used), and builds a cell request (class
mondrian.rolap.agg.CellRequest) for those which are not present. It executes
the cell request to bring the required cell values into the cache, again,
pinned. Then it evaluates the query a second time, knowing that all cell values
are available. Finally, it releases the pins.
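The two-pass scheme can be illustrated with a deliberately simplified sketch. TwoPassEvaluator is a hypothetical class, not mondrian code: cells are identified by plain strings, the "query" is just the sum of the requested cells, and load() fabricates a value from the key's length instead of running SQL. The pin counts are only tracked here; in a real cache they would stop the pinned values from being evicted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TwoPassEvaluator {
    private final Map<String, Double> cache = new HashMap<>();
    private final Map<String, Integer> pinCounts = new HashMap<>();
    int batchLoads = 0;   // number of batched cell requests executed

    double evaluate(List<String> cellKeys) {
        List<String> pinned = new ArrayList<>();
        Set<String> request = new LinkedHashSet<>();   // the batched "cell request"
        // Pass 1: pin what is already cached, collect what is missing.
        for (String key : cellKeys) {
            if (cache.containsKey(key)) {
                pin(key);
                pinned.add(key);
            } else {
                request.add(key);
            }
        }
        // Execute the request once; the loaded values are pinned as well.
        if (!request.isEmpty()) {
            batchLoads++;
            for (String key : request) {
                cache.put(key, load(key));
                pin(key);
                pinned.add(key);
            }
        }
        try {
            // Pass 2: every required cell value is now present.
            double total = 0;
            for (String key : cellKeys) {
                total += cache.get(key);
            }
            return total;
        } finally {
            for (String key : pinned) {   // release every pin taken above
                unpin(key);
            }
        }
    }

    int totalPins() {
        int total = 0;
        for (int n : pinCounts.values()) {
            total += n;
        }
        return total;
    }

    private void pin(String key) {
        pinCounts.merge(key, 1, Integer::sum);
    }

    private void unpin(String key) {
        pinCounts.computeIfPresent(key, (k, n) -> n == 1 ? null : n - 1);
    }

    private double load(String key) {   // placeholder for fetching a cell value
        return key.length();
    }
}
```

Collecting the misses first means the missing values are fetched in one batch rather than one at a time during evaluation.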
2. Member set
A member set (class
mondrian.rolap.SmartMemberReader.ChildrenList) is a set of children of a
particular member. It belongs to a member reader (class
mondrian.rolap.SmartMemberReader).
3. Schema
Schemas (class
mondrian.rolap.RolapSchema) are cached in
class
mondrian.rolap.RolapSchema.Pool, which is a singleton (todo: use soft
references). The cache key is the URL which the schema was loaded from.
4. Star schemas
Star schemas (class
mondrian.rolap.RolapStar) are stored in the static member
RolapStar.stars (todo: use soft references), and accessed via
RolapStar.getOrCreateStar(RolapSchema, MondrianDef.Relation).
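Both the schema pool and the star registry follow a "get or create by key" pattern. A minimal sketch of that pattern, using the standard ConcurrentHashMap.computeIfAbsent method, is shown below. KeyedRegistry is a hypothetical class invented for this example; it is not how RolapSchema.Pool or RolapStar.getOrCreateStar is implemented, and like the caches described above it holds hard references rather than soft ones.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical registry: at most one instance is created and kept per key. */
final class KeyedRegistry<K, V> {
    private final Map<K, V> instances = new ConcurrentHashMap<>();
    private final Function<K, V> factory;   // builds the value for a key not seen before

    KeyedRegistry(Function<K, V> factory) {
        this.factory = factory;
    }

    /** Returns the existing instance for the key, creating it on first request. */
    V getOrCreate(K key) {
        return instances.computeIfAbsent(key, factory);
    }

    int size() {
        return instances.size();
    }
}
```

For a schema pool, the key would be the URL the schema was loaded from, and the factory would parse the schema at that URL.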
Author: Julian Hyde; last modified August 2006.
Version: $Id: //open/mondrian-release/3.1/doc/components.html#2 $
Copyright (C) 2002-2006 Julian Hyde
|