|
Mondrian FAQs
- How do I use Mondrian in my application?
- Why doesn't Mondrian use a standard API?
- How does Mondrian's dialect of MDX differ from MSOLAP's?
- How can Mondrian be extended?
- Can Mondrian handle large datasets?
- How do I enable tracing?
- How do I enable logging?
- What is the syntax of a Mondrian connect string?
- Where is Mondrian going in the future?
- Where can I find out more? (Further reading)
- Mondrian is wonderful! How can I possibly thank you?
- Modeling
- Measures not in the fact table
- How can I define my fact table based on an arbitrary SQL statement?
- Why can't Mondrian find my tables?
- Build/install
- I get compilation errors? Why is this?
- Performance
- When I change the data in the RDBMS, the result doesn't change even if i refresh the browser. Why is this?
- Tuning the Aggregate function
1. How do I use Mondrian in my application?
There are several ways. If you have a fixed set of queries which you'd like
to display as HTML tables, use the tab library.
webapp/taglib.jsp is an example of this.
The JPivot project (http://jpivot.sourceforge.net)
is a JSP-based pivot table, and will allow you to dynamically explore a dataset
over the web. It replaces the prototype pivot table webapp/morph.jsp.
You could also build a pivot table in a client technology such as Swing.
2. Why doesn't Mondrian use a standard API?
Because there isn't one. MDX is a component of Microsoft's OLE DB for OLAP
standard which, as the name implies, only runs on Windows. Mondrian's API is
fairly similar in flavor to ADO MD (ActiveX Data Objects for Multidimensional),
a API which Microsoft built in order to make OLE DB for OLAP easier to use.
XML for Analysis is pretty much OLE DB for OLAP expressed in Web Services
rather than COM, and therefore seems to offer a platform-neutral standard for
OLAP, but take-up seems to be limited to vendors who supported OLE DB for OLAP
already.
The other OLAP vendors failed to reach consensus several years ago with the
OLAP Council API, then moved onto the JSR-069 ('JOLAP') specification. Mondrian
included a partial implementation of the JOLAP API for several years, but this
was removed in mondrian-2.3.
During 2006, Julian Hyde started work, in collaboration with some other
projects and companies, on a pragmatic open API for Java-based OLAP called
olap4j.
3. How does Mondrian's dialect of MDX differ from MSOLAP's?
See MDX language specification.
Not very much.
- The
StrToSet() and StrToTuple() functions take
an extra parameter.
- Parsing is case-sensitive.
- Pseudo-functions
Param() and ParamRef() allow
you to create parameterized MDX statements.
4. How can Mondrian be extended?
todo: User-defined functions
todo: Cell readers
todo: Member readers
5. Can Mondrian handle large datasets?
Yes, if your RDBMS can. We delegate the aggregation to the RDBMS, and if your
RDBMS happens to have materialized group by views created, your query will fly.
And the next time you run the same or a similar query, that will really fly,
because the results will be in the aggregation cache.
6. How do I enable tracing?
To enable tracing, set mondrian.trace.level to 1 in
mondrian.properties. You will see text and execution time of each SQL
statement, like this:
SqlMemberSource.getLevelMemberCount:
executing sql [select count(*) as `c0` from (select distinct `store`.`store_country`
as `c0` from `store` as `store`) as `foo`], 110 ms
SqlMemberSource.getMembers: executing sql [select distinct `store`.`store_sqft`
as `c0` from `store` as `store` order by `store`.`store_sqft`], 50 ms
Notes:
- If you are running mondrian from the command-line, or via Ant,
mondrian.properties should be in the current directory.
- If you are running in Tomcat,
mondrian.properties should be
in TOMCAT_HOME/bin. Changes will only take effect
when you re-start Tomcat. The output goes to the console from which you
started Tomcat.
7. How do I enable logging?
Mondrian uses the
Apache Log4j logger. To build, test, and run Mondrian requires a
log4j.jar file. A log4j.jar file is provided as part of the Mondrian
distribution.
Also provided is a log4j.properties file. Such a file is needed
when running Mondrian in standalone mode (such as when running the
Mondrian junit tests or the CmdRunner utility).
Generally, Mondrian is embedded in an application, such as a webserver,
which may have their
own log4j.properties file or some other mechanism for setting log4j properties.
In such cases, the user must use those for controlling Mondrian's logging.
Mondrian follows Apache's guidance on what type of information is logged
at what level:
-
FATAL: A very severe error event that will presumably
lead the application to abort.
-
ERROR: An error event that might still
allow the application to continue running.
-
WARN: A potentially harmful situation.
-
INFO: An informational message that highlight the
progress of the application at a coarse-grained level.
-
DEBUG: A fine-grained informational event that
is most useful to debug an application.
It is recommended for general use that the Mondrian log level be set to
WARN; arguably, its good to know when things are going South.
8. What is the syntax of a Mondrian connect string?
The syntax of the connect string is described in the Javadoc for the method
mondrian.olap.DriverManager.getConnection(String connectString, boolean fresh).
9. Where is Mondrian going in the future?
- Presentation layer (see
JPivot for more details).
- Complete implementation of MDX (not all of the functions implemented yet)
- Tuning
10. Where can I find out more?
MDX Solutions with Microsoft SQL Server Analysis Services by George
Spofford is the best book I have found on MDX. Despite the title, principles
it describes can be applied to any RDBMS.
OLAP Solutions: Building Multidimensional Information Systems
by Erik Thomsen is a great overview of multidimensional databases, but does
not deal with MDX.
The reference work on data warehousing is
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second
Edition), by Ralph Kimball, Margy Ross. It covers the business process
well, but the focus is more on star schemas and ROLAP than OLAP.
The
Microsoft Analysis Services online documentation has excellent online
documentation of MDX, including a
list of MDX functions.
11. Mondrian is wonderful! How can I possibly thank you?
We'd love to hear what you liked and didn't like about
it. If you can think of ways that Mondrian can be improved, roll up your sleeves
and help make it better. If you use Mondrian in your application, consider
sharing your work so that everyone can use it.
12. Modeling
12.1 Measures not stored in the fact table
I am trying to build a cube with measures from 2 different tables. I have tried a virtual cube, but it does not seem to work - it only relates measures and dimensions from the same table. Is there a way to specify that a measure is not coming from the fact table? Say using SQL select?
Virtual cubes sound like the right approach. The way to do it is to first create a dummy cube on your lookup table, with dimensions for as many columns as are applicable. (A classic example of this kind of cube is an 'ExchangeRate' cube, whose only dimensions are time and currency.)
Then create a virtual cube of the dummy cube and the real cube (onto your fact table).
Note that you will need to use shared dimensions for the cubes to join implicitly.
12.2 How can I define my fact table based on an arbitrary SQL statement?
Use the <View> element INSTEAD OF the <Table> element. You need to specify
the 'alias' attribute, which Mondrian uses as a table alias.
The XML 'CDATA' construct is useful in case there are strange characters in
your SQL, but isn't essential.
<View alias="DFACD_filtered">
<SQL dialect="generic">
<![CDATA[select * from DFACD where CSOC = '09']]>
</SQL>
</View>
12.3 Why can't Mondrian find my tables?
Consider this scenario. I have created some tables in Oracle, like this:
CREATE TABLE sales (
prodid INTEGER,
day INTEGER,
amount NUMBER);
and referenced it in my schema.xml like this:
<Cube name="Sales">
<Table name="sales"/>
...
<Measure name="Sales" column="amount" aggregator="sum"/>
<Measure name="Sales count" column="prodid" aggregator="count"/>
</Cube>
Now I start up Mondrian and get an error
ORA-00942: Table or view "sales" does not exist while executing the
SQL statement
SELECT "prodid", count(*) FROM "sales" GROUP BY "prodid". The query
looks valid, and the table exists, so why is Oracle giving an error?
The problem is that table and column names are case-sensitive. You told
Mondrian to look for a table called "sales", not "SALES" or "Sales".
Oracle's table and column names are case-sensitive too, provided that you
enclose them in double-quotes, like this:
CREATE TABLE "sales" (
"prodid" INTEGER,
"day" INTEGER,
"amount" NUMBER);
If you omit the double-quotes, Oracle automatically converts the identifiers
to upper-case, so the first CREATE TABLE command actually created a
table called "SALES". When the query gets run, Mondrian is looking for a table
called "sales" (because that's what you called it in your schema.xml), yet
Oracle only has a table called "SALES".
There are two possible solutions. The simplest is to change the objects to
upper-case in your schema.xml file:
<Cube name="Sales">
<Table name="SALES"/>
...
<Measure name="Sales" column="AMOUNT" aggregator="sum"/>
<Measure name="Sales count" column="PRODID" aggregator="count"/>
</Cube>
Alternatively, if you decide you would like your table and column names to be
in lower or mixed case (or even, for that matter, to contain spaces), then you
must double-quote object names when you issue CREATE TABLE
statements to Oracle.
13. Build/install
13.1 I get compilation errors? Why is this?
For example:
"SchemaTreeModel.java": Error #: 302 : cannot access class
MondrianDef.Schema; java.io.IOException: class not found: class
MondrianDef.Schema at line 29, column 14
You can't just compile the source code using your IDE; you must build using
ant, as described in the build instructions. This is
because several Java classes, such as mondrian.olap.MondrianDef (as
in this case), mondrian.olap.MondrianResource and
mondrian.olap.Parser are generated from other files. I recommend that you
do ant clean before trying to build again.
Another example:
"NamedObject.java": Error #: 704 : cannot access directory
javax\jmi\reflect at line 4, column 1
You don't have the correct JAR files (in this case, lib/jmi.jar)
on your classpath. Again, you should have followed the
build instructions. This problem often happens when people try to build
using an IDE. You must use ant for the first ever build, but you may be able to
setup your IDE to do incremental builds.
14. Performance
14.1 When I change the data in the RDBMS, the result doesn't change even if i refresh the browser. Why is this?
Mondrian uses a cache to improve performance. The first time you run a query,
Mondrian will execute various SQL statements to load the data (you can see these
statements by turning on tracing). The next time, it will use the information in
the cache.
Cache control is primitive right now. If the data in the RDBMS is modified,
Mondrian has no way to know, and does not refresh its cache. If you are using
the JPivot web ui and refresh the browser, that will simply regenerate the web
page, not flush the cache. The only way to refresh the cache is to call the
following piece of code, which flushes the entire contents:
mondrian.rolap.CachePool.instance().flush();
See caching design for more
information.
14.2 Tuning the Aggregate function
I am using an MDX query with a calculated "aggregate" member. It aggregates
the values between Node A and Node B. The dimension that it is aggregating
on is a Time dimension. This Time dimension has a granularity of one minute.
When executing this MDX query, the performance seems to be fairly bad.
Here is the query:
WITH MEMBER [Time].[AggregateValues] AS
'Aggregate([Time].[2004].[October].[1].[12].[10] :
[Time].[2004].[October].[20].[12].[10])'
SELECT [Measures].[Volume] ON ROWS,
NON EMPTY {[Service].[Name]}
WHERE ([Time].[AggregateValues])
Is this normal behavior? Is there any way I can speed this up?
Answer:
The performance is bad because you are pulling 19 days * 1440 minutes per day =
27360 cells from the database into memory per cell that you actually display.
Mondrian is a lot less efficient at crunching numbers than the database is, and
uses a lot of memory.
The best way to improve performance is to push as much of the processing to
the database as possible. If you were asking for a whole month, it would be
easy:
WITH MEMBER [Time].[AggregateValues]
AS 'Aggregate({[Time].[2004].[October]})'
SELECT [Measures].[Volume] ON ROWS,
NON EMPTY {[Service].[Name]}
WHERE ([Time].[AggregateValues])
But since you're working with time periods which are not aligned with the
dimensional structure, you'll have to chop up the interval:
WITH MEMBER [Time].[AggregateValues]
AS 'Aggregate({
[Time].[2004].[October].[1].[12].[10]
: [Time].[2004].[October].[1].[23].[59],
[Time].[2004].[October].[2]
: [Time].[2004].[October].[19],
[Time].[2004].[October].[20].[0].[00]
: [Time].[2004].[October].[20].[12].[10]})'
SELECT [Measures].[Volume] ON ROWS,
NON EMPTY {[Service].[Name]}
WHERE ([Time].[AggregateValues])
This will retrieve a much smaller number of cells from the database — 18
days + no more than 1440 minutes — and therefore do more of the heavy lifting
using SQL's GROUP BY operator. If you want to improve it still further,
introduce hourly aggregates.
Q. I saw the perforce files, but a I couldn't find where to
register and get new user, or the instructions that you have mentioned
above;
A. The project administrators (Julian) register you. I would
suggest that you start with guest level access and let's see if you need
update access later.
Q. Do you have some model for development environment (e.g.
eclipse 3.0 + ant 1.6 + jboss x.x + .....)?
A. Using Eclipse for Mondrian development works fine. There is an
Eclipse Perforce plug-in, too, but you can use the Perforce client
outside of Eclipse. Some people use Intellij (which is free for
open-source use).
As a test web-server, most people use Tomcat 5.0.
Q. Are all the updated documentation in the perforce server? How
could I get more materials, howtos, etc. to reduce my learn curve?
A. As with any open source project, the documentation is the
web site
(which is source-controlled in Perforce too), the forums
and mailing lists, the test suite and the code.
Q. How could I enroll myself into mondrian source forge project?
A. Sign up as a SourceForge user and subscribe to the
Mondrian mailing lists and forums. Also, there are a lot of Mondrian
related questions from the JPivot project - I suggest you subscribe to
JPivot too.
Author: Julian Hyde; last modified August 2006.
Version: $Id: //open/mondrian/doc/faq.html#23 $
(log)
Copyright (C) 2002-2007 Julian Hyde
|