For a tool that is known for giving great insight into a business environment, it is ironic that it lacks information on one of its own key performance indicators. SAP BW has been a very successful platform for BI over the last 20 years or so. Almost from day one people have asked the question ‘which areas use the most space on my database?’ This question has become increasingly important in recent years, as customers consider whether they are willing to pay a premium price for a HANA database.
The reporting abilities on database usage were first introduced five years ago as part of the technical content reporting in BW 7.3. There are still some flaws with the database space usage reporting:
- Setting up the technical content reporting is fiddly and needs looking after
- It takes an expert to interpret the results and turn them into easy-to-understand reporting
- It is incomplete, only showing the sizes of BW objects and not of any other objects of significant size which might be sitting in the same BW database (Basis tables or other).
The last point is the most important one. Below is an example from a production system (BWoH) which filled up much faster than anticipated. The blue bars are part of standard business content reporting. The yellow bar is missing from the standard reports. If you think it is important to see this, keep reading and you will find out how to get an instant, complete overview of database space usage with minimal effort.
Figure 1: BWoH filling up more quickly than expected? Make sure to keep an eye on all objects and tables.
Most organisations running BWoH do not have the right tools to monitor HANA database usage, manage the database size effectively and plan for future growth in a cost-effective way. In this blog I describe a first step towards better insight into database space usage. The solution is based on a script which you can run instantly, without having to configure, develop or customize anything. Running this script gives insight into the ‘as is’ situation. If you want to plan effectively for future growth, you will also need some means of getting insight into historic growth. In my next blog I will describe how you can build out the solution, with minimum development effort, into a full-blown database size monitoring application which allows for trend reporting.
The result of running this ad-hoc SQL should look something like this, and it will be available at HANA speed: within seconds and without any system development!
Figure 2: Output of running the ad-hoc script. From here it is a small step to clear insight and cool visualisations
This data can then be used to create some visualisations which provide great insight. Below are a few examples:
Figure 3a: Only 6 tables take up 55% of the total used database space
Figure 3b: Surprisingly, Cubes (in blue) don't take up much space. DSOs (orange) do, as does Master Data (grey).
Figure 3c: There seems to be something wrong with housekeeping on Basis tables
The solution (part 1): A SQL script
Just like any other database, HANA stores all the metadata about tables in its own dictionary tables. You can find out the size of, say, the active table of a DSO or the fact table of a cube if you know which dictionary tables to look at. You need some understanding of table partitioning, row and column stores, memory usage and disk space to interpret the results. Luckily SAP helps you here: HANA comes with extremely useful database views on top of the dictionary tables, which make it easier to get a complete overview of table sizes. Once you understand these views, you have half the job done.
The other half is understanding how BW objects relate to tables in the HANA database. BW objects use several tables and to understand how database space is used it is important to evaluate BW objects rather than individual tables.
The script in this blog combines the different dictionary views to get a complete view of all tables and then applies some further logic to put the tables in their BW context. Together, this results in a complete breakdown of database space usage by BW object.
Step 1: The dictionary views
The four views below are used to get an overview of memory usage, disk allocation and other storage parameters for each table in HANA. These are all fully documented on help.sap.com
(Technology Platform > SAP HANA Platform > SAP HANA Platform Core > SAP HANA SQL and System views reference > System Views Reference)
View Name | Description |
---|---|
TABLES | All available tables |
M_CS_TABLES | Runtime properties of column store tables |
M_RS_TABLES | Runtime properties of row store tables |
M_TABLE_PERSISTENCE_STATISTICS | Persistence (file) storage statistics for all tables |
Note: I am pretty sure that I don’t actually need to use ‘TABLES’ in my script and the reason it is still there is just because I don’t want to break my code (don’t fix it if it ain’t broken). Please feel free to share your cleaner version of the SQL in the comment section of this blog.
The information I use from these views is just the basic storage parameters: total memory size (main, delta and history), delta memory size, disk size and record count.
The SQL for this is quite straightforward. The only thing to bear in mind is that tables can be partitioned, so you have to use a ‘group by’ and aggregate the statistics.
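To illustrate why the aggregation matters, here is a minimal Python sketch with made-up numbers. It assumes each partition of a column store table shows up as its own row and sums those rows per table, which is the job the ‘group by’ does in the script. The table names, sizes and the function name `aggregate_by_table` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical per-partition rows: (table_name, memory_size_in_bytes, record_count).
# Names and figures are made up; a partitioned table contributes one row per partition.
partition_rows = [
    ("/BIC/AZSALES00", 400 * 1024 * 1024, 1_000_000),
    ("/BIC/AZSALES00", 600 * 1024 * 1024, 1_500_000),
    ("/BIC/FZCUBE01", 200 * 1024 * 1024, 300_000),
]


def aggregate_by_table(rows):
    """Sum size (in MB) and record count over all partitions of each table."""
    totals = defaultdict(lambda: {"size_mb": 0.0, "records": 0})
    for table_name, size_bytes, records in rows:
        totals[table_name]["size_mb"] += size_bytes / 1024 / 1024
        totals[table_name]["records"] += records
    return dict(totals)


table_totals = aggregate_by_table(partition_rows)
for name, stats in sorted(table_totals.items()):
    print(name, round(stats["size_mb"]), "MB,", stats["records"], "records")
```

Without the aggregation, the partitioned DSO in this example would appear as two separate lines and its real footprint would be easy to underestimate.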
Step 2: The ‘BW’ groupings
Somewhere deep down in BW there might be a table, or a number of tables, where the relationship between a table name and an object type (Cube, DSO, InfoObject, etc.) can be found. If such a table exists, I have not been able to find it. Instead, I use the well-documented naming convention BW uses for its tables.
The code I have posted here is complete for the systems where I have used it. It has /BIC/A* for DSOs, /BIC/F* for cubes and a lot more. If you happen to use a system which uses different components (for example BPC) then you might have to add some elements.
Unfortunately, PSA and Change Log tables are defined with the same prefix: /BIC/B*. To distinguish between the two, I look for specific strings in the description. This makes things a bit more complex: the coding may have to vary if there is bespoke development in different languages, and ultimately there is a risk that a table is not correctly classified. It then ends up in the list as a PSA table instead of a Change Log table or the other way around, which I believe is just a minor inconvenience.
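The naming-convention logic can be sketched outside SQL as well. The Python below is only an illustration of the idea: the letter-to-type mapping follows the CASE expression in the script further down, while the function name `classify_bw_table`, the `NON-BW` label, the example table names and descriptions, and the assumption of a three-character namespace such as BIC or BI0 are mine.

```python
# Letter right after the '/BIC/' or '/BI0/' prefix -> BW object type,
# following the CASE expression used in the SQL script.
TYPE_BY_LETTER = {
    "A": "DSO",
    "D": "CUBE", "E": "CUBE", "F": "CUBE",
    "P": "IOBJ", "Q": "IOBJ", "S": "IOBJ", "T": "IOBJ", "X": "IOBJ",
}


def classify_bw_table(table_name, description=""):
    """Derive the BW object type from a table name such as '/BIC/AZSALES00'.

    Assumes a prefix of the form '/BIx/' (five characters), so the sixth
    character decides the type. 'B' tables need the description text to
    tell a PSA table from a change log table.
    """
    if len(table_name) < 6 or not table_name.startswith("/BI") or table_name[4] != "/":
        return "NON-BW"  # hypothetical label; the script leaves this column NULL
    letter = table_name[5]
    if letter == "B":
        # Same check as the script: descriptions starting with 'PSA' are PSA tables.
        return "PSA" if description.startswith("PSA") else "C-LOG"
    return TYPE_BY_LETTER.get(letter, "OTHER")


# Made-up examples
print(classify_bw_table("/BIC/AZSALES00"))
print(classify_bw_table("/BIC/B0000123000", "PSA for ZSALES_DS Segment 0001"))
print(classify_bw_table("/BIC/B0000456000", "Transfer Structure Application ZSALES"))
print(classify_bw_table("BALDAT"))
```

The sketch also shows the weakness described above: a PSA table whose description is maintained in another language, and therefore does not start with ‘PSA’, would be reported as a change log table.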
Structure of the code
I have tried to keep the ad-hoc SQL script simple and as a result it is a bit longer than strictly necessary. There is a bit of complexity in interrogating the table names to derive the object type, but apart from that the code is quite easy to read. The price you pay for this is that the code comes in three parts, linked together through a ‘union’. The way the code is cut up is as follows:
Union part 1: BW Objects – assumed to always be column store tables
Union part 2: Non-BW Objects, column store tables
Union part 3: Non-BW Objects, row store tables
There is duplication of code, but the coding is a lot simpler to understand compared to a solution (or at least my solution) where everything is brought together in one statement.
Step 3: Run the code
I promised you a solution which would not require any development, but there is a small tweak you will need to make to the code that I provide in this post, unless your BW system happens to sit in schema SAPNW1. Just do a “find and replace” of “SAPNW1” with the schema name of your BWoH system and you can run the code.
There is one more caveat though. For the ad-hoc solution I have to join the HANA dictionary tables with a BW text table. The text in this table is language dependent. If you’re lucky, your system has all table descriptions available in a single language. If not, you’re in trouble. You can only select a single language if you are sure you are not excluding a relevant table which is not maintained in that language. I usually download the result set to Excel and create a pivot table on language and table count to see if I can select a specific language.
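The same language check can be done without Excel. The following Python sketch assumes the result set has been exported as pairs of table name and language key; the sample rows and the helper name `tables_missing_in` are made up for illustration.

```python
from collections import Counter

# Hypothetical (table_name, language) pairs taken from the query result.
text_rows = [
    ("/BIC/AZSALES00", "E"),
    ("/BIC/AZSALES00", "D"),
    ("/BIC/FZCUBE01", "E"),
    ("ZCUSTOM_LOG", "D"),
]

# Equivalent of the pivot table: number of distinct tables per language.
tables_per_language = Counter(lang for _, lang in set(text_rows))


def tables_missing_in(language, rows):
    """Return the tables that have no description in the given language."""
    all_tables = {name for name, _ in rows}
    covered = {name for name, lang in rows if lang == language}
    return sorted(all_tables - covered)


print(dict(tables_per_language))
print("Missing in E:", tables_missing_in("E", text_rows))
```

If the list of missing tables for your preferred language is empty, filtering on that language should not drop anything from the overview.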
A better solution might be to use outer joins. I haven’t tried this yet but I might give that a try if I can find the time to do so.
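As a rough illustration of the outer join idea, the sketch below uses an in-memory SQLite database as a stand-in, so it says nothing about how this behaves on HANA. The tables `table_sizes` and `table_texts`, their columns and their contents are hypothetical. The point it shows is that the language filter has to sit in the join condition rather than in the WHERE clause, otherwise tables without a text in that language are dropped again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_sizes (table_name TEXT, size_mb INTEGER);
    CREATE TABLE table_texts (tabname TEXT, ddlanguage TEXT, ddtext TEXT);
    INSERT INTO table_sizes VALUES ('ZTAB_A', 500), ('ZTAB_B', 300), ('ZTAB_C', 100);
    INSERT INTO table_texts VALUES
        ('ZTAB_A', 'E', 'Table A'),
        ('ZTAB_A', 'D', 'Tabelle A'),
        ('ZTAB_B', 'D', 'Tabelle B');
""")

# Every table is kept exactly once; the description is NULL (None in Python)
# when no text exists in the chosen language.
rows = conn.execute("""
    SELECT s.table_name, s.size_mb, t.ddtext
    FROM table_sizes s
    LEFT OUTER JOIN table_texts t
      ON t.tabname = s.table_name
     AND t.ddlanguage = 'E'
    ORDER BY s.size_mb DESC
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

With this shape, ZTAB_B and ZTAB_C still appear in the size overview even though they have no English description, and ZTAB_A is not duplicated by its second language.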
As I mentioned before, I have also created a monitoring application where you can keep track of table growth over time. In this solution I split master data from transactional data so I don’t have the problem around duplicate lines and missing lines in my transaction data (as a result of joining with a text table). I hope to post the details of how to build this monitoring application shortly in a follow-up blog.
Until then, I hope you get some real insight into the database usage of your BWoH system using this ad-hoc solution. Remember: HANA is a premium product, so use it efficiently.
Finally: The SQL Statement
(SELECT TOP 100
  -- 1. BW objects, always column store tables
  -- The 6th character of the table name identifies the BW object type
  MAX(CASE SUBSTRING(T.TABLE_NAME,6,1)
    WHEN 'A' THEN 'DSO'
    WHEN 'B' THEN (CASE SUBSTRING(D.DDTEXT,1,3) WHEN 'PSA' THEN 'PSA' ELSE 'C-LOG' END)
    WHEN 'D' THEN 'CUBE'
    WHEN 'E' THEN 'CUBE'
    WHEN 'F' THEN 'CUBE'
    WHEN 'P' THEN 'IOBJ'
    WHEN 'Q' THEN 'IOBJ'
    WHEN 'S' THEN 'IOBJ'
    WHEN 'T' THEN 'IOBJ'
    WHEN 'X' THEN 'IOBJ'
    ELSE 'OTHER'
  END) AS BW_TYPE,
  -- Derive the BW object name from the table name or, for /B tables, from the description
  MAX(CASE SUBSTRING(T.TABLE_NAME,6,1)
    WHEN 'A' THEN SUBSTRING(T.TABLE_NAME,7,LENGTH(T.TABLE_NAME)-8)
    WHEN 'B' THEN (CASE SUBSTRING(D.DDTEXT,1,3)
      WHEN 'PSA' THEN REPLACE(SUBSTR_AFTER(D.DDTEXT, 'PSA for '),' Segment 0001','')
      ELSE (CASE SUBSTRING(D.DDTEXT,1,8) WHEN 'Transfer' THEN SUBSTR_AFTER(D.DDTEXT, 'Application ')
        ELSE SUBSTR_AFTER(D.DDTEXT, 'Object ') END) END)
    WHEN 'D' THEN SUBSTRING(T.TABLE_NAME,7,9)
    WHEN 'E' THEN SUBSTRING(T.TABLE_NAME,7,9)
    WHEN 'F' THEN SUBSTRING(T.TABLE_NAME,7,9)
    WHEN 'P' THEN SUBSTRING(T.TABLE_NAME,7,9)
    WHEN 'Q' THEN SUBSTRING(T.TABLE_NAME,7,9)
    WHEN 'S' THEN SUBSTRING(T.TABLE_NAME,7,9)
    WHEN 'T' THEN SUBSTRING(T.TABLE_NAME,7,9)
    WHEN 'X' THEN SUBSTRING(T.TABLE_NAME,7,9)
    ELSE 'OTHER'
  END) AS BW_OBJECT,
  T.TABLE_NAME,
  D.DDTEXT AS DESCRIPTION,
  SUM(ROUND(C.MEMORY_SIZE_IN_TOTAL/1024/1024,0)) AS TOTAL_SIZE_IN_MB,
  SUM(ROUND(C.MEMORY_SIZE_IN_DELTA/1024/1024,0)) AS DELTA_SIZE_IN_MB,
  SUM(ROUND(P.DISK_SIZE/1024/1024,0)) AS DISK_SIZE_IN_MB,
  T.TABLE_TYPE,
  SUM(C.RECORD_COUNT) AS RECORD_COUNT,
  D.DDLANGUAGE
FROM TABLES T
-- Join on schema as well as table name so same-named tables in other schemas are not picked up
JOIN M_CS_TABLES C ON T.TABLE_NAME = C.TABLE_NAME AND T.SCHEMA_NAME = C.SCHEMA_NAME
JOIN M_TABLE_PERSISTENCE_STATISTICS P ON T.TABLE_NAME = P.TABLE_NAME AND T.SCHEMA_NAME = P.SCHEMA_NAME
JOIN SAPNW1.DD02T D ON T.TABLE_NAME = D.TABNAME
WHERE T.SCHEMA_NAME = 'SAPNW1'
  AND ( T.TABLE_NAME LIKE '/BI%/D%'
     OR T.TABLE_NAME LIKE '/BI%/E%'
     OR T.TABLE_NAME LIKE '/BI%/F%'
     OR T.TABLE_NAME LIKE '/BI%/B%'
     OR T.TABLE_NAME LIKE '/BI%/A%'
     OR T.TABLE_NAME LIKE '/BI%/P%'
     OR T.TABLE_NAME LIKE '/BI%/Q%'
     OR T.TABLE_NAME LIKE '/BI%/S%'
     OR T.TABLE_NAME LIKE '/BI%/T%'
     OR T.TABLE_NAME LIKE '/BI%/X%'
     OR T.TABLE_NAME LIKE '/BI%/1%'
     OR T.TABLE_NAME LIKE '/BI%/2%'
     OR T.TABLE_NAME LIKE '/BI%/3%' )
GROUP BY T.TABLE_NAME,
  D.DDTEXT,
  T.TABLE_TYPE,
  D.DDLANGUAGE
ORDER BY SUM(C.MEMORY_SIZE_IN_TOTAL) DESC)
UNION ALL
(SELECT TOP 100
  -- 2. Basis tables, column store tables
  NULL,
  NULL,
  T.TABLE_NAME,
  D.DDTEXT,
  SUM(ROUND(C.MEMORY_SIZE_IN_TOTAL/1024/1024,0)),
  SUM(ROUND(C.MEMORY_SIZE_IN_DELTA/1024/1024,0)),
  SUM(ROUND(P.DISK_SIZE/1024/1024,0)),
  T.TABLE_TYPE,
  SUM(C.RECORD_COUNT),
  D.DDLANGUAGE
FROM TABLES T
-- Join on schema as well as table name so same-named tables in other schemas are not picked up
JOIN M_CS_TABLES C ON T.TABLE_NAME = C.TABLE_NAME AND T.SCHEMA_NAME = C.SCHEMA_NAME
JOIN M_TABLE_PERSISTENCE_STATISTICS P ON T.TABLE_NAME = P.TABLE_NAME AND T.SCHEMA_NAME = P.SCHEMA_NAME
JOIN SAPNW1.DD02T D ON T.TABLE_NAME = D.TABNAME
WHERE T.SCHEMA_NAME = 'SAPNW1'
  AND ( T.TABLE_NAME NOT LIKE '/BI%/D%'
    AND T.TABLE_NAME NOT LIKE '/BI%/E%'
    AND T.TABLE_NAME NOT LIKE '/BI%/F%'
    AND T.TABLE_NAME NOT LIKE '/BI%/B%'
    AND T.TABLE_NAME NOT LIKE '/BI%/A%'
    AND T.TABLE_NAME NOT LIKE '/BI%/P%'
    AND T.TABLE_NAME NOT LIKE '/BI%/Q%'
    AND T.TABLE_NAME NOT LIKE '/BI%/S%'
    AND T.TABLE_NAME NOT LIKE '/BI%/T%'
    AND T.TABLE_NAME NOT LIKE '/BI%/X%'
    AND T.TABLE_NAME NOT LIKE '/BI%/1%'
    AND T.TABLE_NAME NOT LIKE '/BI%/2%'
    AND T.TABLE_NAME NOT LIKE '/BI%/3%' )
GROUP BY T.TABLE_NAME,
  D.DDTEXT,
  T.TABLE_TYPE,
  D.DDLANGUAGE
ORDER BY SUM(C.MEMORY_SIZE_IN_TOTAL) DESC
)
UNION ALL
(SELECT TOP 100
  -- 3. Basis tables, row store tables
  NULL,
  NULL,
  T.TABLE_NAME,
  D.DDTEXT,
  SUM(ROUND((R.USED_FIXED_PART_SIZE + R.USED_VARIABLE_PART_SIZE)/1024/1024,0)),
  NULL, -- row store tables have no delta storage
  SUM(ROUND(P.DISK_SIZE/1024/1024,0)),
  T.TABLE_TYPE,
  SUM(R.RECORD_COUNT),
  D.DDLANGUAGE
FROM TABLES T
-- Join on schema as well as table name so same-named tables in other schemas are not picked up
JOIN M_RS_TABLES R ON T.TABLE_NAME = R.TABLE_NAME AND T.SCHEMA_NAME = R.SCHEMA_NAME
JOIN M_TABLE_PERSISTENCE_STATISTICS P ON T.TABLE_NAME = P.TABLE_NAME AND T.SCHEMA_NAME = P.SCHEMA_NAME
JOIN SAPNW1.DD02T D ON T.TABLE_NAME = D.TABNAME
WHERE T.SCHEMA_NAME = 'SAPNW1'
  AND ( T.TABLE_NAME NOT LIKE '/BI%/D%'
    AND T.TABLE_NAME NOT LIKE '/BI%/E%'
    AND T.TABLE_NAME NOT LIKE '/BI%/F%'
    AND T.TABLE_NAME NOT LIKE '/BI%/B%'
    AND T.TABLE_NAME NOT LIKE '/BI%/A%'
    AND T.TABLE_NAME NOT LIKE '/BI%/P%'
    AND T.TABLE_NAME NOT LIKE '/BI%/Q%'
    AND T.TABLE_NAME NOT LIKE '/BI%/S%'
    AND T.TABLE_NAME NOT LIKE '/BI%/T%'
    AND T.TABLE_NAME NOT LIKE '/BI%/X%'
    AND T.TABLE_NAME NOT LIKE '/BI%/1%'
    AND T.TABLE_NAME NOT LIKE '/BI%/2%'
    AND T.TABLE_NAME NOT LIKE '/BI%/3%' )
GROUP BY T.TABLE_NAME,
  D.DDTEXT,
  T.TABLE_TYPE,
  D.DDLANGUAGE
ORDER BY SUM(ROUND((R.USED_FIXED_PART_SIZE + R.USED_VARIABLE_PART_SIZE)/1024/1024,0)) DESC
)