Case Study 3: Cross Tabular Reporting
Reminder! To run these examples, you should first start a new session and then load the sample database using:
"/localvision/samples/general/sample.load" asFileContents evaluate ;and load testList using:
!testList <- Company masterList rankDown: [ sales ] . select: [ rank <= 20 ] ;Any other files referenced can be read from the /localvision/samples/general/ directory.Note: The sample.load file runs by default on a Unix environment. If you are using a Windows NT platform, this location may be prefixed by a drive and optional path (e.g. d:/visiondb/localvision/samples/general/sample.load). Check with your Vision Administrator for further details.
It is often useful to look at data at an aggregate level instead of looking at each specific record. Example 3 develops a cross-tabular report that aggregates a list using different criteria and displays summary information by aggregate.
Read the file example3.a. You should see:
#--- Build a List that is extended by 2 quintile values !quintileList <- testList quintileUp: [ netIncome ] . extendBy: [ !quintile1 <- quintile ] . quintileUp: [ earningsPerShare ] . extendBy: [ !quintile2 <- quintile ] ; #--- Display Basic Report for this List quintileList do: [ ticker print: 10 ; netIncome print: 10 ; quintile1 print: 3 ; earningsPerShare print: 10 ; quintile2 print: 3 ; sales print: 10 ; newLine print ; ] ;Execute this program. You should see:
AET 920.60 2 7.48 5 22114.11 T 2044.00 4 1.88 1 51209.02 AN 1360.00 3 2.65 2 20174.00 ARC 1224.00 2 6.68 4 16282.00 CI 728.30 1 7.25 5 16909.30 ...The first step creates a list named quintileList which is extended by two variables: quintile1 is the quintile (1-5) of the netIncome value and quintile2 is the quintile (1-5) of the earningsPerShare value. A simple report showing actual data and these quintile values is then displayed for each company in the list.
The next version of this report eliminates the company detail and just displays summary information. For each of the netIncome quintiles, the report displays the distribution of earningsPerShare quintiles. Read in the file example3.b:
#--- group by netIncome quintiles (quintile1) quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ #-- for each net income quintile "netIncome Quintile " print ; ^self print: 3 ; #-- print the quintile1 number newLine print ; #--- group the current quintile1 companies by quintile2 groupList groupedBy: [ quintile2 ] . sortUp: [ ^self ] . do: [ " earningsPerShare Quintile " print ; ^self print: 3 ; #-- print quintile2 number " Includes: " print ; groupList count print: 5 ; #-- print quintile2 count " Elements" printNL ; ] ; newLine print ; ] ;Execute this program. You should see:
netIncome Quintile 1 earningsPerShare Quintile 1 Includes: 2 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 5 Includes: 1 Elements netIncome Quintile 2 earningsPerShare Quintile 2 Includes: 1 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 1 Elements earningsPerShare Quintile 5 Includes: 1 Elements netIncome Quintile 3 earningsPerShare Quintile 2 Includes: 1 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 2 Elements netIncome Quintile 4 earningsPerShare Quintile 1 Includes: 2 Elements earningsPerShare Quintile 2 Includes: 2 Elements netIncome Quintile 5 earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 1 Elements earningsPerShare Quintile 5 Includes: 2 ElementsThis report groups the original list by quintile1. In other words, 5 distinct groups are formed corresponding to the integers 1 through 5. The groupList message for each of these groups returns the list of companies in the current quintile group. For each of these quintile1 groups, a label is displayed identifying the netIncome quintile number. The groupList is then grouped by quintile2. A summary line is then printed for each earningsPerShare quintile present in the current group.
This report displays the count for the different combination of quintile pairs that exist. For example, you can see that 2 companies have a netIncome quintile of 4 and an earningsPerShare quintile of 1. Only the combinations that exist are displayed in this report. For example, there is no summary line for netIncome quintile 1 and earningsPerShare quintile 2, implying that there are no companies with these characteristics.
The program in example3.c displays an entry for every combination:
quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ "netIncome Quintile " print ; ^self print: 3 ; newLine print ; 5 sequence #--- For each number from 1 to 5 do: [ !q <- ^self ; " earningsPerShare Quintile " print ; q print: 3 ; " Includes: " print ; ^my groupList #-- select companies in this quintile select: [ quintile2 = ^my q ] . count print: 5 ; " Elements" printNL ; ] ; newLine print ; ] ;Execute this program. You should see:
netIncome Quintile 1 earningsPerShare Quintile 1 Includes: 2 Elements earningsPerShare Quintile 2 Includes: 0 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 0 Elements earningsPerShare Quintile 5 Includes: 1 Elements netIncome Quintile 2 earningsPerShare Quintile 1 Includes: 0 Elements earningsPerShare Quintile 2 Includes: 1 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 1 Elements earningsPerShare Quintile 5 Includes: 1 ElementsThis report displays a summary line for each earningsPerShare quintile for each netIncome quintile. The expression 5 sequence creates a list of the first 5 integers. The variable q is used to track which quintile is currently in effect. The expression ^my groupList retrieves the value of groupList defined outside this do: block. In this case, it refers to the groupList associated with the current netIncome quintile. The expression quintile2 = ^my q in the selection block, selects all companies in this groupList whose quintile2 value matches the current value of q. The number of elements selected is then printed.
Notice that the values of netIncome and earningsPerShare are no longer needed once the quintiles have been defined. The program in example3.d is parameterized to make it easy to change the quintile rules:
#--- Create variables for the 2 blocks and the list !block1 <- [ netIncome ] ; !block2 <- [ earningsPerShare ] ; !list <- testList ; #--- Use these variables to build quintileList !quintileList <- list quintileUp: block1 . extendBy: [ !quintile1 <- quintile ] . quintileUp: block2 . extendBy: [ !quintile2 <- quintile ] ; quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ ^my block1 print ; #-- print the block as a label " Quintile " print ; ^self print: 3 ; newLine print ; !block2 <- ^my block2 ; #-- copy value for next level 5 sequence do: [ !q <- ^self ; ^my block2 print ; #-- print the block as a label " Quintile " print ; q print: 3 ; " Includes: " print ; ^my groupList select: [ quintile2 = ^my q ] . count print: 5 ; " Elements" printNL ; ] ; newLine print ; ] ;Execute this program. You should see:
[netIncome] Quintile 1 [earningsPerShare] Quintile 1 Includes: 2 Elements [earningsPerShare] Quintile 2 Includes: 0 Elements [earningsPerShare] Quintile 3 Includes: 1 Elements [earningsPerShare] Quintile 4 Includes: 0 Elements [earningsPerShare] Quintile 5 Includes: 1 Elements [netIncome] Quintile 2 [earningsPerShare] Quintile 1 Includes: 0 Elements [earningsPerShare] Quintile 2 Includes: 1 Elements [earningsPerShare] Quintile 3 Includes: 1 Elements [earningsPerShare] Quintile 4 Includes: 1 Elements [earningsPerShare] Quintile 5 Includes: 1 Elements ...The variables block1 and block2 are used to define the quintiles. The value of the block is printed instead of the hard-coded labels. Because the magic word ^my only "reaches back" one level, you need to copy the value of block2 so that it can be accessed by the innermost do: message.
The file example3.e changes the quintile blocks and the list and executes the same report:
!block1 <- [ sales ] ; !block2 <- [ netIncome ] ; !list <- Company masterList ; !quintileList <- list quintileUp: block1 . extendBy: [ !quintile1 <- quintile ] . quintileUp: block2 . extendBy: [ !quintile2 <- quintile ] ; quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ ^my block1 print ; " Quintile " print ; ^self print: 3 ; newLine print ; !block2 <- ^my block2 ; 5 sequence do: [ !q <- ^self ; ^my block2 print ; " Quintile " print ; q print: 3 ; " Includes: " print ; ^my groupList select: [ quintile2 = ^my q ] . count print: 5 ; " Elements" printNL ; ] ; newLine print ; ] ;Execute this program. You should see:
[sales] Quintile 1 [netIncome] Quintile 1 Includes: 3 Elements [netIncome] Quintile 2 Includes: 4 Elements [netIncome] Quintile 3 Includes: 1 Elements [netIncome] Quintile 4 Includes: 0 Elements [netIncome] Quintile 5 Includes: 0 Elements [sales] Quintile 2 [netIncome] Quintile 1 Includes: 3 Elements [netIncome] Quintile 2 Includes: 1 Elements [netIncome] Quintile 3 Includes: 1 Elements [netIncome] Quintile 4 Includes: 3 Elements [netIncome] Quintile 5 Includes: 0 Elements ...Note: This report shows the quintile distribution for sales and netIncome using the full company universe, not testList.
The file example3.f defines a method at List that produces this report using two supplied blocks as parameters. This method is defined at List because there is nothing about the method itself that knows or cares what type of data is actually being processed. Presumably, the supplied blocks will be appropriate for the type of elements contained in the list:
List defineMethod: #-- define a method [ | quintile: block1 by: block2 | #-- needing two blocks !quintileList <- ^self #-- ^self is recipient list quintileUp: block1 . extendBy: [ !quintile1 <- quintile ] . quintileUp: block2 . extendBy: [ !quintile2 <- quintile ] ; quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ ^my block1 print ; " Quintile " print ; ^self print: 3 ; newLine print ; !block2 <- ^my block2 ; 5 sequence do: [ !q <- ^self ; ^my block2 print ; " Quintile " print ; q print: 3 ; " Includes: " print ; ^my groupList select: [ quintile2 = ^my q ] . count print: 5 ; " Elements" printNL ; ] ; newLine print ; ] ; ] ; #-- Run the Method Company masterList quintile: [sales] by: [ netIncome ] ;Execute this program. Your output should be identical to the previous report.
The next version of the report displays the output in a matrix format. The quintile1 values form the rows and the quintile2 values form the columns. Read in the file example3.g:
List defineMethod: [ | quintile: block1 by: block2 | #--- Titles "Cross Tabular Quintile Distribution" centerNL: 70 . print ; newLine print ; block1 asString print: 10 ; " | " print ; block2 asString concat: " Quintiles" . centerNL: 45 . print ; "Quintiles" print: 10 ; " | " print ; 5 sequence do: [ ^self print: 9 ] ; newLine print ; "-" fill: 70 . printNL ; #--- Setup !quintileList <- ^self quintileUp: block1 . extendBy: [ !quintile1 <- quintile ] . quintileUp: block2 . extendBy: [ !quintile2 <- quintile ] ; #--- Analyze quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ !currentQuintile1 <- ^self ; currentQuintile1 print: -10 ; " | " print ; 5 sequence do: [ !q <- ^self ; ^my groupList select: [ quintile2 = ^my q ] . count print: 9 ; ] ; newLine print ; ] ; ] ; ### And Run It Company masterList quintile: [ sales ] by: [ price / earningsPerShare ]Execute this program. You should see:
Cross Tabular Quintile Distribution [sales] | [price / earningsPerShare] Quintiles Quintiles | 1 2 3 4 5The cross tabular report format shows sales quintiles as rows and price-earnings quintiles as columns. The number of companies in the row/column combination is displayed in the table's cells.
1 | 2 2 0 2 2 2 | 1 2 3 1 1 3 | 1 1 1 2 2 4 | 3 0 2 1 2 5 | 1 3 1 2 0
Suppose you wanted to display more than just the count. The version example3.h adds an additional block as a third parameter. The average value for this block is printed for each cell:
List defineMethod: [ | quintile: block1 by: block2 for: block3 | #-- add block3 #--- Titles "Cross Tabular Quintile Distribution" centerNL: 70 . print ; " " print: 30 ; " For: " print ; block3 printNL ; newLine print ; block1 asString print: 10 ; " | " print ; block2 asString concat: " Quintiles" . centerNL: 45 . print ; "Quintiles" print: 10 ; " | " print ; 5 sequence do: [ ^self print: 9 ] ; newLine print ; "-" fill: 70 . printNL ; #--- Setup !quintileList <- ^self quintileUp: block1 . extendBy: [ !quintile1 <- quintile ] . quintileUp: block2 . extendBy: [ !quintile2 <- quintile ; !block3Value <- ^self send: ^my block3 ; #-- store block3 value ] ; #--- Analyze quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ !currentQuintile1 <- ^self ; currentQuintile1 print: -3 ; "Count" print: 7 ; " | " print ; 5 sequence do: [ !q <- ^self ; ^my groupList select: [ quintile2 = ^my q ] . count print: 9 ; ] ; newLine print ; #-- Add Average Calc Based on block Value "Average" print: -10 ; " | " print ; 5 sequence do: [ !q <- ^self ; ^my groupList select: [ quintile2 = ^my q ] . average: [ block3Value ] . print: 9 ; ] ; newLine print ; ] ; ] ; ### And Run It Company masterList quintile: [ sales ] by: [ price / earningsPerShare ] for: [ netIncome]Execute this program. You should see:
Cross Tabular Quintile Distribution For: [netIncome] [sales] | | [price / earningsPerShare] Quintiles Quintiles | 1 2 3 4 5This method requires 3 blocks as parameters. The first two are used to form the quintiles. The third is used to compute the average. After the counts are printed for each column, a second line of average values is displayed.
1 Count | 2 2 0 2 2 Average | 1016.12 579.95 NA 477.00 314.90 2 Count | 1 2 3 1 1 Average | 313.00 906.55 1393.77 478.00 450.09 3 Count | 1 1 1 2 2 Average | 374.40 728.30 1224.00 1239.41 403.50 4 Count | 3 0 2 1 2 Average | 1116.10 NA 1184.00 1007.00 2378.50 5 Count | 1 3 1 2 0 Average | 4625.20 3486.10 4840.00 1651.00 NA
Notice that the section of code that generates the averages duplicates much of the code used to generate the counts. The file example3.i streamlines this:
List defineMethod: [ | quintile: block1 by: block2 for: block3 | #--- Titles "Cross Tabular Quintile Distribution" centerNL: 70 . print ; " " print: 30 ; " For: " print ; block3 printNL ; newLine print ; block1 asString print: 10 ; " | " print ; block2 asString concat: " Quintiles" . centerNL: 45 . print ; "Quintiles" print: 10 ; " | " print ; 5 sequence do: [ ^self print: 9 ] ; newLine print ; "-" fill: 70 . printNL ; #--- Setup !quintileList <- ^self quintileUp: block1 . extendBy: [ !quintile1 <- quintile ] . quintileUp: block2 . extendBy: [ !quintile2 <- quintile ; !block3Value <- ^self send: ^my block3 ; ] ; #--- Analyze quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ !currentQuintile1 <- ^self ; #-- For each of the 5 quintiles, extract the sublist #-- send returns the last statement in block. #-- quintileList is therefore a list of 5 lists !quintileList <- 5 sequence send: [ !q <- ^self ; ^my groupList select: [ quintile2 = ^my q ] ] ; #-- Now just use quintileList to display stats currentQuintile1 print: -3 ; "Count" print: 7 ; " | " print ; quintileList do: [ ^self count print: 9 ] ; newLine print ; "Average" print: -10 ; " | " print ; quintileList do: [ ^self average: [ block3Value ] . print: 9 ] ; newLine print ; "Std Dev" print: -10 ; " | " print ; quintileList do: [ ^self stdDev: [ block3Value ] . print: 9 ] ; newLine print ; newLine print ; ] ; ] ; ### And Run It Company masterList quintile: [ sales ] by: [ price / earningsPerShare ] for: [ netIncome]Execute this program. You should see:
Cross Tabular Quintile Distribution For: [netIncome] [sales] | [price / earningsPerShare] Quintiles Quintiles 1 2 3 4 5
1 Count 2 2 0 2 2 Average 1016.12 579.95 0.00 477.00 314.90 Std Dev 121.33 158.95 NA 167.00 279.90 2 Count 1 2 3 1 1 Average 313.00 906.55 1393.77 478.00 450.09 Std Dev 0.00 271.45 192.23 0.00 0.00 3 Count 1 1 1 2 2 Average 374.40 728.30 1224.00 1239.41 403.50 Std Dev 0.00 0.00 0.00 120.59 76.50 4 Count 3 0 2 1 2 Average 1116.10 0.00 1184.00 1007.00 2378.50 Std Dev 151.48 NA 602.00 0.00 536.50 5 Count 1 3 1 2 0 Average 4625.20 3486.10 4840.00 1651.00 0.00 Std Dev 0.00 1473.92 0.00 393.00 NA
Suppose you wanted to eliminate the NAs from printing in the standard deviation lines. You could test each value using the expression:
quintileList do: [ !ave <- ^self average: [ block3Value ] ; ave isntNA ifTrue: [ ave print: 9 ] ifFalse: [ " " print: 9 ] ; ] ;Alternatively, you could define a new version of the print message and implement versions for various classes that need special behavior. The file example3.j defines several versions of a message named write: which performs the standard print: function by default. The quintile:by:for: method has been modified to use this new message:
#-- Define write: at Object to do standard print: Object defineMethod: [ | write: format | ^self print: format ] ; #-- Define write: at Number to print with commas in format Number defineMethod: [ | write: format | ^self printWithCommas: format ] ; #-- Define write: at NA to print with blanks NA defineMethod: [ | write: format | " " print: format ] ; #-- Fix method to use write: with average and stdDev List defineMethod: [ | quintile: block1 by: block2 for: block3 | "Cross Tabular Quintile Distribution" centerNL: 70 . print ; " " print: 30 ; " For: " print ; block3 printNL ; newLine print ; block1 asString print: 10 ; " | " print ; block2 asString concat: " Quintiles" . centerNL: 45 . print ; "Quintiles" print: 10 ; " | " print ; 5 sequence do: [ ^self print: 9 ] ; newLine print ; "-" fill: 70 . printNL ; !quintileList <- ^self quintileUp: block1 . extendBy: [ !quintile1 <- quintile ] . quintileUp: block2 . extendBy: [ !quintile2 <- quintile ; !block3Value <- ^self send: ^my block3 ; ] ; quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ !currentQuintile1 <- ^self ; !quintileList <- 5 sequence send: [ !q <- ^self ; ^my groupList select: [ quintile2 = ^my q ] ] ; currentQuintile1 print: -3 ; "Count" print: 7 ; " | " print ; quintileList do: [ ^self count print: 9 ] ; newLine print ; "Average" print: -10 ; " | " print ; quintileList do: [ ^self average: [ block3Value ] . write: 9 ] ; newLine print ; "Std Dev" print: -10 ; " | " print ; quintileList do: [ ^self stdDev: [ block3Value ] . write: 9 ] ; newLine print ; newLine print ; ] ; ] ; ### And Run It Company masterList quintile: [ sales ] by: [ price / earningsPerShare ] for: [ netIncome]Execute this program. You should see:
Cross Tabular Quintile Distribution For: [netIncome] [sales] | [price / earningsPerShare] Quintiles Quintiles 1 2 3 4 5Notice that the numbers now contain commas and the NA values have been replaced with blanks.
1 Count 2 2 0 2 2 Average 1,016.12 579.95 0.00 477.00 314.90 Std Dev 121.33 158.95 167.00 279.90 2 Count 1 2 3 1 1 Average 313.00 906.55 1,393.77 478.00 450.09 Std Dev 0.00 271.45 192.23 0.00 0.00 3 Count 1 1 1 2 2 Average 374.40 728.30 1,224.00 1,239.41 403.50 Std Dev 0.00 0.00 0.00 120.59 76.50 4 Count 3 0 2 1 2 Average 1,116.10 0.00 1,184.00 1,007.00 2,378.50 Std Dev 151.48 602.00 0.00 536.50 5 Count 1 3 1 2 0 Average 4,625.20 3,486.10 4,840.00 1,651.00 0.00 Std Dev 0.00 1,473.92 0.00 393.00
The final version of this report adds a total column for each quintile row. Read the file example3.k:
List defineMethod: [ | quintile: block1 by: block2 for: block3 | "Cross Tabular Quintile Distribution" centerNL: 70 . print ; " " print: 30 ; " For: " print ; block3 printNL ; newLine print ; block1 asString print: 10 ; " | " print ; block2 asString concat: " Quintiles" . centerNL: 45 . print ; "Quintiles" print: 10 ; " | " print ; 5 sequence do: [ ^self print: 9 ] ; "Totals" printNL: -9 ; #-- Add New Heading "-" fill: 70 . printNL ; !quintileList <- ^self quintileUp: block1 . extendBy: [ !quintile1 <- quintile ] . quintileUp: block2 . extendBy: [ !quintile2 <- quintile ; !block3Value <- ^self send: ^my block3 ; ] ; quintileList groupedBy: [ quintile1 ] . sortUp: [ ^self ] . do: [ !currentQuintile1 <- ^self ; !quintileList <- 5 sequence send: [ !q <- ^self ; ^my groupList select: [ quintile2 = ^my q ] ] ; currentQuintile1 print: -3 ; "Count" print: 7 ; " | " print ; quintileList do: [ ^self count print: 9 ] ; groupList count printNL ; #-- Total count "Average" print: -10 ; " | " print ; quintileList do: [ ^self average: [ block3Value ] . write: 9 ] ; groupList average: [ block3Value ] . printNL ; #--- Ave "Std Dev" print: -10 ; " | " print ; quintileList do: [ ^self stdDev: [ block3Value ] . write: 9 ] ; groupList stdDev: [ block3Value ] . printNL ; #-- StdDev newLine print ; ] ; ] ; ### And Run It Company masterList quintile: [ sales ] by: [ price / earningsPerShare ] for: [ netIncome]Execute this program. You should see:
Cross Tabular Quintile Distribution For: [netIncome] [sales] [price / earningsPerShare] Quintiles Quintiles 1 2 3 4 5 Totals
1 Count 2 2 0 2 2 8 Average 1,016.12 79.95 0.00 477.00 314.90 596.99 Std Dev 121.33 158.95 167.00 279.90 322.55 2 Count 1 2 3 1 1 8 Average 313.00 906.55 1,393.77 478.00 450.09 904.44 Std Dev 0.00 271.45 192.23 0.00 0.00 462.96 3 Count 1 1 1 2 2 7 Average 374.40 728.30 1,224.00 1,239.41 403.50 801.79 Std Dev 0.00 0.00 0.00 120.59 76.50 397.74 4 Count 3 0 2 1 2 8 Average 1,116.10 0.00 1,184.00 1,007.00 2,378.50 1435.04 Std Dev 151.48 602.00 0.00 536.50 685.93 5 Count 1 3 1 2 0 7 Average 4,625.20 3,486.10 4,840.00 1,651.00 0.00 3317.93 Std Dev 0.00 1,473.92 0.00 393.00
| Vision Basics | Creating a Demo Database | Single Object Access | Using Lists | Using Dates and TimeSeries |