Case Study 3: Cross Tabular Reporting
Reminder! To run these examples, you should first start a new session and then load the sample database using:
"/localvision/samples/general/sample.load" asFileContents evaluate ;and load testList using:
!testList <- Company masterList rankDown: [ sales ] . select: [ rank <= 20 ] ;Any other files referenced can be read from the /localvision/samples/general/ directory.Note: The sample.load file runs by default on a Unix environment. If you are using a Windows NT platform, this location may be prefixed by a drive and optional path (e.g. d:/visiondb/localvision/samples/general/sample.load). Check with your Vision Administrator for further details.
It is often useful to look at data at an aggregate level instead of looking at each specific record. Example 3 develops a cross-tabular report that aggregates a list using different criteria and displays summary information by aggregate.
Read the file example3.a. You should see:
#--- Build a List that is extended by 2 quintile values
!quintileList <- testList
quintileUp: [ netIncome ] .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: [ earningsPerShare ] .
extendBy: [ !quintile2 <- quintile ] ;
#--- Display Basic Report for this List
quintileList
do: [ ticker print: 10 ;
netIncome print: 10 ;
quintile1 print: 3 ;
earningsPerShare print: 10 ;
quintile2 print: 3 ;
sales print: 10 ;
newLine print ;
] ;
Execute this program. You should see:
AET 920.60 2 7.48 5 22114.11 T 2044.00 4 1.88 1 51209.02 AN 1360.00 3 2.65 2 20174.00 ARC 1224.00 2 6.68 4 16282.00 CI 728.30 1 7.25 5 16909.30 ...The first step creates a list named quintileList which is extended by two variables: quintile1 is the quintile (1-5) of the netIncome value and quintile2 is the quintile (1-5) of the earningsPerShare value. A simple report showing actual data and these quintile values is then displayed for each company in the list.
The next version of this report eliminates the company detail and just displays summary information. For each of the netIncome quintiles, the report displays the distribution of earningsPerShare quintiles. Read in the file example3.b:
#--- group by netIncome quintiles (quintile1)
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ #-- for each net income quintile
"netIncome Quintile " print ;
^self print: 3 ; #-- print the quintile1 number
newLine print ;
#--- group the current quintile1 companies by quintile2
groupList groupedBy: [ quintile2 ] .
sortUp: [ ^self ] .
do: [ " earningsPerShare Quintile " print ;
^self print: 3 ; #-- print quintile2 number
" Includes: " print ;
groupList count print: 5 ; #-- print quintile2 count
" Elements" printNL ;
] ;
newLine print ;
] ;
Execute this program. You should see:
netIncome Quintile 1 earningsPerShare Quintile 1 Includes: 2 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 5 Includes: 1 Elements netIncome Quintile 2 earningsPerShare Quintile 2 Includes: 1 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 1 Elements earningsPerShare Quintile 5 Includes: 1 Elements netIncome Quintile 3 earningsPerShare Quintile 2 Includes: 1 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 2 Elements netIncome Quintile 4 earningsPerShare Quintile 1 Includes: 2 Elements earningsPerShare Quintile 2 Includes: 2 Elements netIncome Quintile 5 earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 1 Elements earningsPerShare Quintile 5 Includes: 2 ElementsThis report groups the original list by quintile1. In other words, 5 distinct groups are formed corresponding to the integers 1 through 5. The groupList message for each of these groups returns the list of companies in the current quintile group. For each of these quintile1 groups, a label is displayed identifying the netIncome quintile number. The groupList is then grouped by quintile2. A summary line is then printed for each earningsPerShare quintile present in the current group.
This report displays the count for the different combination of quintile pairs that exist. For example, you can see that 2 companies have a netIncome quintile of 4 and an earningsPerShare quintile of 1. Only the combinations that exist are displayed in this report. For example, there is no summary line for netIncome quintile 1 and earningsPerShare quintile 2, implying that there are no companies with these characteristics.
The program in example3.c displays an entry for every combination:
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ "netIncome Quintile " print ;
^self print: 3 ;
newLine print ;
5 sequence #--- For each number from 1 to 5
do: [ !q <- ^self ;
" earningsPerShare Quintile " print ;
q print: 3 ;
" Includes: " print ;
^my groupList #-- select companies in this quintile
select: [ quintile2 = ^my q ] . count print: 5 ;
" Elements" printNL ;
] ;
newLine print ;
] ;
Execute this program. You should see:
netIncome Quintile 1 earningsPerShare Quintile 1 Includes: 2 Elements earningsPerShare Quintile 2 Includes: 0 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 0 Elements earningsPerShare Quintile 5 Includes: 1 Elements netIncome Quintile 2 earningsPerShare Quintile 1 Includes: 0 Elements earningsPerShare Quintile 2 Includes: 1 Elements earningsPerShare Quintile 3 Includes: 1 Elements earningsPerShare Quintile 4 Includes: 1 Elements earningsPerShare Quintile 5 Includes: 1 ElementsThis report displays a summary line for each earningsPerShare quintile for each netIncome quintile. The expression 5 sequence creates a list of the first 5 integers. The variable q is used to track which quintile is currently in effect. The expression ^my groupList retrieves the value of groupList defined outside this do: block. In this case, it refers to the groupList associated with the current netIncome quintile. The expression quintile2 = ^my q in the selection block, selects all companies in this groupList whose quintile2 value matches the current value of q. The number of elements selected is then printed.
Notice that the values of netIncome and earningsPerShare are no longer needed once the quintiles have been defined. The program in example3.d is parameterized to make it easy to change the quintile rules:
#--- Create variables for the 2 blocks and the list
!block1 <- [ netIncome ] ;
!block2 <- [ earningsPerShare ] ;
!list <- testList ;
#--- Use these variables to build quintileList
!quintileList <- list
quintileUp: block1 .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: block2 .
extendBy: [ !quintile2 <- quintile ] ;
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ ^my block1 print ; #-- print the block as a label
" Quintile " print ;
^self print: 3 ;
newLine print ;
!block2 <- ^my block2 ; #-- copy value for next level
5 sequence
do: [ !q <- ^self ;
^my block2 print ; #-- print the block as a label
" Quintile " print ;
q print: 3 ;
" Includes: " print ;
^my groupList
select: [ quintile2 = ^my q ] . count print: 5 ;
" Elements" printNL ;
] ;
newLine print ;
] ;
Execute this program. You should see:
[netIncome] Quintile 1 [earningsPerShare] Quintile 1 Includes: 2 Elements [earningsPerShare] Quintile 2 Includes: 0 Elements [earningsPerShare] Quintile 3 Includes: 1 Elements [earningsPerShare] Quintile 4 Includes: 0 Elements [earningsPerShare] Quintile 5 Includes: 1 Elements [netIncome] Quintile 2 [earningsPerShare] Quintile 1 Includes: 0 Elements [earningsPerShare] Quintile 2 Includes: 1 Elements [earningsPerShare] Quintile 3 Includes: 1 Elements [earningsPerShare] Quintile 4 Includes: 1 Elements [earningsPerShare] Quintile 5 Includes: 1 Elements ...The variables block1 and block2 are used to define the quintiles. The value of the block is printed instead of the hard-coded labels. Because the magic word ^my only "reaches back" one level, you need to copy the value of block2 so that it can be accessed by the innermost do: message.
The file example3.e changes the quintile blocks and the list and executes the same report:
!block1 <- [ sales ] ;
!block2 <- [ netIncome ] ;
!list <- Company masterList ;
!quintileList <- list
quintileUp: block1 .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: block2 .
extendBy: [ !quintile2 <- quintile ] ;
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ ^my block1 print ;
" Quintile " print ;
^self print: 3 ;
newLine print ;
!block2 <- ^my block2 ;
5 sequence
do: [ !q <- ^self ;
^my block2 print ;
" Quintile " print ;
q print: 3 ;
" Includes: " print ;
^my groupList
select: [ quintile2 = ^my q ] . count print: 5 ;
" Elements" printNL ;
] ;
newLine print ;
] ;
Execute this program. You should see:
[sales] Quintile 1 [netIncome] Quintile 1 Includes: 3 Elements [netIncome] Quintile 2 Includes: 4 Elements [netIncome] Quintile 3 Includes: 1 Elements [netIncome] Quintile 4 Includes: 0 Elements [netIncome] Quintile 5 Includes: 0 Elements [sales] Quintile 2 [netIncome] Quintile 1 Includes: 3 Elements [netIncome] Quintile 2 Includes: 1 Elements [netIncome] Quintile 3 Includes: 1 Elements [netIncome] Quintile 4 Includes: 3 Elements [netIncome] Quintile 5 Includes: 0 Elements ...Note: This report shows the quintile distribution for sales and netIncome using the full company universe, not testList.
The file example3.f defines a method at List that produces this report using two supplied blocks as parameters. This method is defined at List because there is nothing about the method itself that knows or cares what type of data is actually being processed. Presumably, the supplied blocks will be appropriate for the type of elements contained in the list:
List defineMethod: #-- define a method
[ | quintile: block1 by: block2 | #-- needing two blocks
!quintileList <- ^self #-- ^self is recipient list
quintileUp: block1 .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: block2 .
extendBy: [ !quintile2 <- quintile ] ;
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ ^my block1 print ;
" Quintile " print ;
^self print: 3 ;
newLine print ;
!block2 <- ^my block2 ;
5 sequence
do: [ !q <- ^self ;
^my block2 print ;
" Quintile " print ;
q print: 3 ;
" Includes: " print ;
^my groupList
select: [ quintile2 = ^my q ] . count print: 5 ;
" Elements" printNL ;
] ;
newLine print ;
] ;
] ;
#-- Run the Method
Company masterList quintile: [sales] by: [ netIncome ] ;
Execute this program. Your output should be identical to the
previous report.
The next version of the report displays the output in a matrix format. The quintile1 values form the rows and the quintile2 values form the columns. Read in the file example3.g:
List defineMethod:
[ | quintile: block1 by: block2 |
#--- Titles
"Cross Tabular Quintile Distribution" centerNL: 70 . print ;
newLine print ;
block1 asString print: 10 ; " | " print ;
block2 asString concat: " Quintiles" . centerNL: 45 . print ;
"Quintiles" print: 10 ; " | " print ;
5 sequence do: [ ^self print: 9 ] ;
newLine print ;
"-" fill: 70 . printNL ;
#--- Setup
!quintileList <- ^self
quintileUp: block1 .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: block2 .
extendBy: [ !quintile2 <- quintile ] ;
#--- Analyze
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ !currentQuintile1 <- ^self ;
currentQuintile1 print: -10 ;
" | " print ;
5 sequence
do: [ !q <- ^self ;
^my groupList
select: [ quintile2 = ^my q ] . count print: 9 ;
] ;
newLine print ;
] ;
] ;
### And Run It
Company masterList
quintile: [ sales ] by: [ price / earningsPerShare ]
Execute this program. You should see:
Cross Tabular Quintile Distribution
[sales] | [price / earningsPerShare] Quintiles
Quintiles | 1 2 3 4 5
1 | 2 2 0 2 2
2 | 1 2 3 1 1
3 | 1 1 1 2 2
4 | 3 0 2 1 2
5 | 1 3 1 2 0
The cross tabular report format shows sales quintiles as rows
and price-earnings quintiles as columns. The number of companies
in the row/column combination is displayed in the table's cells.
Suppose you wanted to display more than just the count. The version example3.h adds an additional block as a third parameter. The average value for this block is printed for each cell:
List defineMethod:
[ | quintile: block1 by: block2 for: block3 | #-- add block3
#--- Titles
"Cross Tabular Quintile Distribution" centerNL: 70 . print ;
" " print: 30 ; " For: " print ; block3 printNL ;
newLine print ;
block1 asString print: 10 ; " | " print ;
block2 asString concat: " Quintiles" . centerNL: 45 . print ;
"Quintiles" print: 10 ; " | " print ;
5 sequence do: [ ^self print: 9 ] ;
newLine print ;
"-" fill: 70 . printNL ;
#--- Setup
!quintileList <- ^self
quintileUp: block1 .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: block2 .
extendBy:
[ !quintile2 <- quintile ;
!block3Value <- ^self send: ^my block3 ; #-- store block3 value
] ;
#--- Analyze
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ !currentQuintile1 <- ^self ;
currentQuintile1 print: -3 ;
"Count" print: 7 ; " | " print ;
5 sequence
do: [ !q <- ^self ;
^my groupList
select: [ quintile2 = ^my q ] . count print: 9 ;
] ;
newLine print ;
#-- Add Average Calc Based on block Value
"Average" print: -10 ; " | " print ;
5 sequence
do: [ !q <- ^self ;
^my groupList
select: [ quintile2 = ^my q ] .
average: [ block3Value ] . print: 9 ;
] ;
newLine print ;
] ;
] ;
### And Run It
Company masterList
quintile: [ sales ]
by: [ price / earningsPerShare ]
for: [ netIncome]
Execute this program. You should see:
Cross Tabular Quintile Distribution
For: [netIncome]
[sales] | | [price / earningsPerShare] Quintiles
Quintiles | 1 2 3 4 5
1 Count | 2 2 0 2 2
Average | 1016.12 579.95 NA 477.00 314.90
2 Count | 1 2 3 1 1
Average | 313.00 906.55 1393.77 478.00 450.09
3 Count | 1 1 1 2 2
Average | 374.40 728.30 1224.00 1239.41 403.50
4 Count | 3 0 2 1 2
Average | 1116.10 NA 1184.00 1007.00 2378.50
5 Count | 1 3 1 2 0
Average | 4625.20 3486.10 4840.00 1651.00 NA
This method requires 3 blocks as parameters. The first two are used to
form the quintiles. The third is used to compute the average. After
the counts are printed for each column, a second line of average values
is displayed.
Notice that the section of code that generates the averages duplicates much of the code used to generate the counts. The file example3.i streamlines this:
List defineMethod:
[ | quintile: block1 by: block2 for: block3 |
#--- Titles
"Cross Tabular Quintile Distribution" centerNL: 70 . print ;
" " print: 30 ; " For: " print ; block3 printNL ;
newLine print ;
block1 asString print: 10 ; " | " print ;
block2 asString concat: " Quintiles" . centerNL: 45 . print ;
"Quintiles" print: 10 ; " | " print ;
5 sequence do: [ ^self print: 9 ] ;
newLine print ;
"-" fill: 70 . printNL ;
#--- Setup
!quintileList <- ^self
quintileUp: block1 .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: block2 .
extendBy: [ !quintile2 <- quintile ;
!block3Value <- ^self send: ^my block3 ;
] ;
#--- Analyze
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ !currentQuintile1 <- ^self ;
#-- For each of the 5 quintiles, extract the sublist
#-- send returns the last statement in block.
#-- quintileList is therefore a list of 5 lists
!quintileList <- 5 sequence
send: [ !q <- ^self ;
^my groupList select: [ quintile2 = ^my q ]
] ;
#-- Now just use quintileList to display stats
currentQuintile1 print: -3 ;
"Count" print: 7 ; " | " print ;
quintileList do: [ ^self count print: 9 ] ;
newLine print ;
"Average" print: -10 ; " | " print ;
quintileList do: [ ^self average: [ block3Value ] . print: 9 ] ;
newLine print ;
"Std Dev" print: -10 ; " | " print ;
quintileList do: [ ^self stdDev: [ block3Value ] . print: 9 ] ;
newLine print ;
newLine print ;
] ;
] ;
### And Run It
Company masterList
quintile: [ sales ]
by: [ price / earningsPerShare ]
for: [ netIncome]
Execute this program. You should see:
Cross Tabular Quintile Distribution
For: [netIncome]
[sales] | [price / earningsPerShare] Quintiles
Quintiles 1 2 3 4 5
1 Count 2 2 0 2 2
Average 1016.12 579.95 0.00 477.00 314.90
Std Dev 121.33 158.95 NA 167.00 279.90
2 Count 1 2 3 1 1
Average 313.00 906.55 1393.77 478.00 450.09
Std Dev 0.00 271.45 192.23 0.00 0.00
3 Count 1 1 1 2 2
Average 374.40 728.30 1224.00 1239.41 403.50
Std Dev 0.00 0.00 0.00 120.59 76.50
4 Count 3 0 2 1 2
Average 1116.10 0.00 1184.00 1007.00 2378.50
Std Dev 151.48 NA 602.00 0.00 536.50
5 Count 1 3 1 2 0
Average 4625.20 3486.10 4840.00 1651.00 0.00
Std Dev 0.00 1473.92 0.00 393.00 NA
Suppose you wanted to eliminate the NAs from printing in the standard deviation lines. You could test each value using the expression:
quintileList
do: [ !ave <- ^self average: [ block3Value ] ;
ave isntNA
ifTrue: [ ave print: 9 ] ifFalse: [ " " print: 9 ] ;
] ;
Alternatively, you could define a new version of the print message
and implement versions for various classes that need special behavior.
The file example3.j defines several versions of a message named
write: which performs the standard print: function
by default. The quintile:by:for: method has been modified
to use this new message:
#-- Define write: at Object to do standard print:
Object
defineMethod:
[ | write: format |
^self print: format
] ;
#-- Define write: at Number to print with commas in format
Number
defineMethod:
[ | write: format |
^self printWithCommas: format
] ;
#-- Define write: at NA to print with blanks
NA
defineMethod:
[ | write: format |
" " print: format
] ;
#-- Fix method to use write: with average and stdDev
List defineMethod:
[ | quintile: block1 by: block2 for: block3 |
"Cross Tabular Quintile Distribution" centerNL: 70 . print ;
" " print: 30 ; " For: " print ; block3 printNL ;
newLine print ;
block1 asString print: 10 ; " | " print ;
block2 asString concat: " Quintiles" . centerNL: 45 . print ;
"Quintiles" print: 10 ; " | " print ;
5 sequence do: [ ^self print: 9 ] ;
newLine print ;
"-" fill: 70 . printNL ;
!quintileList <- ^self
quintileUp: block1 .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: block2 .
extendBy: [ !quintile2 <- quintile ;
!block3Value <- ^self send: ^my block3 ;
] ;
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ !currentQuintile1 <- ^self ;
!quintileList <- 5 sequence
send: [ !q <- ^self ;
^my groupList select: [ quintile2 = ^my q ]
] ;
currentQuintile1 print: -3 ;
"Count" print: 7 ; " | " print ;
quintileList do: [ ^self count print: 9 ] ;
newLine print ;
"Average" print: -10 ; " | " print ;
quintileList do: [ ^self average: [ block3Value ] . write: 9 ] ;
newLine print ;
"Std Dev" print: -10 ; " | " print ;
quintileList do: [ ^self stdDev: [ block3Value ] . write: 9 ] ;
newLine print ;
newLine print ;
] ;
] ;
### And Run It
Company masterList
quintile: [ sales ] by: [ price / earningsPerShare ] for: [ netIncome]
Execute this program. You should see:
Cross Tabular Quintile Distribution
For: [netIncome]
[sales] | [price / earningsPerShare] Quintiles
Quintiles 1 2 3 4 5
1 Count 2 2 0 2 2
Average 1,016.12 579.95 0.00 477.00 314.90
Std Dev 121.33 158.95 167.00 279.90
2 Count 1 2 3 1 1
Average 313.00 906.55 1,393.77 478.00 450.09
Std Dev 0.00 271.45 192.23 0.00 0.00
3 Count 1 1 1 2 2
Average 374.40 728.30 1,224.00 1,239.41 403.50
Std Dev 0.00 0.00 0.00 120.59 76.50
4 Count 3 0 2 1 2
Average 1,116.10 0.00 1,184.00 1,007.00 2,378.50
Std Dev 151.48 602.00 0.00 536.50
5 Count 1 3 1 2 0
Average 4,625.20 3,486.10 4,840.00 1,651.00 0.00
Std Dev 0.00 1,473.92 0.00 393.00
Notice that the numbers now contain commas and the NA values have been
replaced with blanks.
The final version of this report adds a total column for each quintile row. Read the file example3.k:
List defineMethod:
[ | quintile: block1 by: block2 for: block3 |
"Cross Tabular Quintile Distribution" centerNL: 70 . print ;
" " print: 30 ; " For: " print ; block3 printNL ;
newLine print ;
block1 asString print: 10 ; " | " print ;
block2 asString concat: " Quintiles" . centerNL: 45 . print ;
"Quintiles" print: 10 ; " | " print ;
5 sequence do: [ ^self print: 9 ] ;
"Totals" printNL: -9 ; #-- Add New Heading
"-" fill: 70 . printNL ;
!quintileList <- ^self
quintileUp: block1 .
extendBy: [ !quintile1 <- quintile ] .
quintileUp: block2 .
extendBy: [ !quintile2 <- quintile ;
!block3Value <- ^self send: ^my block3 ;
] ;
quintileList groupedBy: [ quintile1 ] .
sortUp: [ ^self ] .
do: [ !currentQuintile1 <- ^self ;
!quintileList <- 5 sequence
send: [ !q <- ^self ;
^my groupList select: [ quintile2 = ^my q ]
] ;
currentQuintile1 print: -3 ;
"Count" print: 7 ; " | " print ;
quintileList do: [ ^self count print: 9 ] ;
groupList count printNL ; #-- Total count
"Average" print: -10 ; " | " print ;
quintileList do: [ ^self average: [ block3Value ] . write: 9 ] ;
groupList average: [ block3Value ] . printNL ; #--- Ave
"Std Dev" print: -10 ; " | " print ;
quintileList do: [ ^self stdDev: [ block3Value ] . write: 9 ] ;
groupList stdDev: [ block3Value ] . printNL ; #-- StdDev
newLine print ;
] ;
] ;
### And Run It
Company masterList
quintile: [ sales ] by: [ price / earningsPerShare ] for: [ netIncome]
Execute this program. You should see:
Cross Tabular Quintile Distribution
For: [netIncome]
[sales] [price / earningsPerShare] Quintiles
Quintiles 1 2 3 4 5 Totals
1 Count 2 2 0 2 2 8
Average 1,016.12 79.95 0.00 477.00 314.90 596.99
Std Dev 121.33 158.95 167.00 279.90 322.55
2 Count 1 2 3 1 1 8
Average 313.00 906.55 1,393.77 478.00 450.09 904.44
Std Dev 0.00 271.45 192.23 0.00 0.00 462.96
3 Count 1 1 1 2 2 7
Average 374.40 728.30 1,224.00 1,239.41 403.50 801.79
Std Dev 0.00 0.00 0.00 120.59 76.50 397.74
4 Count 3 0 2 1 2 8
Average 1,116.10 0.00 1,184.00 1,007.00 2,378.50 1435.04
Std Dev 151.48 602.00 0.00 536.50 685.93
5 Count 1 3 1 2 0 7
Average 4,625.20 3,486.10 4,840.00 1,651.00 0.00 3317.93
Std Dev 0.00 1,473.92 0.00 393.00
| Vision Basics | Creating a Demo Database | Single Object Access | Using Lists | Using Dates and TimeSeries |