Vision Tutorial 2: Loading Sample Data

Overview

The sample tutorials and case studies are based on the data described in this section. The sample database provides basic information about companies. Several sample data files are supplied, along with sample Vision code that reads these files and creates the database.

By default, all files referenced are in the directory /localvision/samples/general/. Check with your Vision administrator if you do not see the following files in the directory:

  File Name      Function
  -------------  ------------------------------------------------------------
  sector.dat     Vertical bar-delimited file containing sector identifiers and names.
  industry.dat   Vertical bar-delimited file containing industry identifiers and names and a sector identifier.
  company.dat    Vertical bar-delimited file containing basic company information: id, name, price, and industry id.
  company2.dat   Vertical bar-delimited file containing additional company information for different time points: id, year, sales, earnings, income, and common equity.
  sample.load    File containing Vision code that defines the Company, Industry, and Sector classes and populates these classes using the data in the .dat files.

You can read the sample.load file into your favorite Vision Editor or you can execute the code in this file using the asFileContents evaluate expression:

      "/localvision/samples/general/sample.load" asFileContents evaluate ;
    

Note: By default, the sample.load file is set up for a Unix environment. If you are using a Windows NT platform, this location may be prefixed by a drive and optional path (e.g., d:/visiondb/localvision/samples/general/sample.load). In that case, your Vision Administrator must also edit the sample.load file so that the file locations it references are correct.


Loading Sector Data

The file sector.dat contains one line per sector. Each line contains a sector identifier and a sector name, separated by the vertical bar ( | ) character. The first few lines of this file are displayed below:

  NONDUR|Non-durables
  DUR|Durables

The techniques described in Loading Data Files are used to parse this data file to create new sectors.

This subset of the Vision code in the sample.load file is used for defining sectors:

  #--------------------
  #  Create The Class
  #--------------------
  Entity createSubclass: "Sector" ;
     
  #--------------------
  #  Read and process the vertical bar-delimited file
  #  Fields are: id | name
  #--------------------
  "/localvision/samples/general/sector.dat" asFileContents asLines
  do: [ !fields <- ^self breakOn: "|" ;
        !id <- fields at: 1 . stripBoundingBlanks ;
        !name <- fields at: 2 . stripBoundingBlanks ;
        id isBlank not && name isBlank not
           ifTrue: [ ^global Sector createInstance: id . setNameTo: name ] ;
      ] ;

The first step is to create the new class, Sector, using the createSubclass: message. By creating Sector as a subclass of Entity, the Named Sector dictionary is automatically created for storing references to individual sector instances.

The file sector.dat is then processed to create and name the Sector instances. The asFileContents message returns the contents of the file as a string, the asLines message returns a list of strings corresponding to the individual rows in the file, and the breakOn: message breaks each line into a list of string fields using the vertical bar character as a delimiter. The variable fields is a list of strings in which the first element is the sector identifier and the second element is the sector name.

If a non-blank sector id and name are provided, a new sector is created using the createInstance: message. The id value is used to name the new sector and will be the value of the new sector's code property. The name value is used to update the name property of the new sector. Since the property code is defined at the Object class and the property name is defined at the Entity class, these properties do not need to be explicitly defined for the Sector class.
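
For readers more familiar with general-purpose languages, the parse-and-create logic described above can be sketched in Python. This is an illustrative analogy only, not Vision code: the function name load_sectors and the plain dictionary standing in for the Named Sector dictionary are assumptions made for this sketch.

```python
def load_sectors(text):
    """Parse 'id|name' lines into a dict keyed by sector id (hypothetical helper)."""
    sectors = {}
    for line in text.splitlines():        # plays the role of asLines
        fields = line.split("|")          # plays the role of breakOn: "|"
        if len(fields) < 2:
            continue                      # skip lines that lack either field
        # Python lists are 0-based; the Vision code uses at: 1 and at: 2.
        sector_id = fields[0].strip()     # plays the role of stripBoundingBlanks
        name = fields[1].strip()
        if sector_id and name:            # both values must be non-blank
            sectors[sector_id] = {"code": sector_id, "name": name}
    return sectors


# The first two rows come from sector.dat; the third is a made-up bad row.
sample = "NONDUR|Non-durables\nDUR|Durables\n |Missing id\n"
sectors = load_sectors(sample)
print(sectors["DUR"]["name"])
```

The row with a blank id is skipped, mirroring the isBlank not test in the Vision code.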

After the sectors have been created you can display the list of all sectors using:

  Sector masterList
  do: [ code print: 10 ; 
        name printNL ;
      ] ;

You can reference a specific sector using:

  Named Sector at: "DUR" . name printNL ;


Loading Industry Data

The file industry.dat contains one line per industry. Each line contains an industry identifier, an industry name, and a sector identifier, separated by the vertical bar ( | ) character. The first few lines of this file are displayed below:

  FBT|Food, Beverage, Tobacco|NONDUR
  HH|Household Products|NONDUR

The techniques described in Loading Data Files are used to parse this data file to create new industries.

This subset of the Vision code in the sample.load file is used for defining industries:

  #--------------------
  #  Create The Class
  #--------------------
  Entity createSubclass: "Industry" ;
     
  #--------------------
  #  Define the sector property
  #--------------------
  Industry defineFixedProperty: 'sector' withDefault: Sector ;
     
  #--------------------
  #  Read and process the vertical bar-delimited file
  #  Fields are: id | name | sectorId
  #--------------------
  "/localvision/samples/general/industry.dat" asFileContents asLines
  do: [ !fields <- ^self breakOn: "|" ;
        !id <- fields at: 1 . stripBoundingBlanks ;
        !name <- fields at: 2 . stripBoundingBlanks ;
        !sectID <- fields at: 3 . stripBoundingBlanks ;
        !sector <- ^global Named Sector uniformAt: sectID ;
        id isBlank not && name isBlank not
        ifTrue: 
          [ !newOne <- ^global Industry createInstance: id . 
                setNameTo: name;
            newOne :sector <- sector ;
          ] ;
      ] ;

The first step is to create the new class, Industry, using the createSubclass: message. By creating Industry as a subclass of Entity, the Named Industry dictionary is automatically created for storing references to individual industry instances.

The next step is to define an extra property at the Industry class which will be used to store the value of an industry's sector. Since the withDefault: parameter is supplied for this new property, new industry instances created with the createInstance: message will have their sector property initialized to the default Sector.

The file industry.dat is then processed to create and name the Industry instances. The asFileContents message returns the contents of the file as a string, the asLines message returns a list of strings corresponding to the individual rows in the file, and the breakOn: message breaks each line into a list of string fields using the vertical bar character as a delimiter. The variable fields is a list of strings in which the first element is the industry identifier, the second is the industry name, and the third is the sector id. The variable sector holds the actual sector object that corresponds to the sector id string. The uniformAt: message is used so that the default sector is returned if the supplied id does not match an existing sector.

If a non-blank industry id and name are provided, a new industry is created using the createInstance: message. The id value is used to name the new industry and will be the value of the new industry's code property. The name value is used to update the name property of the new industry and the sector object is assigned to the new industry's sector property.
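
The fallback behavior described for uniformAt: (return the default sector when the id is unknown) can be illustrated with a short Python sketch. This is an analogy, not Vision code: DEFAULT_SECTOR, load_industries, and the row with the unknown sector id NOPE are all made up for illustration.

```python
# Hypothetical stand-in for the default Sector instance.
DEFAULT_SECTOR = {"code": "Default", "name": "Default Sector"}


def load_industries(text, sectors):
    """Parse 'id|name|sectorId' lines, linking each industry to a sector."""
    industries = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 3:
            continue                      # skip incomplete lines
        ind_id, name, sect_id = fields[:3]
        # Like uniformAt: in the text above, fall back to the default
        # sector when sect_id does not match a known sector.
        sector = sectors.get(sect_id, DEFAULT_SECTOR)
        if ind_id and name:
            industries[ind_id] = {"code": ind_id, "name": name, "sector": sector}
    return industries


sectors = {"NONDUR": {"code": "NONDUR", "name": "Non-durables"}}
# The first row comes from industry.dat; the second is invented to show the fallback.
sample = "FBT|Food, Beverage, Tobacco|NONDUR\nXYZ|Example Industry|NOPE\n"
industries = load_industries(sample, sectors)
print(industries["FBT"]["sector"]["name"])
print(industries["XYZ"]["sector"]["name"])
```

Because every industry ends up pointing at some sector object, later navigation such as sector name never has to deal with a missing value.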

After the industries have been created you can display the list of all industries grouped into sectors using:

  Industry masterList
       groupedBy: [ sector ] .
  do: [ "Sector: " print ; code print: 10 ; name printNL ;
        groupList 
        do: [ "   Ind: " print ; 
              code print: 10 ;
              name printNL ;
            ] ;
        newLine print ;
      ] ;

You can reference a specific industry using:

  Named Industry at: "FBT" . name printNL ;

You can reference a specific industry's sector using:

  Named Industry at: "FBT" . sector name printNL ;


Loading Company Data

The file company.dat contains one line per company. Each line contains a company identifier, a company name, a price, and an industry identifier separated by the vertical bar ( | ) character. The first few lines of this file are displayed below:

  AET|Aetna Life & Cas| 40.875|INSUR
  T|American Tel & Tel| 43.625|TEL

The techniques described in Loading Data Files are used to parse this data file to create new companies.

This subset of the Vision code in the sample.load file is used for defining companies:

  #--------------------
  #  Create The Class
  #--------------------
  Entity createSubclass: "Company" ;
     
  #--------------------
  #  Define some basic properties
  #--------------------
  Company defineFixedProperty: 'ticker' ;
  Company defineFixedProperty: 'price' ;
  Company defineFixedProperty: 'industry' withDefault: Industry ;
  
  #--------------------
  #  Read and process the vertical bar-delimited file
  #  Fields are: id | name | price | industryId
  #--------------------
  "/localvision/samples/general/company.dat" asFileContents asLines
  do: [ !fields <- ^self breakOn: "|" ;
        !id <- fields at: 1 . stripBoundingBlanks ;
        !name <- fields at: 2 . stripBoundingBlanks ;
        !price <- fields at: 3 . asNumber ;
        !indID <- fields at: 4 . stripBoundingBlanks ;
        !industry <- ^global Named Industry uniformAt: indID ;
        id isBlank not && name isBlank not
        ifTrue: 
          [ !newOne <- ^global Company createInstance: id . 
               setNameTo: name;
            newOne :ticker <- id ;
            newOne :price <- price ;
            newOne :industry <- industry ;
          ] ;
      ] ;

The first step is to create the new class, Company, using the createSubclass: message. By creating Company as a subclass of Entity, the Named Company dictionary is automatically created for storing references to individual company instances.

The next step is to define some additional properties at the Company class that will be used to store the ticker symbol, price, and industry for each instance. Since the withDefault: parameter is supplied for the industry property, new company instances created with the createInstance: message will have this property initialized to the default Industry. The ticker and price properties will have NA values for new company instances.

The file company.dat is then processed to create and name the Company instances. The asFileContents message returns the contents of the file as a string, the asLines message returns a list of strings corresponding to the individual rows in the file, and the breakOn: message breaks each line into a list of string fields using the vertical bar character as a delimiter. The variable fields is a list of strings in which the first element is the company identifier, the second is the company name, the third is the price (which is converted from a string to a number), and the fourth is an industry id. The variable industry holds the actual industry object that corresponds to the industry id string. The uniformAt: message is used so that the default industry is returned if the supplied id does not match an existing industry.

If a non-blank company id and name are provided, a new company is created using the createInstance: message. The id value is used to name the new company and will be the value of the new company's code property; it is also assigned to the ticker property. The name value is used to update the name property of the new company, the price is assigned to the price property, and the industry object is assigned to the new company's industry property.
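
The string-to-number conversion of the price field can be sketched in Python as follows. This is an analogy, not Vision code: as_number and parse_company are hypothetical names, and returning None (standing in for NA) for non-numeric text is an assumption of the sketch, since this section does not describe how asNumber treats such input.

```python
def as_number(text):
    """Return a float, or None (standing in for NA) if text is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def parse_company(line):
    """Parse one 'id|name|price|industryId' line; return None for bad rows."""
    fields = line.split("|")
    if len(fields) < 4:
        return None
    company_id = fields[0].strip()
    name = fields[1].strip()
    if not company_id or not name:
        return None                       # id and name must be non-blank
    return {
        "code": company_id,
        "ticker": company_id,             # the id doubles as the ticker
        "name": name,
        "price": as_number(fields[2]),
        "industry_id": fields[3].strip(),
    }


company = parse_company("AET|Aetna Life & Cas| 40.875|INSUR")
print(company["ticker"], company["price"])
```

Note that the leading blank in the price field is harmless once the text is converted to a number.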

After the companies have been created you can display the list of all companies grouped into industries using:

  Company masterList
       groupedBy: [ industry ] .
  do: [ "Industry: " print ; code print: 10 ; name printNL ;
        groupList 
        do: [ code print: 10 ;
              name print: 30 ; 
              price printNL ;
            ] ;
        newLine print ;
      ] ;

You can reference a specific company using:

  Named Company at: "T" . name printNL ;

You can reference a specific company's industry using:

  Named Company at: "T" . industry name printNL ;

You can reference a specific company's sector using:

  Named Company at: "T" . industry sector name printNL ;
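
The grouped company report shown earlier in this section follows a common group-then-iterate pattern. A minimal Python sketch of that pattern is given below as an analogy, not as Vision code. The function name group_by_industry is hypothetical, industries are represented by their id strings for brevity, and the column widths 10 and 30 assume that print: 10 and print: 30 specify field widths.

```python
from collections import defaultdict


def group_by_industry(companies):
    """Collect companies into lists keyed by their industry id."""
    groups = defaultdict(list)
    for company in companies:
        groups[company["industry"]].append(company)
    return dict(groups)


# Both rows come from the company.dat excerpt shown above.
companies = [
    {"code": "AET", "name": "Aetna Life & Cas", "price": 40.875, "industry": "INSUR"},
    {"code": "T", "name": "American Tel & Tel", "price": 43.625, "industry": "TEL"},
]

groups = group_by_industry(companies)
for industry, members in groups.items():
    print("Industry:", industry)
    for c in members:
        # Pad code and name to fixed widths before printing the price.
        print(f"{c['code']:<10}{c['name']:<30}{c['price']}")
    print()
```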


Updating TimeSeries Properties

The file company2.dat contains multiple lines per company, with each line representing a specific year's data for the company. Each line contains a company identifier, a year, and that year's sales, earnings per share, net income, and common equity values for the company, separated by the vertical bar ( | ) character. The first few lines of this file are displayed below:

  AET|90| 22114.11| 7.48| 920.60| 6084.80
  AET|89| 20482.91| 6.18| 1043.10| 5497.19

The techniques described in Loading Data Files are used to parse this data file to update time series properties for existing companies.

This subset of the Vision code in the sample.load file is used for updating the company data:

  #--------------------
  #  Define some time series properties
  #--------------------
  Company define: 'sales' ;
  Company define: 'earningsPerShare' ;
  Company define: 'netIncome' ;
  Company define: 'commonEquity' ;
     
  #--------------------
  #  Read and process the vertical bar-delimited file
  #  Fields are: id | year | sales | earningsPerShare | netIncome | commonEquity
  #--------------------
  "/localvision/samples/general/company2.dat" asFileContents asLines
  do: [ !fields <- ^self breakOn: "|" ; 
        !id <- fields at: 1 . stripBoundingBlanks ;
        !company <- ^global Named Company at: id ;
        !date <- fields at: 2 . asNumber asDate ;
        !sales <- fields at: 3 . asNumber ;
        !eps <- fields at: 4 . asNumber ;
        !netinc <- fields at: 5 . asNumber ;
        !commone <- fields at: 6 . asNumber ;
  
        company isntNA
        ifTrue: 
          [ company :sales asOf: date put: sales ;
            company :earningsPerShare asOf: date put: eps ;
            company :netIncome asOf: date put: netinc ;
            company :commonEquity asOf: date put: commone ;
          ] 
        ifFalse:
          [ ">>> Company not Found: " print ; id printNL ] ;
      ] ;

The first step is to create four new time series properties for storing the sales, earnings per share, net income, and common equity values. By default, there are no data points in these time series, so the value NA is returned until data is added.

The file company2.dat is then processed. The asFileContents message returns the contents of the file as a string, the asLines message returns a list of strings corresponding to the individual rows in the file, and the breakOn: message breaks each line into a list of string fields using the vertical bar character as a delimiter. The variable fields is a list of strings in which the first element is the company identifier, the second is the date (converted from a string to a number and then to a date), and the third through sixth are the sales, earnings per share, net income, and common equity values (converted from strings to numbers). The variable company holds the actual company object that corresponds to the company id string. Because the at: message is used to look up the id, the value NA is returned if the id does not match an existing company.

If the company is found, the values of the time series properties sales, earningsPerShare, netIncome, and commonEquity are updated as of the date supplied. If the company is not found, a warning message is displayed.
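
The idea of storing dated values and reading them back "as of" a date can be sketched in Python. This is an analogy, not Vision code, and it makes two assumptions that this section does not state: an as-of lookup returns the most recent point on or before the requested date, and None stands in for NA. The TimeSeries class and its method names are hypothetical, and years are kept as plain integers rather than dates.

```python
import bisect


class TimeSeries:
    """Minimal dated-value store with as-of lookup (illustrative only)."""

    def __init__(self):
        self._dates = []                  # kept sorted in ascending order
        self._values = []                 # parallel to self._dates

    def put(self, date, value):
        """Store value at date, replacing any existing point at that date."""
        i = bisect.bisect_left(self._dates, date)
        if i < len(self._dates) and self._dates[i] == date:
            self._values[i] = value
        else:
            self._dates.insert(i, date)
            self._values.insert(i, value)

    def as_of(self, date):
        """Return the latest value on or before date, or None if there is none."""
        i = bisect.bisect_right(self._dates, date)
        return self._values[i - 1] if i > 0 else None

    def latest(self):
        """Return the most recent value, or None for an empty series."""
        return self._values[-1] if self._values else None


# Values taken from the AET rows of company2.dat shown above.
sales = TimeSeries()
sales.put(90, 22114.11)
sales.put(89, 20482.91)
print(sales.as_of(89), sales.latest())
```

An empty series, or a lookup earlier than the first data point, yields None in this sketch, which corresponds to the NA result described above for properties that have no data points.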

After the data has been updated, you can look at a specific sales value for a company using:

  Named Company T sales printNL;                #- most recent
  Named Company T :sales asOf: 88 . printNL ;   #- as of 88
  88 evaluate:                                  #- as of 88
      [ Named Company T sales printNL ] ;       #- alternative

You can display all the sales values for a specific company using:

  Named Company at: "T" . :sales displayAll ;
