Learning From Cases Using Excel

Netica

1. Create an Excel Spreadsheet

2. Add Nodes to the Net

3. Discretize or Combine States

4. Add Link Structure

5. Learn CPTs

6. Use the Resulting Bayes Net

Structure:  The spreadsheet should be arranged with one column for each variable of interest (named after its node) and one row for each case (aka "record").  The cell at the intersection of a row and a column gives the value of that column's variable for that row's case.

The first row must contain the names of the variables.  Each name will correspond to a node in the Bayes net, although the Excel file may have some variables that don't appear in the net and vice versa.  If desired, you can give each case an identification number.
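As an illustration of this layout, here is a minimal Python sketch that builds a header row of variable names followed by one row per case, and renders it as CSV text that Excel can open.  The variable names (CaseID, Weather, Sprinkler, GrassWet), the case values and the helper names are all hypothetical; they are not taken from any Netica example.

```python
import csv
import io


def build_table(variables, cases):
    """Return sheet rows: a header row of variable names, then one row per case."""
    rows = [list(variables)]
    for case in cases:
        # Each case is a dict mapping variable name -> value.
        # A missing variable raises KeyError, so gaps are noticed early.
        rows.append([case[name] for name in variables])
    return rows


def to_csv_text(rows):
    """Render the rows as CSV text, one line per spreadsheet row."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical variables and cases, for illustration only.
VARIABLES = ["CaseID", "Weather", "Sprinkler", "GrassWet"]
CASES = [
    {"CaseID": 1, "Weather": "rain", "Sprinkler": "off", "GrassWet": "yes"},
    {"CaseID": 2, "Weather": "sun", "Sprinkler": "on", "GrassWet": "yes"},
    {"CaseID": 3, "Weather": "sun", "Sprinkler": "off", "GrassWet": "no"},
]

table = build_table(VARIABLES, CASES)
csv_text = to_csv_text(table)
```

Here `table[0]` is the header row and every later entry is one case, which mirrors the row and column arrangement described above.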

Representing Prior Knowledge:  To express prior knowledge as percentages in your file, you can use NumCases to denote the percentage of cases that a row represents.  For example, if you had 100 cases and 10 of them have the same "set of findings", you can write 10% in the row that matches that set of findings.  This indicates that 10% of the cases you have seen had exactly these findings.  Netica will then run that line through 10 times (or whatever represents 10% of the cases).
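The arithmetic behind such a percentage can be sketched in Python as follows.  This is only an illustration of collapsing identical cases into one row with a count and a percentage; the function name and the sample findings are hypothetical, and the sketch does not show how Netica itself reads the value.

```python
from collections import Counter


def collapse_duplicates(cases):
    """Group identical cases and report each group's count and percentage.

    cases: a list of tuples, each tuple being one case's set of findings.
    Returns a list of (findings, count, percent) in first-seen order.
    """
    counts = Counter(cases)
    total = len(cases)
    return [(findings, n, 100.0 * n / total) for findings, n in counts.items()]


# Hypothetical data: 20 cases, 2 of which share the findings ("rain", "wet").
cases = [("rain", "wet")] * 2 + [("sun", "dry")] * 18
summary = collapse_duplicates(cases)
```

With this data, the ("rain", "wet") findings account for 2 of 20 cases, so that row would carry 10%.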

Link:  The Windows database software must be able to identify the Excel worksheet as a database table.  It may do this automatically.  If it does not, select all the relevant cells in Excel, then type any name into the Name Box (the small box to the left of the formula bar, used for defining names) and press Enter.  Finally, save the file.  Note: Netica on a Mac reports an error when learning from an Excel file.  To learn from your data there, you must first convert it to a text file.
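For the conversion to a text file, one possible route is to export the worksheet from Excel as CSV and then rewrite it as tab-delimited text.  The Python sketch below does the second half using only the standard library.  It assumes that a tab-delimited text file with the variable names on the first line is acceptable input, which you should check against the Netica documentation for case files; the function name is hypothetical.

```python
import csv
import io


def csv_to_tab_text(csv_text):
    """Convert CSV text (as exported from Excel) to tab-delimited text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        # The first row holds the variable names; later rows are cases.
        writer.writerow(row)
    return out.getvalue()


# Hypothetical two-variable sheet, for illustration only.
example = "Weather,GrassWet\nrain,yes\nsun,no\n"
converted = csv_to_tab_text(example)
```

To convert a real export, read the CSV file's contents into a string, pass it to `csv_to_tab_text`, and write the result to a `.txt` file.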

Subset of Cells/Choosing Table:  If you already have several tables defined in the spreadsheet, or you want Netica to just use a subset of the cells, select the set of cells you want to use, and define it with the name “ForNetica”, as described above.  Whenever there is a table with that name, Netica will use it instead of any other.
