Skip to content

File Analysis

The File Info Window contains four tabs:

  1. Statistics
  2. Column Attributes
  3. Unique Values
  4. Pivot Table

Statistics

Statistics Tab
Statistics - All Columns
Statistics - Selected Cells
Statistics - Selected Cells by Columns
Statistics - Selected Columns

If your file contains numerical data, the Statistics tab will compute basic statistical measures. This includes:

  • Number Of Data Points
  • Sum
  • Mean
  • Variance
  • Standard Deviation
  • Median
  • Min
  • Max
  • 20th Percentile
  • 40th Percentile
  • 60th Percentile
  • 80th Percentile

Note that variance and standard deviations are biased.

The Statistics - Selected Cells command operates on all selected cells containing numerical values regardless of which column they're in. The Statistics - Selected Cells by Columns command segregates the selected cells by column and evaluates them.


Buttons

You can copy part or all of the results to the clipboard using the Whole Table or Selected Values buttons.

The measures will not update automatically if the underlying data is changed. You can tell it to update the same columns that you originally had it analyze with the Same Columns button. If you want it to update new columns, you can select them and click the Selected Columns button. If you want to analyze all columns, select the All Columns button.


Column Attributes

Column Attributes
Column Attributes - All Columns
Column Attributes - Selected Columns

The Column Attributes tab gives you the following information about each column:

  • Number of Fields
  • Number of Non-Empty Fields
  • Number of Empty Fields
  • Percent Non-Empty Fields
  • Minimum Field Length
  • Maximum Field Length
  • Data Type (String, Number, or Date)

Dates can be in virtually any format and it will detect them.


Buttons

You can also copy part or all of the results to the clipboard using the Whole Table or Selected Values buttons.

Like the Statistics tab, these values will not update automatically if the underlying data is changed. You can tell it to update the same columns that you originally had it analyze with the Same Columns button. If you want it to update new columns, you can select them and click the Selected Columns button. If you want to analyze all columns, select the All Columns button.


Unique Values

Unique Values
Unique Values - Selected Column
Select Values in Selected Column to Filter

The Unique Values tab tells you how many times each distinct value in a column appears and the percentage of entries in the column it occupies. It also provides a histogram to graphically compare the number of appearances of the various values.

It can also be used to pick distinct values to filter. You can either filter them in (i.e. view only rows with selected values) or filter out them (i.e. view all rows except those with the selected values).


Buttons

You can copy part or all of the results to the clipboard using the Whole Table or Selected Values buttons.

Like the other two tabs, the values will not update automatically if the underlying data is changed. You can tell it to update the data with the Same Column button. If you want data on a diffent column, select it in the main window and click the Selected Column button.

To filter values, select whichever value or values you want (it doesn't have to be the value column. You can select the Count or Percent column). Click the Selected Values In or Selected Values Out button to filter the values in or out.


Pivot Table

Pivot Table
Pivot Table - Selected Columns

The Pivot Table shows statistics on groups of data. For instance, if you have sales data for salesmen grouped into regions North, East, South, and West, the Pivot Table can show you the grand total for each region or the average of each salesman in each region.

You'll need to provide it two types of data - headers and data. The headers are the groups (e.g. North, East, South, and West) and the data are numerical values to operate on (e.g. sales figures). There is no limit to how many header columns or data columns you can use. Header columns can go on the left of the pivot table or along the top. You can place header columns on the left axis or along the top and order them however you want.

To use it, select columns representing both the headers and the data and call the Pivot Table - Selected Columns command. It will place all non-numerical data in the Left Headers list and all numerical data in the Data list. You're free to move them around as you wish by simply dragging and dropping. For instance, if there's a year column, it'll start in the Data list, but you'll probably want it to be under one of the Header lists.

You can select as many statistical measures as you like. The options are:

  • Sum
  • Mean
  • Median
  • Count
  • Variance
  • Standard Deviation

If you want to remove a column from the Left Headers, Top Headers, or Data lists, click on the X symbol to the right. If you want to temporarily hide it, uncheck the box to the left.


Buttons

If you want to pull more columns into the Pivot Table, select them on the original file and then click the Add Selected Columns button. If the underlying data is changed, you can update the tables with the Refresh Data button. To remove all columns, click the Clear Data button. If you want to transpose the table, the Swap Left and Top button will do that.

If you want to copy and entire table to the clipboard, including headers and data, right click and select the Copy Entire Table option. You can also copy individual values.