Things are not always what they seem in Excel!
Introduction to Automatic Data Type Conversion
The underlying cause of many of the Excel support calls that we received is down to a failure to recognise that data is not always what it seems to be!
As a result some of the issues caused are:
- VLookup returns #NA or even worse the wrong result.
- PivotTables return wrong results.
- Grouping data by dates in a PivotTable is not available.
- The Data > Filter > Drop Arrows list dates incorrectly.
- Credit card numbers are not stored completely.
- Number formatting doesn’t seem to have any effect.
- The list goes on…
Excel has many situations in which it will implicitly convert from one data type to another or data may not be stored as you are expecting. Excel does not always bring this to your attention.
Data Type Conversion
Unlike with databases and programming, Excel users are not obliged to define what type of data they are entering, pasting or importing into their workbooks. On the whole Excel makes a pretty good job of guessing. Understanding what assumptions Excel makes with input data and how we can quickly detect the unexpected is essential if we are to avoid downstream errors and problems.
At a very basic level everything you enter into Excel will be interpreted as either text or a numeric.
Numeric: As far as Excel is concerned if input data looks numeric it will treat it as such and unless you have already formatted the cell it will try to apply the nearest numeric format to match what you have entered. This could include currency symbols, percent symbols, date/time formatting,thousand separators and decimal places.
Entering a Numeric as a Code: If you require that your numeric data is actually to be stored as text then you need to preformat the cell as text. (Format cell > Numeric Tab > Category = Text or use a label prefix which is an apostrophe (?) as the first character.)
You would wish to do this for the following reasons:
- The code has leading zeroes that you wish to retain.
- The code is more than 15 digits long.
If you have a precise number of digits in your code then you could store it as a value and use a custom numeric format such as 00000 to force 5 digits.
Dates – The Big Issue: Anything resembling a date may be converted to a date serial number and formatted with the nearest date / time format according to your Windows regional & language settings. You need to take control of dates and whilst you can enter them in a variety of styles according to your regional & language settings it is best to preformat the cells to display the date/time as you require. Once converted and stored as a date serial number (Which is a count of the number of days since 1 Jan 1900. e.g. 22 Sep 2017 is 43,000) then you are able to change the format as required and confidently proceed to use dates for filtering, sorting, PivotTables, grouping, charting and calculating.
Seeing what is really going on behind the scenes to uncover automatic data type conversion.
So Excel’s intelligence often masks just what really ended up in the cell. If only we could remove all the fancy formatting, reveal the formulas and widen the columns we could solve many of these issues. A most useful “switch” to do just that is ‘Show Formulas’ (keyboard shortcut: Ctrl + ` (That’s Grave and is to be found at the top left of a UK keyboard. Sometimes called a backquote or a backtick).In the ribbon > Formulas Tab > Formula Auditing Group > Show Formulas. This is a toggle. Once you have finish your diagnostics be sure to return to a normal view with the same shortcut.
Other ‘Gotchas’ on related to Automatic Data Type Conversion are:
- Percentages are stored as a fraction of unity (1) so 50% is actually stored as 0.5.
- Excel only handles a numeric to a precision of 15 digits.
- Dates when accidentally formatted as a numeric will be around 35,000 to 44,000 which always look like an erroneous monetary value has slipped in from somewhere. E.g. 16/03/2016 becomes £42,445.00
- For VLookup to work, the Lookup_value must be the same data type as the data in the first column of the Table_array. Always use the fourth argument (Range_lookup) with a value of FALSE or 0 to ensure that #N/A is returned if you are looking for a precise match.
- If dragging a field in a PivotTable to the values area results in “Count of XXX” the data contains some non-numeric values or blanks. The result will still not include the non-numerics in the data even though you might force it to “Sum of XXX”.
Observe the natural alignment- Unless horizontal alignment has been asserted:
- Values / Dates / Times / Percentages align to the right.
- Text aligns to the left.
Automatic Data Type Conversion Related Links
16 March 2016