Community Pick: Many members of our community have endorsed this article.
Editor's Choice: This article has been selected by our editors as an exceptional contribution.

The Mystery of the Lost Leading Zeros

Kim WalkerSenior Developer
Published:
Updated:
What really happens to your leading zeros when you save your data in .csv format…

Almost everyone today uses Microsoft Excel to collect their contact information in columns and rows called spreadsheets. With this format, important columns can easily be highlighted and complicated calculations can be executed and updated in real time (see Figure 1). In this data-driven age, a fully-functional spreadsheet is not just convenient, it’s necessary.

Figure 1
So why is it, then, that when you go to send your beautifully stylized, precisely calculated spreadsheet to your data handler, they prefer to receive it in .csv format?

The .csv (or comma-separated-values) file format has seemingly been around forever. It is practically the most basic, human-readable format for raw data; second only to the fixed-length format. It is a simple collection of values – those values being the words or numbers that are displayed in each cell of your spreadsheet. Each row of your spread sheet is represented by a line in the .csv file and the values for each cell are separated by commas. There are no colors or type styles and no formulas to be calculated – just values. After all, the program used by the data handler doesn’t need to recalculate each formula when the spreadsheet has already done that for them. And the program doesn’t need different styles to recognize certain columns. It doesn’t prioritize columns; it handles each column of information with equal significance.

So you comply with your data handler by opening your spreadsheet in Excel and executing the “save as” function, selecting the “save as type” of CSV (Comma delimited) (*.csv).

Now you need to convince yourself that everything saved correctly so you double-click the file to see that all of your data is there. After all, you don’t want the data to go to the data handler wrong. What you find there may be ugly—the styles have all disappeared and the formulas have been replaced by their results—but it appears that the data is all there. (see Figure 2)

Figure 2
However, if you look closely enough, you’ll notice that all your New England zip codes and your eight-digit birthdates have lost their leading zeros. Or maybe you have numeric identification codes that originally began with zeros which now start with the first non-zero number instead.

But why are my leading zeros missing? After all, it’s just another Excel format. Right?

Wrong! You’ve been duped…

Actually, .csv files are simple ASCII text files and should be opened with a text editor application such as Notepad or TextEdit. But when you install Microsoft Excel, Excel claims the file format as its very own and changes the double-click behavior of your computer to open this type of file in Excel even though it is NOT a native Excel file format.

What really happens when you double-click a .csv file is that Excel imports it into an empty spreadsheet without displaying any messages or menus to indicate such. Because of the covert nature of the import, most users are misled into believing they are viewing the native file and incorrectly assume that the leading zeros are lost.

The fact is, the leading zeros are included in the .csv file but are eliminated when the .csv file is covertly imported into the empty Excel spreadsheet when you double-clicked the file. You can confirm this by opening the .csv file in a text editor such as Notepad for Windows or TextEdit for MacOS. (see Figure 3)

Figure 3
In Figure 3, the zip codes have been highlighted in yellow, the birth dates in green, to help you see that the leading zeros are actually there. This is what the data handler would see – without the highlighting, of course. The rows of values from the Excel file are on separate lines. The values from each cell are separated, or delimited, by commas.

You can be confident now that the data you’re sending to the data handler is complete, though not so visually appealing and organized as your Excel file.

Beware the middle man...or woman... ah...person…

It seems the person who generates the .csv file is rarely the person who delivers the file to the data handler. All too often, the .csv file is passed off to a purchasing agent or project manager who, wisely, decides to double-check the file to see that is has the correct information. They, too, will double-click the .csv file to peruse it’s contents. Because the data was actually imported, Excel considers the file to be a new file so it will ask if you want to save changes when the file is closed.

Never save changes when opening a .csv file by double-clicking!

Few people will question why they are being asked to save changes. Almost everyone will click the save button and then confirm that when the next screen pops up warning that the file “may contain features that are not compatible with CSV.” The leading zeros are now truly lost. The only way to recover them is to go back to the original Excel file and resave it as .csv.

But what if your data processor presents a .csv file to you for your own project? How do you get your comma-delimited data into Excel with leading zeros intact?

It is actually fairly easy to import your .csv file into Excel without losing your leading zeros. The only real problem with double-clicking the .csv file is that the import that takes place sets the column data format to “General” for all cells. In a cell with a “General” format, if it looks like a number, it’s treated like a number. In order to set the column to a different format before importing the data, you must perform a manual import.

The following instructions are for Excel 2007. Earlier versions of Excel would be very similar, though the menus may be in entirely different locations.

1.

If Excel is not already running, launch Excel from the Start menu.

2.

If you just launched Excel, it will be open with a blank workbook already created so you may skip to Step 3. If Excel was already running it may be necessary to create a blank workbook as follows: (See Figure 4)
Figure 4Select the File tab at the top of the window and select New from the left menu bar.
From the Available Templates window, choose Blank workbook and click the Create button.

3.

Click the Data tab at the top of the window and select From Text from the Get External Data drop-down or section of the top tool bar. (See Figure 5)
Figure 5

4.

When the Import Text File window appears, navigate to and select the .csv file you wish to import.

5.

When the Text Import Wizard window, Step 1 of 3, appears, be sure that Delimited is selected, then click the Next button. (See Figure 6-1)
Figure 6

6.

In the Text Import Wizard window, Step 2 of 3, choose Comma under the list of Delimiters, and choose the double quote (") as the Text qualifier, then click the Next button. (See Figure 6-2)

7.

In the Text Import Wizard window, Step 3 of 3, select the column(s) with the leading zeros and change the Column data format to Text, then click the Next button. (See Figure 6-3)

8.

In the Import Data window that follows, click the cell where you want your data import to begin or leave cell A1 selected and click the OK button. (See Figure 7)
Figure 7Your data appears in the worksheet, complete with leading zeros, and is ready for you to style as you see fit.
15
13,170 Views
Kim WalkerSenior Developer

Comments (11)

mkwestAnalyst

Commented:
Thank you for the great article, especially for the backstory along with the solution.  After all these years I didn't know how it happened, so I've just avoided Excel for my .csv files.  Now I know how to safely use excel for my .csv's without losing the leading zeros.  A very helpful article.  Thanks!
ValentinoVBI Consultant
CERTIFIED EXPERT
Most Valuable Expert 2011

Commented:
Nice article, though I prefer Notepad++ ;)
(clicked +1!)
Jeff DarlingDeveloper Analyst
CERTIFIED EXPERT

Commented:
We can thank Microsoft for trying to help out the casual user. Unfortunately, sometimes Microsoft doesn't have all the answers and guesses incorrectly at the data types for fields in csv files.  A similar kind of problem was happening with phone numbers being converted.  

Now, I just rename the .csv files to .txt then Excel brings up the text import wizard.

Or I use Microsoft Access instead.
Kim WalkerSenior Developer

Author

Commented:
@jeffld: After changing the extension of the the file, do you right-click and then "open with" Excel? Or do you go to Excel and do a File/Open?
Start Excel, and use File, Open, which automatically starts the Get External Data from Text File wizard. Open With treats it as text, and you must use the Text to Columns wizard to split it into columns.

View More

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.