Don’t mess around with varchar(1)

Maybe I’m showing how old I am, but I remember as a kid hearing a song on the radio that went like this:

You don’t tug on Superman’s cape
You don’t spit into the wind
You don’t pull the mask off the old Lone Ranger
And you don’t mess around with Jim

You’ve probably heard it at some point, although likely you won’t remember the whole thing. This is because it isn’t played much anymore due to the lack of demand for ’70s folk-rock stations.

Anyhow, I get most of that chorus, but I don’t know who Jim is. I knew a kid named Jim King in grade school, and he was a nice enough fellow. As his friends we all called him King James, not Jim. And then there’s Jim Halpert from “The Office”. If you mess with him he’ll put your stapler in gelatin. And then there’s Jim Beam, which is just awful stuff, so maybe that’s what the song is about.

Wait, don’t go yet. I do have a point, and it is this: Nevermind Jim, you don’t want to mess around with varchar(1).

(I know, it’s not a catchy lyric, but maybe a KillaDBA song could do something with it.)

If you’re using varchar(1) you’re likely screaming to the world “I don’t know the difference between char and varchar”. Pro Tip: don’t do that. You have a lot of data types in your tool box, so please choose wisely. And by choosing wisely I mean count the bytes.

If you create a column with char, you’re storing the bytes at how ever many bytes the char is. Char(1) is one byte. Always. Char(100) is 100 bytes. Also always. This is what is meant by fixed length. You’re smart, so you already know this.

But what you might not realize is that varchar is always two bytes plus the data length. Those two extra bytes are used to handle the actual length of the data, which of course can vary (hence the “var” in “varchar”). And those two bytes are always used for every varchar value.

So you can see where the math goes:

1. Char(1) is always 1 byte
2. Varchar(1) is always 2 bytes plus the length of the data
3. 1 byte is always less than 2 bytes

Don’t waste your bytes. There are starving tables in your database that will need those precious bytes someday.

Now, as a smart person you might find yourself considering the likelihood of blank data. Maybe you’re thinking blank data in varchar(1) will be 0 bytes whereas with char(1) it’s always 1 byte whether or not there is data. Again, do the math for the blank data.

1. Char(1) is always 1 byte
2. Varchar(1) is always 2 bytes plus the length of the data (0 bytes)
3. 1 byte is always less than 2 + 0 bytes

Two bytes may not seem like a lot of waste, but if you have a million records in your table that’s about 2 megabytes of wasted data space. And if you have 500 million records (and if you haven’t seen that many records in a table, you will soon enough) then you have wasted about 1 GB. And if you build an index on this column then you’ve wasted another 1 GB.

Feeling the pain yet?

If that doesn’t hurt enough, remember that’s data space wasted not just on pages of this table that you write. It’s also wasted for every time the database engine needs to read those pages into the buffer cache, where space is even more of a precious commodity.

I’m just trying to help you avoid unnecessary pain. Well, other than having that old song as an earworm for the next few hours.

Leave a Reply

Your email address will not be published.