The Epoch Time bug is coming – so let’s finally learn some lessons from the Millennium Bug

In 1999, we left it to the last minute to fix the Millennium Bug but got away with it. As IT Manager Michael Dear explains, its 2038 successor, the “Epoch Time bug”, is ticking, and this time we may not be so lucky.

25 years ago, a large number of technology experts were working very hard to make sure nothing happened. The issue was variously called the Millennium Bug, Y2K bug or Year 2000 bug, and boiled down to this: in the first days of computing, when little memory was available due to its expense, computer designers and programmers used shortcuts to save memory. One of those shortcuts was to use only two digits for the year, so 2024 would be stored as 24.

No-one thought much about the turn of the century in the 1970s or 1980s, but if you are using two-digit dates, then at the turn of the century the year becomes 00: a lower number than 99. What does a computer do if it has to order these years? Would 00 become 1900 rather than 2000? No-one knew for sure.
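To make the ordering problem concrete, here is a minimal sketch in Python. The list of years and both helper functions are my own illustrations rather than code from any real system, and the pivot value of 70 is an arbitrary assumption.

```python
# Two-digit years as an old system might have stored them (illustrative data).
two_digit_years = [97, 98, 99, 0]

# Sorting puts the year 2000 (stored as 00) before 1997.
print(sorted(two_digit_years))  # [0, 97, 98, 99]


def expand_naive(yy):
    """Assume every two-digit year belongs to the 1900s."""
    return 1900 + yy


def expand_windowed(yy, pivot=70):
    """Hypothetical 'windowing' fix: values below the pivot are read as 20xx."""
    return 2000 + yy if yy < pivot else 1900 + yy


print(expand_naive(0))      # 1900
print(expand_windowed(0))   # 2000
print(expand_windowed(99))  # 1999
```

Windowing of this kind only postpones the problem, because the pivot year eventually arrives too.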

We can’t blame those 1970s and 1980s programmers. At that point, the turn of the century was ages away – if the thought even passed through their heads, they would have dismissed it, safe in the knowledge that there was no way anyone would still be running the software they were working on by then. Right?

The trouble is, people don’t like to throw away what has been invented. In the example of code, you’d be throwing away what’s known to work. So code is added to rather than being replaced until, for whatever reason, its time has come.

If you know where to look, you can see tiny actions that occurred even before I was born still affecting the decisions that we’re having to make today. And that includes Epoch Time: a problem we need to fix sooner rather than later.

Rewind to 1999

Back to the Millennium Bug.

So, as the 1990s drew to an end, a lot of consultants made a lot of money advising companies on what to do. Often, this involved pulling programmers out of retirement, who themselves charged a lot of money, to go through code and make sure that nothing horrendous happened at the stroke of midnight on 1 January 2000.

There were armies of IT people watching for issues when the clock rolled over, with teams based in the West talking to those in the Far East, whose clocks changed first, to make sure that nothing untoward would happen in their own time zones a number of hours later.

Amazingly, the event passed without any electronic plague of locusts, we moved on and <irony> we all learnt our lesson </irony>.

Phone numbers game

I recently had a ticket appear; the ticket stated that none of the phones are letting a staff member dial internationally. This must be fixed immediately! I decided to rock up to the staff member’s desk and see what the issue was. 

It turns out they were trying to dial France. For the sake of example, I’ll say the number was written down as +33 1234567890. They ignored the plus, seeing no + on their handset, and when they dialled 331234567890 they got a wrong number tone. So, they reasoned, IT have blocked international calls.

You will have guessed by now that this was a younger member of our team, who had never heard of the international access code that you must add before the start of a number so that you can then dial the country code.
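For anyone who wants the rule spelled out, here is a small Python sketch that swaps the plus sign for an access code. The function name is made up for illustration, and the default code of "00" is an assumption: it is common in Europe, while North America uses "011".

```python
def to_dial_string(number, access_code="00"):
    """Turn a '+'-prefixed number into digits a desk phone can dial.

    access_code is an assumption: "00" is common in Europe, while
    North America uses "011". Check what your own phone system expects.
    """
    # Keep only the digits, dropping spaces and the leading plus.
    digits = "".join(ch for ch in number if ch.isdigit())
    if number.strip().startswith("+"):
        return access_code + digits
    return digits


print(to_dial_string("+33 1234567890"))  # 00331234567890
```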

Sadly, my much younger staff member didn’t know all this and so I had to explain the above to them. Of course, now we all have mobiles, the number is normally entered once and then forgotten. Gone are address books with people’s names and numbers next to them.

If you grew up in a certain era, all this is common knowledge. But for a modern generation, the decisions made decades ago continue to have an impact. And, for reasons I will come to, bear in mind that in 2038 these people will be in charge.

IP addresses

One reason I give the phone number example is that it will help to explain the problems we now have with the internet address system.

The original system worked on an address that is four blocks of numbers, each from zero to 255. You may have seen this occasionally when talking to your home internet router: something like 192.168.0.1 will appear. This is an IP address.

There is a second important number that looks like the IP address but is called the subnet mask; 255.255.255.0 is a good example. Together, these allow the network and address of a computer on that network to be found so that traffic can be delivered to it, much like an area code and international code allow one phone to call any other in the world.
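Here is a rough sketch of how the address and the mask work together, using Python's standard ipaddress module. The address and mask are simply the two examples from the text, not a recommendation for any real network.

```python
import ipaddress

# Address and mask taken from the examples in the text.
interface = ipaddress.ip_interface("192.168.0.1/255.255.255.0")

print(interface.network)                # 192.168.0.0/24
print(interface.ip)                     # 192.168.0.1
print(interface.network.num_addresses)  # 256

# Doing the same thing by hand: AND the address with the mask
# to find which network the computer belongs to.
address = int(ipaddress.IPv4Address("192.168.0.1"))
mask = int(ipaddress.IPv4Address("255.255.255.0"))
print(ipaddress.IPv4Address(address & mask))  # 192.168.0.0
```

The mask plays the part of the area code: it says which part of the number identifies the network and which part identifies the individual device.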

Except, once more, there is a problem. The IP address is a 32-bit number, which allows for 4,294,967,296 addresses. In reality, not that many are usable: some were reserved for other uses, so we can deduct about 290 million addresses from that total, giving us a round 4 billion.

At the time, this felt like a huge number. Many orders of magnitude more than enough for all the computers to connect to the internet. But now it isn’t just computers that connect to the internet: it’s phones, smoke alarms, washing machines… you probably own 20+ such devices yourself.

This is why we are moving away from the old system, IPv4, to a new version called IPv6. It should give us 3.4 x 10^38 addresses, and some of your devices are almost certainly using this system without you knowing. Even so, I’m confident that IPv4 will still be around when I retire. And almost certainly causing problems, too.
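The two totals come straight from the length of the address. A quick check in Python, assuming nothing more than the 32-bit and 128-bit address lengths of IPv4 and IPv6:

```python
ipv4_total = 2 ** 32    # IPv4 addresses are 32 bits long
ipv6_total = 2 ** 128   # IPv6 addresses are 128 bits long

print(f"{ipv4_total:,}")    # 4,294,967,296
print(f"{ipv6_total:.1e}")  # 3.4e+38
```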

The Epoch Time bug is coming

I started this with a tale about the year 2000, so I will finish with another time: Epoch Time. And in particular, the Millennium Bug’s natural successor, the Epoch Time bug.

You see, in Unix systems the date is stored as the number of seconds since 1 January 1970. This is called Epoch Time. It’s stored in a 32-bit signed variable, and in case your eyes have glazed over let me explain.
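If you would like to see Epoch Time in action, this short Python sketch converts a count of seconds into a date. The helper name from_epoch is my own, and the second value is just a convenient round number.

```python
from datetime import datetime, timezone


def from_epoch(seconds):
    """Convert seconds since 1 January 1970 (UTC) into a date and time."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(from_epoch(0))              # 1970-01-01 00:00:00+00:00
print(from_epoch(1_000_000_000))  # 2001-09-09 01:46:40+00:00
```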

Signed means that the number can be negative. As such, the first “bit” (a one or a zero) states if the number is positive or negative. That leaves 31 bits for the time, and when it fills up at 03:14:07 UTC on 19 January 2038 (imagine 31 repeats of the digit 1), the only logical thing for those Unix systems to do is tick around. The sign bit flips, and the number, and so the date, turns negative: not zero, but the largest negative value the variable can hold.

Who knows what happens to a system that suddenly thinks it is more than two billion seconds before 1 January 1970, which puts it in December 1901?
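Here is a sketch of what a 32-bit signed counter does one second after it fills up. It uses Python's ctypes module to imitate the 32-bit variable, so it only simulates the wrap-around; real systems may fail in other ways, and the helper as_date is purely illustrative.

```python
import ctypes
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_INT32 = 2 ** 31 - 1  # 2,147,483,647: 31 binary ones


def as_date(seconds):
    """Add a (possibly negative) number of seconds to 1 January 1970."""
    return EPOCH + timedelta(seconds=seconds)


# The last moment a 32-bit signed counter can represent.
print(as_date(MAX_INT32))  # 2038-01-19 03:14:07+00:00

# One second later the value no longer fits, so it wraps around.
wrapped = ctypes.c_int32(MAX_INT32 + 1).value
print(wrapped)             # -2147483648
print(as_date(wrapped))    # 1901-12-13 20:45:52+00:00
```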

This matters, incidentally, because much of the internet’s fundamental infrastructure runs on Unix. In the same way that a CrowdStrike update error caused millions of Windows PCs around the world to crash, the Epoch Time bug could cause similar problems for Unix servers.

There is a solution in place. Recent Linux kernels store the time in a 64-bit variable, even on 32-bit hardware. My question is this: will systems running older kernels still be in use when 2038 arrives?
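To get a feel for how much headroom the extra bits buy, here is a back-of-the-envelope calculation in Python. The year length of 365.25 days is an approximation, so treat the results as rough figures.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # approximate length of a year

max_32 = 2 ** 31 - 1  # largest value of a 32-bit signed counter
max_64 = 2 ** 63 - 1  # largest value of a 64-bit signed counter

print(round(max_32 / SECONDS_PER_YEAR))    # 68
print(f"{max_64 / SECONDS_PER_YEAR:.2e}")  # 2.92e+11
```

In other words, roughly 68 years after 1970 for the old counter, against something like 292 billion years for the new one.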

I don’t know, but I suggest there will be consultants and people presently employed coming out of retirement to nurse systems through the rollover date. At least this time we won’t all be nursing a colossal hangover from partying like it’s 1999.

Michael Dear

Michael has worked for more than 20 years running IT departments, mainly for small to medium insurance firms. His primary interest is focused on security and compliance.