Year 2020 starts on December 30, 2019. It's true... I was not mistaken. So if your team has to deal with software where dates and timestamps are relevant, you might want to read this. We have been through this problem before and it was not easy to find out what was going on.
At the end of 2015, we were alerted to the importance of formatting dates, and the misuse of "YYYY"
in some languages.
TL;DR:
At the beginning of 2017 the "BUG" attacked again, but in a more discreet way; so discreet that it took me some days to notice what had happened. More than whether we should use "YYYY"
or "yyyy"
the lesson to be retained is that as engineers we have to try to get to know the APIs, libraries, frameworks and such that we use. Know and test their operations, and be aware that even this may sometimes not be enough. And at the end of the day, always use "yyyy"
.
The story I will tell you happened in the backoffice (BO) of a resource management system. Fortunately, it wasn't a critical error, but it could have been.
Chronicles of a "YYYY" in 2017
Once upon a time there was a user who ended the usage of a resource manually, for reasons beyond our control. The procedure in these cases is for the manager of the system to register the occurrence in the BO, to ensure that the timestamp is registered. However, for some reason, the BO was not allowing the manager to do so. Instead the system presented him with the message: "Enter a valid date".
The only restriction that exists in the code is that the timestamp for the end of activity must be later than the one for the beginning, and that it also should not exceed the current time. The user had started at 2017-01-01 22:07, and the manager was trying to close the timestamp at 2017-01-01 23:45. Since we were already on the 2nd of January, there was no doubt that the date entered was valid, and within the conditions.
We started by looking at the code, and we confirmed that the conditions were correct. We confirmed that there was no apparent reason for this error, and we felt we were facing a byzantine fault. As per usual, the team was under some pressure and we even briefly considered force-closing the session, so that the user would not be inconvenienced, while we tried to correct the problem. It would not be the first time.
This is a recurring situation with which you should also be familiar: the external pressure we are subject to, which sometimes causes errors to go unnoticed until they regrettably cause critical failures.
We debugged in production (PROD) and realized that the end date entered by the manager was being converted to 2016-01-01 23:45, which is earlier than the start. And why? Precisely because we had put "YYYY"
in the code instead of "yyyy"
For those who do not know, in a java.text.SimpleDateFormat (Java 7 and later), 'Y'
and not 'y'
represents what is called Week Year
. That is, the year to which the week belongs, and not the calendar year. As per the Week Year
"convention", January 1, 2017 still belongs to the last week of 2016, and so in PROD the format()
returns 2016 in the year field. On the other hand, with "yyyy"
it would return 2017.
import java.text.SimpleDateFormat;
import java.util.Date;
import org.joda.time.DateTime;

SimpleDateFormat sdfYYYY = new SimpleDateFormat("YYYY-MM-dd"); // 'Y' = week year
SimpleDateFormat sdfyyyy = new SimpleDateFormat("yyyy-MM-dd"); // 'y' = calendar year
Date inputDate = new DateTime(2017, 1, 1, 23, 45).toDate(); // mock of the date entered in the form
System.out.println(sdfYYYY.format(inputDate)); // printed 2016-01-01 in PROD
System.out.println(sdfyyyy.format(inputDate)); // prints 2017-01-01
But the story does not end here.
Do you think I would waste my time on this novella if it were just to remind you of the differences between a 'Y'
and a 'y'
?
The problem is that "it worked on my machine". On all the developers' machines that did this test using "YYYY"
it printed 2017. That is, no one could replicate the bug. Or in other words, nobody could get the 'Y'
to function as documented.
We started by solving the obvious and replaced the 'Y'
's with 'y'
's, and the system started to function as intended in all environments.
// some extra sarcasm {
Should we be concerned that 'Y'
does not work locally? Would it be a problem? It could even be a desirable feature: we would never have to worry about choosing between lower and upper case!
}
We tried replicating the bug on different machines, and we were able to do it on the Pre-Production machine (PRE). This clearly increased the pressure for us to try to figure out what was going on. PRE is a local machine. The chances of this behavior being due to the misalignment of the stars with Amazon's data-center had diminished.
As the engineers that we are, we could not let such inconsistencies go unexplained. After all, we try to understand the reason for these problems, and we enjoy this search. We know that machines are always right, and we are aware that if we do not understand their motives well, sooner or later, we will be haunted by our lack of knowledge.
Why then did a 'Y'
behave on our machines as if it were a 'y'
? In the java.text.SimpleDateFormat
documentation we read:
If week year
'Y'
is specified and the calendar doesn't support any week years, the calendar year ('y'
) is used instead. The support of week years can be tested with a call to getCalendar()
.isWeekDateSupported()
The isWeekDateSupported()
from java.util.Calendar
returns false
because the Calendar
is an abstract class that does not implement the Week Date
concept, which is the basis of Week Year
. However in the GregorianCalendar
- which is the default implementation on this side of the globe - it returns true
and the Week
Date
concept is in fact implemented. We have confirmed that in every environment isWeekDateSupported()
returns true
. It would therefore be expected that "YYYY"
would give us 2016 in all environments... but no: on our machines, it continued to yield 2017. This was not where the problem came from.
(Note that, for example, if we ever stop by Thailand and/or our machines are set with their Locale there, our application and probably everything that belongs to package java.*
, will eventually be using a BuddhistCalendar
by default.)
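The check described above can be sketched as follows. This is only an illustration using the JDK; the class name WeekDateSupportCheck and the helper supportsWeekDates are hypothetical and not part of the project. The th-TH locale is included just to let you see which Calendar class the JDK picks there.

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

final class WeekDateSupportCheck {

    // Returns whether the Calendar backing a SimpleDateFormat created for the
    // given locale implements week dates. When it does not, 'Y' falls back to 'y'.
    static boolean supportsWeekDates(Locale locale) {
        SimpleDateFormat format = new SimpleDateFormat("YYYY-MM-dd", locale);
        return format.getCalendar().isWeekDateSupported();
    }

    public static void main(String[] args) {
        Locale[] locales = {
            Locale.getDefault(Locale.Category.FORMAT),
            Locale.forLanguageTag("th-TH")
        };
        for (Locale locale : locales) {
            SimpleDateFormat format = new SimpleDateFormat("YYYY-MM-dd", locale);
            System.out.println(locale
                    + " -> " + format.getCalendar().getClass().getSimpleName()
                    + ", week dates supported: " + supportsWeekDates(locale));
        }
    }
}
```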
However in this project we do not use java.util.Calendar
. We use org.joda.time.DateTime
from Joda, which even has a method that gives us the Week Year
:
public int getWeekyear()
Get the weekyear field value.
The weekyear is the year that matches with the weekOfWeekyear field. In the standard ISO8601 week algorithm, the first week of the year is that in which at least 4 days are in the year. As a result of this definition, day 1 of the first week may be in the previous year. The weekyear allows you to query the effective year for that day.
This method confirmed to us that we were not crazy. In all environments (new DateTime(2017, 1, 1, 23, 45)).getWeekyear()
returned 2016. We also learned that it was ISO-8601
that set the standard for the Week Year
representation, at least in Joda
DateTime
.
We realized that if it was not a matter of the constitution of the date itself, it would have to be something to do with formatters. Out of curiosity and to test our sanity, we tried to do the same transformation, but this time using org.joda.time.format.DateTimeFormatter
with "YYYY"
. We ended up getting 2017 in all environments, including PROD and PRE. But isn't "YYYY"
Week Year
?
When we stopped practicing divination and went to read the org.joda.time.format.DateTimeFormat
documentation, we realized that the DateTimeFormatter
was working correctly, and we also began to realize that, in the end, conventions aren't an exact science: in Joda's pattern table, 'x' is the weekyear, 'y' is the year, and 'Y' is the year of era.
Yes, as you can see, the 'Y'
in Joda
does not mean Week Year
... Well. So the problem was only with java.text.SimpleDateFormat
. At least, the 'y'
continues to mean the same for everyone, and the rule "always use 'y'
" continues to be valid.
We then entered that phase which we all love: the one where we no longer know where to start. We forced timezones, among other things that I'm too embarrassed to share. We even considered the chance of being the lucky ones with a bug derived from this year's Leap Second.
Our last hope was that the developers had left a bug somewhere in that version of Java, and that it had already been fixed in the versions on our machines. Even if there were no record in any bug tracker, never lose hope! We then installed on one of our machines the same Java build as there was in PRE and PROD and... nothing. 'YYYY'
still did not give 2016 locally.
We then did what we should have done a long time ago. We went to explore the SimpleDateFormat
code:
public SimpleDateFormat(String pattern)
{
    this(pattern, Locale.getDefault(Locale.Category.FORMAT));
}

public SimpleDateFormat(String pattern, Locale locale)
{
    if (pattern == null || locale == null) {
        throw new NullPointerException();
    }
    initializeCalendar(locale);
    this.pattern = pattern;
    this.formatData = DateFormatSymbols.getInstanceRef(locale);
    this.locale = locale;
    initialize(locale);
}

private void initializeCalendar(Locale loc)
{
    if (calendar == null) {
        assert loc != null;
        // The format object must be constructed using the symbols for this zone.
        // However, the calendar should use the current default TimeZone.
        // If this is not contained in the locale zone strings, then the zone
        // will be formatted using generic GMT+/-H:MM nomenclature.
        calendar = Calendar.getInstance(TimeZone.getDefault(), loc);
    }
}
If up until now we thought that a SimpleDateFormat
used the locale only to translate symbols (e.g. Dec vs Dez, Mon vs Seg), then after all this we realized that it uses its own java.util.Calendar
when formatting dates, and that the Locale
is passed on to that calendar.
The PROD and PRE machines were in en_GB
, and the development machines where we had been testing locally were in en_US
. Et voilà!
(At this point it started to make sense that in the SimpleDateFormat documentation there would be references to the Calendar, anyway...moving on!)
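To reproduce the difference on a single machine, the locale can be passed explicitly instead of relying on the default. The following is a minimal sketch using only the JDK; the class name WeekYearByLocale and its helper are made up for illustration, and Locale.UK and Locale.US stand in for the en_GB and en_US machines.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

final class WeekYearByLocale {

    // Formats 2017-01-01 23:45 (in the default time zone) with the given
    // pattern, using a SimpleDateFormat bound to the given locale.
    static String formatNewYearsDay2017(String pattern, Locale locale) {
        Date date = new GregorianCalendar(2017, Calendar.JANUARY, 1, 23, 45).getTime();
        return new SimpleDateFormat(pattern, locale).format(date);
    }

    public static void main(String[] args) {
        // Week year: the result depends on the locale's week rules.
        System.out.println(formatNewYearsDay2017("YYYY-MM-dd", Locale.UK)); // 2016-01-01
        System.out.println(formatNewYearsDay2017("YYYY-MM-dd", Locale.US)); // 2017-01-01
        // Calendar year: the same in both locales.
        System.out.println(formatNewYearsDay2017("yyyy-MM-dd", Locale.UK)); // 2017-01-01
        System.out.println(formatNewYearsDay2017("yyyy-MM-dd", Locale.US)); // 2017-01-01
    }
}
```

Passing the Locale explicitly also makes a unit test independent of the machine it runs on, which is what was missing when nobody could replicate the bug locally.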
But wait: "So, if there's a standard, such as ISO-8601, didn't the United States adopt it?" Yes and no! Or somewhere in between, whichever you prefer.
To start with, it was clear to us that the java.util.Calendar
is not ISO-8601
compliant, with regard to a Week Date
: Java Calendar WEEK_OF_YEAR not ISO-8601 compliant?
On the other hand, if you remember, when we previously tried to extract the Week Year
from a DateTime
, as it was org.joda.time.DateTime
we always got 2016. And it was what we expected, that is, Joda
respects the standard:
The standard considers the first day of the week as Monday. However, this is not true in the conventions of some countries such as the United States, which adopted the ISO-8601
for the representation of dates but diverges on the concept of Week. For instance, in the United States the first day of the week is Sunday. It is precisely this that, for good or bad, java.util.Calendar
, more specifically the GregorianCalendar
class, reflects.
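The two week rules that a Calendar takes from its Locale can be inspected directly. This is a small sketch using only the JDK; the class name WeekRulesByLocale and its helpers are hypothetical, chosen for illustration.

```java
import java.util.Calendar;
import java.util.Locale;

final class WeekRulesByLocale {

    // First day of the week for the locale, as a Calendar constant
    // (Calendar.SUNDAY = 1, Calendar.MONDAY = 2, ...).
    static int firstDayOfWeek(Locale locale) {
        return Calendar.getInstance(locale).getFirstDayOfWeek();
    }

    // How many days of the new year the first week must contain.
    static int minimalDaysInFirstWeek(Locale locale) {
        return Calendar.getInstance(locale).getMinimalDaysInFirstWeek();
    }

    // Week year of January 1, 2017 under the locale's week rules.
    static int weekYearOfNewYearsDay2017(Locale locale) {
        Calendar cal = Calendar.getInstance(locale);
        cal.clear(); // resets the date/time fields, keeps the week rules
        cal.set(2017, Calendar.JANUARY, 1);
        return cal.getWeekYear();
    }

    public static void main(String[] args) {
        for (Locale locale : new Locale[] {Locale.UK, Locale.US}) {
            System.out.println(locale
                    + ": firstDayOfWeek=" + firstDayOfWeek(locale)
                    + ", minimalDaysInFirstWeek=" + minimalDaysInFirstWeek(locale)
                    + ", weekYear(2017-01-01)=" + weekYearOfNewYearsDay2017(locale));
        }
    }
}
```

With Monday as first day and four minimal days (the ISO-8601 rule, which en_GB follows), Sunday January 1 falls in the last week of 2016. With Sunday as first day and one minimal day (en_US), the same Sunday opens week 1 of 2017.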
So if you ever for some reason or chance really, really want to use the concept of Week Date
and Week Year
(with "YYYY"
or in some other way), make sure your code does what you intend.
And if, in this sense, for some reason or chance, you really have to use a java.text.SimpleDateFormat
and need to follow ISO-8601
standards, the best way to do it will be to calculate it manually. It is easy, as suggested by the "ISO week date" Wikipedia page, using the rule that the first week is always the one containing the first Thursday of the year, as described in the standard.
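One possible way to do that manual calculation is sketched below, under the assumption that the Thursday rule is applied as "the ISO week year of a date is the calendar year of the Thursday in the same Monday-to-Sunday week". The class name IsoWeekYear and its method are invented for this example.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

final class IsoWeekYear {

    // ISO-8601 week year of a calendar date (month is 1-based): the year of
    // the Thursday that falls in the same Monday-to-Sunday week.
    static int of(int year, int month, int dayOfMonth) {
        Calendar cal = new GregorianCalendar(year, month - 1, dayOfMonth);
        int dayOfWeek = cal.get(Calendar.DAY_OF_WEEK);    // SUNDAY=1 ... SATURDAY=7
        int isoDayOfWeek = (dayOfWeek + 5) % 7 + 1;       // Monday=1 ... Sunday=7
        cal.add(Calendar.DAY_OF_MONTH, 4 - isoDayOfWeek); // move to that week's Thursday
        return cal.get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        System.out.println(of(2017, 1, 1));   // 2016 (Sunday, week of Thursday 2016-12-29)
        System.out.println(of(2017, 1, 2));   // 2017 (Monday, week of Thursday 2017-01-05)
        System.out.println(of(2019, 12, 30)); // 2020 (Monday, week of Thursday 2020-01-02)
    }
}
```

The result does not depend on the machine's Locale, because DAY_OF_WEEK and YEAR are not affected by the calendar's week rules.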
Finally, to help the "party", in Java 8
, we also have available the java.time.format.DateTimeFormatter
class. Despite this new java.time.*
package having the same Joda
conventions, in this DateTimeFormatter
the 'Y'
continues to mean Week Year
, unlike Joda
's DateTimeFormatter
.
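For comparison, here is a short java.time sketch (Java 8 or later). The class name JavaTimeWeekYear is hypothetical. It shows that 'Y' in java.time.format.DateTimeFormatter also follows the formatter's Locale, and that IsoFields.WEEK_BASED_YEAR gives the ISO-8601 week year regardless of Locale.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.IsoFields;
import java.util.Locale;

final class JavaTimeWeekYear {

    // Formats a date with the given pattern and an explicit locale.
    static String format(LocalDate date, String pattern, Locale locale) {
        return DateTimeFormatter.ofPattern(pattern, locale).format(date);
    }

    // ISO-8601 week-based year, independent of any locale.
    static int isoWeekYear(LocalDate date) {
        return date.get(IsoFields.WEEK_BASED_YEAR);
    }

    public static void main(String[] args) {
        LocalDate newYearsDay = LocalDate.of(2017, 1, 1);
        System.out.println(format(newYearsDay, "YYYY-MM-dd", Locale.UK)); // 2016-01-01
        System.out.println(format(newYearsDay, "YYYY-MM-dd", Locale.US)); // 2017-01-01
        System.out.println(format(newYearsDay, "yyyy-MM-dd", Locale.UK)); // 2017-01-01
        System.out.println(isoWeekYear(newYearsDay));                     // 2016
        System.out.println(isoWeekYear(LocalDate.of(2019, 12, 30)));      // 2020
    }
}
```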
So, it never hurts to repeat: always use 'y'
... until the day when it no longer does what you want.
And on that day read the documentation of whatever you are using :)