What's Wrong with Timestamp (Again)

Lately, some new interns sent a few merge requests to me, with tons of weird bugs. So here it comes.

TLDR

For Practice

  • In PostgreSQL, use timestamp with time zone

    • Example: ADD COLUMN created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
  • In Node.js

    • Use setUTC functions such as setUTCHours to deal with Date type data
    • It’s OK to use type Date directly in prepared statements
    • It’s OK to use Date.toUTCString() in SQLs for convenience
    • Set timezone manually in cron jobs

Time Zone and Timestamp

Unix Time

  • A number like 1619141513 is a standard representation of all time types, which is the exact number of seconds passed since 1970-01-01 00:00:00 GMT
  • It’s the one and only. No need to worry about a “Chinese Unix time” thing
  • Actually it’s also the lower level of how everyone stores time-based data (but with millisecond info)
  • Notice in JS or some other modern languages/databases, Date type contains millisecond info to suit more situations, so use Math.floor(Number(new Date()) / 1000) to get a standard Unix time

Timezone

  • Say +0800 at Asia/Shanghai or +0900 at Asia/Tokyo (US uses different timezones for different states so don’t add more work)
  • It’s actually the offset info based on GMT

ISO Standard

  • A bunch of formats to show time
  • A problem is that the offset is optional
  • With an offset, we can ensure the exact same time. But without it, of course we can’t be sure

And in Conclusion

  • We can easily notice that in command line mode, we can write SQLs directly using pure strings instead of actual types (of course we can specify one for each column). There are some converters to translate strings to timestamp type.
    • Say INSERT INTO "target_table" (created_at) VALUES ('2021-05-05T06:40:36.066Z')
  • For timestamp without time zone type, the converter will ignore any timezone info
    • Say INSERT INTO "target_table" (created_at) VALUES ('2021-05-05T06:40:36.066+0800') will insert a row with 2021-05-05T06:40:36.066 as created_at column, which is actually equal to 2021-05-05T14:40:36.066+0800
  • And what’s worse, if you set a default value like now() or current_timestamp, the column will receive a timestamp based on the server physical location, and cause differences between local development environment and production environment
  • So in any cases, we don’t use timestamp without time zone just for safety
  • And for the same reason, SQL string splicing such as query(`INSERT INTO "target_table" (created_at) VALUES (${new Date()})`) is also not a good idea, because you won’t know what it will be. By default, it converts to local date string without timezone info, and leads you to the same situation above
  • So in any cases, we use (new Date()).toUTCString() or even Math.floor(Number(new Date()) / 1000) to splice SQL sentences (and in the latter case, it can solve the problem caused by using a timestamp without time zone type column in a very limited way, since it won’t help the default value problem)
  • But in the first place, SQL splicing is a very wrong idea. In most cases we should stick to prepared statements. The database dependency in Node.js will convert a Date type parameter into a timestamp with correct time zone. That’s another subject to discuss though. For some really complex, over-optimized SQL sentences, it may not be reasonable to use ORM or pure prepared statements
  • I think we should end here, since it’s still fun and ready to become boring.