Inserting datetime with timezone object into postgres column without timezone appears to first convert to a local timestamp and then removes the offset. Eg for 2024-01-15 19:04:38.921936+00:00 I expected to see 2024-01-15 19:04:38.92193 in the database, but we get: 2024-01-15 14:04:38.921936. (I’m in UTC-05)
In updating a bunch of a code base to Python 3.12, datetime.datetime.utcnow()
calls that return a naive datetime object with UTC time were replaced with timezone aware datetime objects in UTC timezone – as recommended in the Python documentation (utcnow is soon to be deprecated). Using sqlalchemy
2.0.25 in python 3.12 on windows with Postgres 14 – I noticed some behavior I didn’t expect, and I’m wondering if it’s expected or not, and either way if someone with a better understanding can help me understand what’s happening.
Assume that you have a postgres database and a table with two columns:
- a col of type
timestamp without timezone
– which is the default type of column created bysqlalchemy
withpsycopg2
drivers for sqlalchemy column typeDatetime
- a col of type
timestamp with timezone
– which is the default type of column created bysqlalchemy
withpsycopg2
driver for sqlalchemy column typeDatetime(timezone=True)
Now insert two rows using the ORM:
- a python naive datetime.datetime object into both cols
- a python datetime.datetime object with timezone (UTC) into both cols
The result for the first row, or the naive datetime object, is that both timestamps are the same. This is what I expect, and this also what happens using SQLite instead of postgres.
Behavior for the second row, where the UTC timezone datetime object is inserted, is different. The postgres timestamp with timezone
column is mostly as expected, it is displayed in local time instead of UTC time, but it has the correct UTC offset associated with it so it’s referring to the same time and there’s no real problem. Eg instead of 2024-01-15 19:04:38.921936+00:00
it shows 2024-01-15 14:04:38.921936-05:00
in my case.
On the other hand the postgres timestamp without timezone
column in this case contains a timestamp with no timezone as expected, but it has been converted to local time. So it’s as if the UTC timezone datetime object was first taken as the local time with timezone indicator (-XY:00), but then had the indicator striped to fit in the column (because it doesn’t allow timezones).
That second part is what I didn’t expect and actually caused some problems during the update. I expected it to take the timestamp, and just strip the timezone information. Eg if you print the datetime w/ timezone object, it prints relative to the timezone offset it was created with (eg for 2024-01-15 19:04:38.921936+00:00
I expected to see 2024-01-15 19:04:38.92193
in the database, but we get: 2024-01-15 14:04:38.921936
).
This is different from what happens with SQLite and the same setup – but I realize that might be a difference in how SQLite’s timestamp columns work (I’m not familiar with that at the moment). At least the behavior in SQL is what I intuitively expected.
I set up a minimal test to show what I mean here: https://gist.github.com/heathhenley/35c89bbbc2fe6013e32e7e25ded5da0c
It might a little large to paste all in this question, but if that’s preferred no problem.
It assumes you have a local postgres running with a “test_db” database (could use docker to spin up quickly), and pip installed sqlalchemy and psycopg2.
Example output from the test:
Testing sqlite
ts_now_with_tz: 2024-01-15 19:11:06.711044+00:00
ts_now_niave: 2024-01-15 19:11:06.711044
Insert ts utc with tz --> TS no TZ: 2024-01-15 19:11:06.711044, TS with TZ: 2024-01-15 19:11:06.711044
Insert ts utc naive --> TS no TZ: 2024-01-15 19:11:06.711044, TS with TZ: 2024-01-15 19:11:06.711044
Testing postgres
ts_now_with_tz: 2024-01-15 19:11:06.851948+00:00
ts_now_niave: 2024-01-15 19:11:06.851948
Insert ts utc with tz --> TS no TZ: 2024-01-15 14:11:06.851948, TS with TZ: 2024-01-15 14:11:06.851948-05:00
Insert ts utc naive --> TS no TZ: 2024-01-15 19:11:06.851948, TS with TZ: 2024-01-15 19:11:06.851948-05:00
Please let me know if there is more information needed, would just like to understand it better if possible.
Basically all times we work with on the back end are always in UTC – so at the moment we’re using columns of type timestamp without timezone
as they’re all UTC. Maybe in the future it’s better to using timestamp with timezone, but I’ll have to look into that.
Of course I can convert any datetime objects to UTC and strip timezone info to fix this problem. In that case the UTC timestamp ends up in the col – and this basically replicates the pre 3.12 version as all the datetimes were naive (but technically in UTC timezone). I would just like understand what’s going on better if possible.
Is this expected behavior and why?