Uncategorized

postgresql – Is this expected behavior for Python Datetime with timezone inserted into Postgres DB using sqlalchemy orm? What’s going on?


Inserting datetime with timezone object into postgres column without timezone appears to first convert to a local timestamp and then removes the offset. Eg for 2024-01-15 19:04:38.921936+00:00 I expected to see 2024-01-15 19:04:38.92193 in the database, but we get: 2024-01-15 14:04:38.921936. (I’m in UTC-05)

In updating a bunch of a code base to Python 3.12, datetime.datetime.utcnow() calls that return a naive datetime object with UTC time were replaced with timezone aware datetime objects in UTC timezone – as recommended in the Python documentation (utcnow is soon to be deprecated). Using sqlalchemy 2.0.25 in python 3.12 on windows with Postgres 14 – I noticed some behavior I didn’t expect, and I’m wondering if it’s expected or not, and either way if someone with a better understanding can help me understand what’s happening.

Assume that you have a postgres database and a table with two columns:

  • a col of type timestamp without timezone – which is the default type of column created by sqlalchemy with psycopg2 drivers for sqlalchemy column type Datetime
  • a col of type timestamp with timezone – which is the default type of column created by sqlalchemy with psycopg2 driver for sqlalchemy column type Datetime(timezone=True)

Now insert two rows using the ORM:

  • a python naive datetime.datetime object into both cols
  • a python datetime.datetime object with timezone (UTC) into both cols

The result for the first row, or the naive datetime object, is that both timestamps are the same. This is what I expect, and this also what happens using SQLite instead of postgres.

Behavior for the second row, where the UTC timezone datetime object is inserted, is different. The postgres timestamp with timezone column is mostly as expected, it is displayed in local time instead of UTC time, but it has the correct UTC offset associated with it so it’s referring to the same time and there’s no real problem. Eg instead of 2024-01-15 19:04:38.921936+00:00 it shows 2024-01-15 14:04:38.921936-05:00 in my case.

On the other hand the postgres timestamp without timezone column in this case contains a timestamp with no timezone as expected, but it has been converted to local time. So it’s as if the UTC timezone datetime object was first taken as the local time with timezone indicator (-XY:00), but then had the indicator striped to fit in the column (because it doesn’t allow timezones).

That second part is what I didn’t expect and actually caused some problems during the update. I expected it to take the timestamp, and just strip the timezone information. Eg if you print the datetime w/ timezone object, it prints relative to the timezone offset it was created with (eg for 2024-01-15 19:04:38.921936+00:00 I expected to see 2024-01-15 19:04:38.92193 in the database, but we get: 2024-01-15 14:04:38.921936).

This is different from what happens with SQLite and the same setup – but I realize that might be a difference in how SQLite’s timestamp columns work (I’m not familiar with that at the moment). At least the behavior in SQL is what I intuitively expected.

I set up a minimal test to show what I mean here: https://gist.github.com/heathhenley/35c89bbbc2fe6013e32e7e25ded5da0c
It might a little large to paste all in this question, but if that’s preferred no problem.

It assumes you have a local postgres running with a “test_db” database (could use docker to spin up quickly), and pip installed sqlalchemy and psycopg2.

Example output from the test:

Testing sqlite
ts_now_with_tz: 2024-01-15 19:11:06.711044+00:00
ts_now_niave: 2024-01-15 19:11:06.711044
 Insert ts utc with tz -->  TS no TZ: 2024-01-15 19:11:06.711044, TS with TZ: 2024-01-15 19:11:06.711044
 Insert ts utc naive -->  TS no TZ: 2024-01-15 19:11:06.711044, TS with TZ: 2024-01-15 19:11:06.711044

Testing postgres
ts_now_with_tz: 2024-01-15 19:11:06.851948+00:00
ts_now_niave: 2024-01-15 19:11:06.851948
 Insert ts utc with tz -->  TS no TZ: 2024-01-15 14:11:06.851948, TS with TZ: 2024-01-15 14:11:06.851948-05:00
 Insert ts utc naive -->  TS no TZ: 2024-01-15 19:11:06.851948, TS with TZ: 2024-01-15 19:11:06.851948-05:00

Please let me know if there is more information needed, would just like to understand it better if possible.

Basically all times we work with on the back end are always in UTC – so at the moment we’re using columns of type timestamp without timezone as they’re all UTC. Maybe in the future it’s better to using timestamp with timezone, but I’ll have to look into that.

Of course I can convert any datetime objects to UTC and strip timezone info to fix this problem. In that case the UTC timestamp ends up in the col – and this basically replicates the pre 3.12 version as all the datetimes were naive (but technically in UTC timezone). I would just like understand what’s going on better if possible.

Is this expected behavior and why?



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *