vendredi 8 avril 2022

Database design: coherence issue when managing timestamped data

I have a SQL database with two tables.

A sessions table containing:

  • session id: the primary key
  • login timestamp: the unix timestamp of the start of the session
  • logout timestamp: the unix timestamp of the end of the session
  • user id of the logged user

An events table containing:

  • event id: the primary key
  • event timestamp: the unix timestamp of the event
  • action id: the type of event occurred

An event may occurs during a user's session, or it may occur while no one is logged in. In my scenario only one user can be logged in at a time (think about a kiosk-like machine; sure, I am omitting project details to better clarify the essence of the problem).

Sample data:

events table: | event_id | event_timestamp | action_id | | - | - | - | | 1 | 1649400005 | 421 | | 2 | 1649400010 | 112 | | 3 | 1649402000 | 331 | | 4 | 1649402010 | 112 | | 5 | 1649405005 | 421 |

sessions table: | session_id | login_timestamp | logout_timestamp | user_id | | - | - | - | - | | 1 | 1649400000 | 1649400020 | 11 | | 2 | 1649405000 | 1649405030 | 22 |

What I need:

I need to get the list of events occurred during each user session like so:

user_id session_id event_id event_timestamp action_id
11 1 1 1649400005 421
11 1 2 1649400010 112
22 4 5 1649405000 421

What I've tried:

I've tried to add a nullable FK column session_id on the events table to rely on FK capability of the database to retrieve the data in an optimized way. In this case, if an event occurs without a user being logged in the FK key will be NULL, otherwise it will contain the session id.

But as I stated in the title I see a potential coherence issue with this solution. It is infact too easy to enter an invalid data by mistake linking an event to a session having a time window (login-logout) that does not include the event time itself.

My current solution:

In order to avoid potential errors, I'm thinking of not using any FK and using join queries based on timestamp values. In this way a session is linked to an event if the event timestamp is between the login and logout timestamps of a session.

What I'm asking for:

The question is about good design practices and tips about my current design.

I'd like to know if the choice of not using an FK makes sense for this scenario or violates the principles of good database design.

I'm worried about performance problems in data retrieval since the conceptual relation between sessions and events don't rely on FKs but on a "weak" conceptual relationship not explicitly modeled by my design.

side note: This is a very simple example and performance issues are not very likely. But the important bit is the concept. If a better design exists for this situation I'd like to learn it now and use the right design also for this simple scenario in order to avoid potential problems in more complex projects.

Aucun commentaire:

Enregistrer un commentaire