hydra
hydra copied to clipboard
Janitor does not clean up the sessions
I noticed that janitor $DSN --requests
cleans up hydra_oauth2_authentication_request
and hydra_oauth2_consent_request
tables, but not hydra_oauth2_authentication_session
. That latter one is the second largest table in our database right now. It does not seem to be a reason to keep those rows around.
Describe the solution you'd like
Remove the rows from hydra_oauth2_authentication_session
that are no longer needed.
- We should not delete a row if there is a foreign key to it from either
hydra_oauth2_authentication_request
orhydra_oauth2_consent_request
- We should not delete a row if the session has not yet expired
Additional context
The two queries to clean up hydra_oauth2_authentication_request
and hydra_oauth2_consent_request
tables are placed here. It seems fine to me to follow them by a new query to clean up the sessions as well:
var ls consent.LoginSession
err = p.Connection(ctx).RawQuery(fmt.Sprintf(`
DELETE
FROM %[1]s
WHERE NOT EXISTS
(
SELECT NULL
FROM %[2]s
WHERE %[2]s.login_session_id = %[1]s.id
)
AND NOT EXISTS
(
SELECT NULL
FROM %[3]s
WHERE %[3]s.login_session_id = %[1]s.id
)
AND authenticated_at < ?
AND authenticated_at < ?
`,
(&ls).TableName(),
(&lr).TableName(),
(&cr).TableName()),
time.Now().Add(-p.config.ConsentRequestMaxAge()),
notAfter).Exec()
I think time.Now().Add(-p.config.ConsentRequestMaxAge())
may not be the right constraint for sessions since they may be useable beyond the lifespan of consent challenges (or am I wrong?) but the rest should be fine.
It seems to me that the values in hydra_oauth2_authentication_session
are used for two different reasons:
- When doing a logout, this is use as id_token hint. It doesn't seem to be a big deal if hydra get's a request for an already deleted row here since the user is already technically logged out.
- When a user has
rememberFor
option set, this table is looked up to see if there is an active session with the corresponding session_id stored in cookie. If the session is not in the table, the user is required to reauthenticate.
The problem I see here is that rememberFor
is not kept in the database. Instead, it's set as the maxAge of the cookie. Is there a way to safely delete the rows then? Do we keep the rememberFor it in another table?
True, we don't keep that around. I think we should add that field to the table!
Does the table have any foreign keys on e.g. login or consent requests?
Both hydra_oauth2_authentication_request
and hydra_oauth2_consent_request
have a login_session_id
pointing to the session table.
It seems that remember_for
is kept in the hydra_oauth2_authentication_request_handled
table
Ok, I don't remember if the constraint is cascade delete or cascade set null. If it is the latter, this table would be safe for purging old records
It seems that it cascades deletes:
CONSTRAINT `hydra_oauth2_authentication_request_login_session_id_fk`
FOREIGN KEY (`login_session_id`)
REFERENCES `hydra_oauth2_authentication_session` (`id`)
ON DELETE CASCADE
I think we can delete the sessions with id
s returned from the following query before we cleanup the other two tables (requests and consents). Otherwise, we loose the connection between the hydra_oauth2_authentication_session
and hydra_oauth2_authentication_request_handled
that holds the remember_for
. This triggers a cascade delete that removes rows from hydra_oauth2_authentication_request
as well, which seems fine to me.
select s.id
from hydra_oauth2_authentication_session s,
hydra_oauth2_authentication_request r,
hydra_oauth2_authentication_request_handled h
where s.id = r.login_session_id
and r.challenge = h.challenge
and h.remember = 1
and now() - h.requested_at > h.remember_for;
The problem with the proposed query is that it would delete login session, which deletes authentication session, which deletes (I think) consent session, which deletes access tokens. So while the login session might be expired, it could still bel inked to an access token. This connection must be kept alive for the OIDC Front/Back-channel logouts to work.
Thus, I think that we can only safely delete all sessions that do not have foreign key references. If they do have foreign key references, it means they are still used.
Foreign keys references should be removed by the current janitor already.
What we would need however would be a expires_at
timestamp in the login_session
. Then we do basically delete all login sessions with expires_at < now() AND NOT EXISTS hydra_oauth2_authentication_request ...
Deletes from hydra_oauth2_authentication_session
are cascaded to hydra_oauth2_authentication_request
, but not to hydra_oauth2_consent_request
. So, tokens and other things dependent on consent should not be affected.
But you are right, sessions are not useful without hydra_oauth2_authentication_request[_handled]
. So, it seems fine to delete them.
I created a draft PR. Please have a look and let me know what you think about it. I can add more tests if it is needed.
One issue with the current implementation is this line:
https://github.com/ory/hydra/blob/0dda22d8106027c12833d04ec019ff10cfd9ab29/persistence/sql/persister_consent.go#L490
It should probably use remember_for
rather than ConsentRequestMaxAge()
.
Hello contributors!
I am marking this issue as stale as it has not received any engagement from the community or maintainers a year. That does not imply that the issue has no merit! If you feel strongly about this issue
- open a PR referencing and resolving the issue;
- leave a comment on it and discuss ideas how you could contribute towards resolving it;
- leave a comment and describe in detail why this issue is critical for your use case;
- open a new issue with updated details and a plan on resolving the issue.
Throughout its lifetime, Ory has received over 10.000 issues and PRs. To sustain that growth, we need to prioritize and focus on issues that are important to the community. A good indication of importance, and thus priority, is activity on a topic.
Unfortunately, burnout has become a topic of concern amongst open-source projects.
It can lead to severe personal and health issues as well as opening catastrophic attack vectors.
The motivation for this automation is to help prioritize issues in the backlog and not ignore, reject, or belittle anyone.
If this issue was marked as stale erroneous you can exempt it by adding the backlog
label, assigning someone, or setting a milestone for it.
Thank you for your understanding and to anyone who participated in the conversation! And as written above, please do participate in the conversation if this topic is important to you!
Thank you 🙏✌️
@aeneasr hello, we are currently working on this here - https://github.com/ory/hydra/pull/3133
so this has stalled? I really would need to clean up those sessions ... There are millions of rows there on a relatively small site.
Hello contributors!
I am marking this issue as stale as it has not received any engagement from the community or maintainers for a year. That does not imply that the issue has no merit! If you feel strongly about this issue
- open a PR referencing and resolving the issue;
- leave a comment on it and discuss ideas on how you could contribute towards resolving it;
- leave a comment and describe in detail why this issue is critical for your use case;
- open a new issue with updated details and a plan for resolving the issue.
Throughout its lifetime, Ory has received over 10.000 issues and PRs. To sustain that growth, we need to prioritize and focus on issues that are important to the community. A good indication of importance, and thus priority, is activity on a topic.
Unfortunately, burnout has become a topic of concern amongst open-source projects.
It can lead to severe personal and health issues as well as opening catastrophic attack vectors.
The motivation for this automation is to help prioritize issues in the backlog and not ignore, reject, or belittle anyone.
If this issue was marked as stale erroneously you can exempt it by adding the backlog
label, assigning someone, or setting a milestone for it.
Thank you for your understanding and to anyone who participated in the conversation! And as written above, please do participate in the conversation if this topic is important to you!
Thank you 🙏✌️