In the Linux kernel, the following vulnerability has been resolved:
net/rds: No shortcut out of RDS_CONN_ERROR
RDS connections carry a state "rds_conn_path::cp_state" and transitions from one state to another and are conditional upon an expected state: "rds_conn_path_transition."
There is one exception to this conditionality, which is "RDS_CONN_ERROR" that can be enforced by "rds_conn_path_drop" regardless of what state the condition is currently in.
But as soon as a connection enters state "RDS_CONN_ERROR", the connection handling code expects it to go through the shutdown-path.
The RDS/TCP multipath changes added a shortcut out of "RDS_CONN_ERROR" straight back to "RDS_CONN_CONNECTING" via "rds_tcp_accept_one_path" (e.g. after "rds_tcp_state_change").
A subsequent "rds_tcp_reset_callbacks" can then transition the state to "RDS_CONN_RESETTING" with a shutdown-worker queued.
That'll trip up "rds_conn_init_shutdown", which was never adjusted to handle "RDS_CONN_RESETTING" and subsequently drops the connection with the dreaded "DR_INV_CONN_STATE", which leaves "RDS_SHUTDOWN_WORK_QUEUED" on forever.
So we do two things here:
a) Don't shortcut "RDS_CONN_ERROR", but take the longer path through the shutdown code.
b) Add "RDS_CONN_RESETTING" to the expected states in "rds_conn_init_shutdown" so that we won't error out and get stuck, if we ever hit weird state transitions like this again."
|