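Purpose of this post

It is often said that lazy loading (FetchType.LAZY) in JPA can be used to address the N+1 problem. But how does it behave when used with a DBMS whose isolation level is repeatable read or read committed? This post tries to answer that question.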

Common scenario

Suppose there are two transactions, A and B.

A selects rows 1 through 10 of table T. These rows, however, are to be lazily loaded.
B updates row 5 of table T and commits.
A reads row 5 of table T.
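To make "selected, but lazily loaded" concrete, here is a minimal plain-Java sketch of the idea. It does not use JPA itself: LazyRow, LazyScenario, TABLE_T and selectById are hypothetical stand-ins for a lazy proxy, the table, and the query. The only point is that creating the reference issues no query; the query runs on first access.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a lazy proxy: the loader runs on first access, not at creation.
final class LazyRow {
    private final Supplier<String> loader;
    private String value;
    private boolean loaded;

    LazyRow(Supplier<String> loader) {
        this.loader = loader;
    }

    boolean isLoaded() {
        return loaded;
    }

    String get() {
        if (!loaded) {              // the "SELECT" is issued here, on first access
            value = loader.get();
            loaded = true;
        }
        return value;               // later calls reuse the loaded value
    }
}

final class LazyScenario {
    // Stand-in for table T: row id -> value.
    static final Map<Integer, String> TABLE_T = new HashMap<>();
    // Counts how many simulated queries were actually sent.
    static int selectCount = 0;

    static String selectById(int id) {
        selectCount++;
        return TABLE_T.get(id);
    }

    public static void main(String[] args) {
        TABLE_T.put(5, "row 5");
        LazyRow row5 = new LazyRow(() -> selectById(5)); // step 1: only a reference exists
        System.out.println("queries after step 1: " + selectCount);
        row5.get();                                      // step 3: first real access
        System.out.println("queries after step 3: " + selectCount);
    }
}
```

From the DBMS's point of view, step 1 of the scenario never happened for row 5; it only sees the query triggered in step 3.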

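The repeatable read case

Since the isolation level is repeatable read, A should see the contents from before B's update, not the updated contents. But the select query was sent to the DBMS late in the first place, so how could the DBMS know in advance that this row would be read, and preserve it? … Thinking it through, the answer starts to show itself right here.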

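How is repeatable read guaranteed in the first place?

It is guaranteed through row-level version control (in DBMSs that work this way, this is commonly called multi-version concurrency control, MVCC).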

If another transaction (Transaction B) updates that row, the DBMS creates a new version of the row rather than overwriting the original one.
As long as Transaction A is still active, it will continue to see the old version of the row that was present when its transaction began. This prevents non-repeatable reads.

Looking at my scenario again, this is what it actually looks like from the DBMS's point of view:

Transaction A starts (the DBMS has no information at all about the rows that will be lazily selected)
Transaction B updates the row and commits
Transaction A reads the updated row

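At this point, the DBMS returns the old version based on Transaction A's timestamp. The DBMS does not need to know in advance which rows A will read; it only needs to keep old row versions around while A is active. Note that this assumes A's snapshot was established before B committed: in some engines, such as MySQL InnoDB and PostgreSQL, the snapshot is taken at the transaction's first read rather than at the moment the transaction is opened.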
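The version lookup can be sketched as a toy model in plain Java. This is an illustration under simplifying assumptions, not how any particular DBMS implements it: VersionedRow, RepeatableReadSketch and the integer timestamps are hypothetical, and A's snapshot is assumed to predate B's commit.

```java
import java.util.ArrayList;
import java.util.List;

// Toy multi-version row: every committed update appends a version instead of overwriting.
final class VersionedRow {
    private static final class Version {
        final long commitTs;
        final String value;

        Version(long commitTs, String value) {
            this.commitTs = commitTs;
            this.value = value;
        }
    }

    // Versions are appended in ascending commit-timestamp order.
    private final List<Version> versions = new ArrayList<>();

    void commit(long commitTs, String value) {
        versions.add(new Version(commitTs, value));
    }

    // Returns the newest version committed at or before the reader's snapshot timestamp.
    String readAsOf(long snapshotTs) {
        String visible = null;
        for (Version v : versions) {
            if (v.commitTs <= snapshotTs) {
                visible = v.value;
            }
        }
        return visible;
    }
}

final class RepeatableReadSketch {
    static String run() {
        VersionedRow row5 = new VersionedRow();
        row5.commit(1, "old");               // state before either transaction acts
        long snapshotOfA = 2;                // A's snapshot (assumed to predate B's commit)
        row5.commit(3, "new");               // B updates row 5 and commits
        return row5.readAsOf(snapshotOfA);   // A's deferred (lazy) read uses its snapshot
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "old"
    }
}
```

The lazy read arrives after B's commit, yet the lookup is keyed by A's snapshot timestamp, so when the query was actually sent does not matter.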

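The read committed case

Everything is the same as above except that the isolation level is read committed. Here, unexpected behavior occurs.

When JPA's lazy loading is used with a DBMS whose isolation level is read committed, the result for a row that has been selected but not yet accessed, i.e. one for which only a reference exists, can differ depending on when it is actually accessed!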

A selects rows 1 through 10 of table T. - At this point, row 5 holds the old value.
B updates row 5 of table T and commits.
A reads row 5 of table T. - At this point, row 5 holds the updated value.

Logically, returning the old value seems right, but because lazy loading was chosen, the result differs from what eager loading would have produced.
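The difference between eager and lazy access under read committed can be sketched in plain Java. This is a toy model, not JPA or a real DBMS: ReadCommittedSketch and Row are hypothetical, and the model assumes each read simply sees the latest committed value at the moment the read is executed.

```java
import java.util.function.Supplier;

final class ReadCommittedSketch {
    // Stand-in for row 5 of table T; holds only the latest committed value.
    static final class Row {
        String committed;

        Row(String committed) {
            this.committed = committed;
        }
    }

    // Returns { value seen with eager loading, value seen with lazy loading } for Transaction A.
    static String[] run() {
        Row row5 = new Row("old");

        String eager = row5.committed;                 // eager: the read happens at step 1
        Supplier<String> lazy = () -> row5.committed;  // lazy: only a reference exists at step 1

        row5.committed = "new";                        // step 2: B updates row 5 and commits

        String lazyValue = lazy.get();                 // step 3: the read actually happens now
        return new String[] { eager, lazyValue };
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("eager: " + seen[0]); // prints "eager: old"
        System.out.println("lazy:  " + seen[1]); // prints "lazy:  new"
    }
}
```

Both variants run the same logical steps; only the moment at which the read reaches the data differs, and under read committed that moment decides which value A sees.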

Conclusion

Thanks to working through this question, I now understand a little better how a DBMS guarantees its isolation levels!

JPA lazy loading and read committed