code-corps-api

Fix deadlocks caused by EctoOrdered

Open joshsmith opened this issue 7 years ago • 0 comments

Problem

Our new sync process revealed some issues caused by our use of EctoOrdered when there are a large number of writes. At the scale we need for highly concurrent GitHub syncing, we are hitting deadlocks in the database.

Here's an example of the errors:

2017-11-02T23:39:42.632335Z [error] Task #PID<0.22638.6> started from #PID<0.3430.0> terminating
** (Postgrex.Error) ERROR 40P01 (deadlock_detected): deadlock detected

Process 14260 waits for ShareLock on transaction 1981764; blocked by process 14258.
Process 14258 waits for ShareLock on transaction 1981765; blocked by process 14260.
    (ecto) lib/ecto/adapters/sql.ex:440: Ecto.Adapters.SQL.execute_or_reset/7
    (ecto_ordered) lib/ecto_ordered.ex:242: EctoOrdered.increment_other_ranks/2
    (ecto_ordered) lib/ecto_ordered.ex:172: EctoOrdered.ensure_unique_position/2
    (ecto) lib/ecto/repo/schema.ex:456: anonymous fn/2 in Ecto.Repo.Schema.run_prepare/2
    (elixir) lib/enum.ex:1811: Enum."-reduce/3-lists^foldl/2-0-"/3
    (ecto) lib/ecto/repo/schema.ex:268: anonymous fn/14 in Ecto.Repo.Schema.do_update/4
    (ecto) lib/ecto/repo/schema.ex:768: anonymous fn/3 in Ecto.Repo.Schema.wrap_in_transaction/6
    (ecto) lib/ecto/adapters/sql.ex:576: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
    (db_connection) lib/db_connection.ex:1275: DBConnection.transaction_run/4
    (db_connection) lib/db_connection.ex:1199: DBConnection.run_begin/3
    (db_connection) lib/db_connection.ex:790: DBConnection.transaction/3
    (code_corps) lib/code_corps/github/sync/issue/task/task.ex:51: CodeCorps.GitHub.Sync.Issue.Task.find_or_create_task/2
    (elixir) lib/enum.ex:1255: Enum."-map/2-lists^map/1-0-"/2
    (elixir) lib/enum.ex:1255: Enum."-map/2-lists^map/1-0-"/2
    (code_corps) lib/code_corps/github/sync/issue/task/task.ex:45: CodeCorps.GitHub.Sync.Issue.Task.sync_project_github_repo/1
    (code_corps) lib/code_corps/github/sync/sync.ex:178: CodeCorps.GitHub.Sync.sync_project_github_repo/1
    (elixir) lib/task/supervised.ex:85: Task.Supervised.do_apply/2
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Function: #Function<3.4867021/0 in CodeCorpsWeb.ProjectGithubRepoController.create/2>
    Args: []

One approach is to rewrite our card syncing to use a single float position column and an ordering scheme like this one, with most of the operations happening on the server itself.
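
To make the float idea concrete, here is a minimal sketch of how a new position could be computed from a card's neighbours. It is written in Python for brevity rather than Elixir, and it is not EctoOrdered's API or a finished design: the function name `position_between` and the `STEP` gap are assumptions made up for illustration. The point is that placing a card only needs to write that card's own row, whereas the `increment_other_ranks` call in the trace above has to update the ranks of other rows as well.

```python
from typing import Optional

# Hypothetical gap used when a card goes to the start or end of a list.
STEP = 1024.0


def position_between(before: Optional[float], after: Optional[float]) -> float:
    """Return a position that sorts strictly between two neighbours.

    `before` / `after` are the positions of the adjacent cards, or None
    when there is no card on that side.
    """
    if before is None and after is None:
        # Empty list: any starting value works.
        return STEP
    if before is None:
        # New first card: place it one gap ahead of the current first card.
        return after - STEP
    if after is None:
        # New last card: place it one gap past the current last card.
        return before + STEP
    if not before < after:
        raise ValueError("before must sort ahead of after")
    # Middle of the list: take the midpoint, leaving both neighbours untouched.
    return (before + after) / 2.0


if __name__ == "__main__":
    # Dropping a card between positions 1024.0 and 2048.0 only writes one row.
    print(position_between(1024.0, 2048.0))
```

One caveat with this kind of scheme: repeatedly taking midpoints in the same gap eventually runs out of float precision, so an occasional rebalancing pass that respaces the positions in a list would still be needed.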

joshsmith • Nov 03 '17 03:11