consider a more sklearn-like pipeline approach
-
break out all the active learning bits into a separate class or multiple separate classes
-
train a blocking model using the familiar fit_transform syntax. this is a separate class that emits a stream of pairs. (open question: can this really fit the sklearn pattern?)
-
train a classification model using fit_transform. this takes in a stream of pairs and emits a stream of classification decisions.
actually, this all would work quite well.
https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
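a rough sketch of how the two steps above might chain in a `Pipeline`. everything named here (`PrefixBlocker`, `PairClassifier`, the `similarity` helper, the `labeled_pairs` / `labels` parameters, the toy records) is hypothetical and only meant to show the shape; the prefix rule and the single string-similarity feature are stand-ins, not a proposal for the real blocking or comparison logic.

```python
from difflib import SequenceMatcher
from itertools import combinations

from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def similarity(a, b):
    """String similarity in [0, 1]; a stand-in for real field comparators."""
    return SequenceMatcher(None, a, b).ratio()


class PrefixBlocker(BaseEstimator, TransformerMixin):
    """Hypothetical blocker: records sharing a prefix become candidate pairs."""

    def __init__(self, prefix_len=3):
        self.prefix_len = prefix_len

    def fit(self, X, y=None):
        # A real blocker would learn its blocking rules here; this one is fixed.
        return self

    def transform(self, X):
        blocks = {}
        for record in X:
            blocks.setdefault(record[: self.prefix_len].lower(), []).append(record)
        # Emit every within-block pair as a candidate.
        return [pair for members in blocks.values() for pair in combinations(members, 2)]


class PairClassifier(BaseEstimator, TransformerMixin):
    """Hypothetical classifier: trained on labeled pairs, emits (pair, decision)."""

    def __init__(self, labeled_pairs=(), labels=()):
        # Assumption: the labeled pairs come from a separate active learning step.
        self.labeled_pairs = labeled_pairs
        self.labels = labels

    def fit(self, X, y=None):
        features = [[similarity(a, b)] for a, b in self.labeled_pairs]
        self.model_ = LogisticRegression().fit(features, list(self.labels))
        return self

    def transform(self, X):
        pairs = list(X)
        if not pairs:
            return []
        features = [[similarity(a, b)] for a, b in pairs]
        return list(zip(pairs, self.model_.predict(features)))


records = ["Jon Smith", "Jon Smyth", "Jon Baker", "Mary Jones", "Mary Jonse"]

pipeline = Pipeline([
    ("block", PrefixBlocker(prefix_len=3)),
    ("classify", PairClassifier(
        labeled_pairs=[
            ("Ann Lee", "Anne Lee"),
            ("Bob Stone", "Bob Stone"),
            ("Ann Lee", "Bob Stone"),
            ("Carl Diaz", "Tom Hughes"),
        ],
        labels=[1, 1, 0, 0],
    )),
])

decisions = pipeline.fit_transform(records)
for pair, is_match in decisions:
    print(pair, int(is_match))
```

one wrinkle this exposes: the blocker changes the number of rows (records in, pairs out), so a per-record `y` can't be passed through `Pipeline.fit` and stay aligned. that is why the labeled pairs are handed to the classifier's constructor in this sketch, which itself bends the sklearn convention of keeping data out of `__init__`.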
we can think of blocking as related to clustering, and use sklearn's clustering estimators as inspiration.
https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html#sklearn.cluster.KMeans
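to make the clustering analogy concrete, here is a minimal sketch of a blocker that borrows the `KMeans`-style interface: `fit` stores a `labels_` array with one block id per record, and `fit_predict` comes for free from `ClusterMixin`. the class `PrefixBlockClusterer`, the helper `pairs_from_labels`, and the first-letter blocking rule are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin


class PrefixBlockClusterer(BaseEstimator, ClusterMixin):
    """Hypothetical blocker with a clustering-style API (labels_, fit_predict)."""

    def __init__(self, prefix_len=1):
        self.prefix_len = prefix_len

    def fit(self, X, y=None):
        block_ids = {}
        labels = []
        for record in X:
            key = record[: self.prefix_len].lower()
            # Assign the next unused integer id the first time a key is seen.
            labels.append(block_ids.setdefault(key, len(block_ids)))
        self.block_ids_ = block_ids
        self.labels_ = np.array(labels)
        return self


def pairs_from_labels(labels):
    """Turn per-record block labels into candidate pairs of record indices."""
    members = defaultdict(list)
    for index, label in enumerate(labels):
        members[int(label)].append(index)
    return [pair for group in members.values() for pair in combinations(group, 2)]


records = ["apple inc", "Apple Incorporated", "banana co", "Banana Company", "cherry ltd"]
block_labels = PrefixBlockClusterer(prefix_len=1).fit_predict(records)
candidate_pairs = pairs_from_labels(block_labels)
print(block_labels.tolist(), candidate_pairs)
```

the analogy is probably not exact: a single `labels_` array puts each record in exactly one block, whereas a blocker may want a record to land in several blocks at once, so the real interface might need to return something richer than one label per record.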