Transform Ratings Matrix to Similarity Graph via Cosine Similarity

Open tylerthome opened this issue 5 years ago • 0 comments

Overview

This module should take as input the ratings matrix (or a similar data structure) and a threshold that determines whether two users are "similar", and generate a similarity graph using cosine similarity.

Action Items

  • Python module/package containing a top-level method for generating the similarity graph

Resources/Instructions

Build a graph, given a ratings matrix:

  1. Create a node for each user
  2. Create an edge between two users when the cosine similarity of their ratings vectors is greater than the specified threshold.

The ratings matrix is simply the collection of ratings "vectors", one per user. For example, three users rating two different intake questions (or living units) are represented as a 3x2 matrix (a row for each user, and a column for each question or living unit).
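As a rough illustration of the two steps and the 3x2 example above, here is a dependency-free sketch. The function names (`cosine_similarity`, `build_adjacency`), the sample ratings, the 0.9 threshold, and the choice to treat an all-zero vector as similar to nobody are all assumptions made for illustration, not part of this issue. The graph here is a plain adjacency dict rather than the suggested output type.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length ratings vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a user with an all-zero vector is similar to nobody.
        return 0.0
    return dot / (norm_u * norm_v)


def build_adjacency(ratings, threshold):
    """Return {user_index: set of user indices considered similar}."""
    # Step 1: one node per user (one row of the ratings matrix).
    graph = {i: set() for i in range(len(ratings))}
    # Step 2: an undirected edge when similarity is strictly above the threshold.
    for i, j in combinations(range(len(ratings)), 2):
        if cosine_similarity(ratings[i], ratings[j]) > threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Hypothetical 3x2 ratings matrix: three users, two questions (or living units).
ratings = [
    [5, 1],
    [4, 1],
    [1, 5],
]
adjacency = build_adjacency(ratings, threshold=0.9)
```

With these sample values, only the first two users end up connected, since their vectors point in nearly the same direction.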

Note that this module is not affected by the business rules used to compute the ratings, and therefore it does not matter what information is represented by the columns in the matrix.

Suggested output data type: NetworkX Graph

Suggested computation libraries: scikit-learn cosine similarity implementation
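One possible shape for the top-level function using the suggested libraries is sketched below. It assumes `sklearn.metrics.pairwise.cosine_similarity` for the pairwise computation and `networkx.Graph` for the output. The function name `similarity_graph`, the use of row indices as node identifiers, and the sample data are hypothetical.

```python
import networkx as nx
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


def similarity_graph(ratings, threshold):
    """Build an undirected graph linking users whose ratings are similar.

    ratings: array-like of shape (n_users, n_columns).
    threshold: an edge is added when cosine similarity is strictly greater.
    """
    ratings = np.asarray(ratings, dtype=float)
    # Pairwise similarities between all rows, shape (n_users, n_users).
    sims = cosine_similarity(ratings)

    graph = nx.Graph()
    # Assumption: nodes are identified by row index in the ratings matrix.
    graph.add_nodes_from(range(ratings.shape[0]))

    # Keep only the strict upper triangle so each pair is considered once
    # and self-similarity on the diagonal is ignored.
    rows, cols = np.where(np.triu(sims > threshold, k=1))
    graph.add_edges_from(zip(rows.tolist(), cols.tolist()))
    return graph


# Hypothetical usage with a 3x2 ratings matrix.
g = similarity_graph([[5, 1], [4, 1], [1, 5]], threshold=0.9)
```

Because the function only looks at rows of numbers, it stays independent of whatever business rules produced the ratings.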

tylerthome, Oct 15 '19 18:10