shared-housing
Transform Ratings Matrix to Similarity Graph via Cosine Similarity
Overview
This module should take as input the ratings matrix (or a similar data structure) and a threshold that determines whether two users count as "similar", and generate a similarity graph using cosine similarity.
Action Items
- Python module/package containing a top-level method for generating the similarity graph
Resources/Instructions
Build a graph, given a ratings matrix:
- Create a node for each user
- Create an edge between two users when the cosine similarity of their ratings vectors is greater than the specified threshold.
The ratings matrix is simply the collection of ratings "vectors", one per user. For example, three users rating two different intake questions (or living units) are represented as a 3x2 matrix (a row for each user, and a column for each question or living unit).
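To make the 3x2 example concrete, here is a minimal sketch using only the standard library. The rating values and the helper name `cosine_similarity` are made up for illustration, and returning 0.0 for an all-zero vector is an assumption, since this issue does not say how such users should be handled.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two ratings vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a user with no ratings is similar to nobody.
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical ratings: 3 users (rows) x 2 questions or living units (columns).
ratings = [
    [5, 1],
    [4, 2],
    [1, 5],
]

for i in range(len(ratings)):
    for j in range(i + 1, len(ratings)):
        sim = cosine_similarity(ratings[i], ratings[j])
        print(f"users {i} and {j}: {sim:.3f}")
```

With these values, users 0 and 1 score about 0.965, users 0 and 2 about 0.385, and users 1 and 2 about 0.614, so a threshold of 0.9 would connect only users 0 and 1.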
Note that this module is not affected by the business rules used to compute the ratings, and therefore it does not matter what information is represented by the columns in the matrix.
Suggested output data type: NetworkX Graph
Suggested computation library: scikit-learn's cosine similarity implementation (`sklearn.metrics.pairwise.cosine_similarity`)
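The build steps above, combined with the suggested libraries, could be sketched roughly as follows. This is only one possible shape for the top-level method: the function name `build_similarity_graph`, the use of row indices as node IDs, and the `weight` edge attribute are all assumptions, not requirements from this issue.

```python
import networkx as nx
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def build_similarity_graph(ratings, threshold):
    """Return a NetworkX graph with one node per user (row of `ratings`).

    Two users are connected when the cosine similarity of their ratings
    vectors is strictly greater than `threshold`.
    """
    ratings = np.asarray(ratings, dtype=float)
    n_users = ratings.shape[0]

    # Pairwise similarities between all rows; result is n_users x n_users.
    similarity = cosine_similarity(ratings)

    graph = nx.Graph()
    # Assumption: users are identified by their row index.
    graph.add_nodes_from(range(n_users))

    # Visit each unordered pair once (upper triangle, diagonal excluded).
    rows, cols = np.triu_indices(n_users, k=1)
    for i, j in zip(rows, cols):
        if similarity[i, j] > threshold:
            # Assumption: keep the similarity score as an edge weight.
            graph.add_edge(int(i), int(j), weight=float(similarity[i, j]))
    return graph

# Hypothetical ratings: 3 users x 2 questions or living units.
graph = build_similarity_graph([[5, 1], [4, 2], [1, 5]], threshold=0.9)
print(sorted(graph.edges()))
```

Because the function only sees rows and columns, it works the same whether the columns hold intake questions or living units. A caller with real user IDs could relabel the nodes afterwards, for example with `networkx.relabel_nodes`.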