Brian Wylie

Results 248 issues of Brian Wylie

FAISS (Facebook Research): https://github.com/facebookresearch/faiss
Blog about similarity search: https://towardsdatascience.com/similarity-search-knn-inverted-file-index-7cab80cc0e79

algorithm
research

https://xgboost.readthedocs.io/en/stable/tutorials/categorical.html

model
research

Version 1.3.dev0 of scikit-learn has a new set_output API that lets pipeline steps take and return pandas DataFrames: https://scikit-learn.org/dev/auto_examples/miscellaneous/plot_set_output.html#sphx-glr-auto-examples-miscellaneous-plot-set-output-py

api
transform
algorithm
pandas
research

We need to think about what domain specific functionality is needed when creating/generating feature sets. The obvious one is that each domain will need a separate set of Python classes...

domain

https://github.com/YingfanWang/PaCMAP/blob/master/demo/basic_demo.py

algorithm

https://dashaggrid.pythonanywhere.com/
Standard Dash Table: https://dash.plotly.com/datatable
Dash AG Table Comparison: https://youtu.be/dovf4FwtwPg?t=1862
Dash AG Table Info/Install: https://dash.plotly.com/dash-ag-grid

web_interface
application
architecture
web_view

Let's take a deeper dive into Athena Views and see how/where they fit into SageWorks: https://docs.aws.amazon.com/athena/latest/ug/views.html

aws_service_broker
data_source
feature_set
architecture
athena
web_view

We should optimize the SQL query for Value Counts in the same way we did for column_stats()
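
The issue does not say how column_stats() was optimized. Assuming the idea was to batch per-column work into a single query instead of issuing one query per column, a rough sketch for value counts could look like the following. The function names are hypothetical, and sqlite3 stands in for Athena so the example runs locally; the CAST type and identifier quoting would need adjusting for Athena.

```python
import sqlite3


def build_value_counts_sql(table, columns):
    """Build ONE query returning (column_name, val, n) rows for all columns.

    Hypothetical helper: table/column names are assumed to be trusted identifiers.
    """
    selects = []
    for col in columns:
        # One GROUP BY per column, tagged with the column name
        selects.append(
            f"SELECT '{col}' AS column_name, CAST(\"{col}\" AS TEXT) AS val, COUNT(*) AS n "
            f"FROM \"{table}\" GROUP BY \"{col}\""
        )
    # Stitch the per-column selects together so the engine is hit only once
    return " UNION ALL ".join(selects) + " ORDER BY column_name, n DESC, val"


def value_counts(conn, table, columns):
    """Run the batched query and return {column: {value: count}}."""
    results = {col: {} for col in columns}
    for column_name, val, n in conn.execute(build_value_counts_sql(table, columns)):
        results[column_name][val] = n
    return results


# Small in-memory table to exercise the query
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 TEXT, c2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", "x"), ("a", "y"), ("b", "x"), ("a", "x")],
)

counts = value_counts(conn, "t", ["c1", "c2"])
print(counts)
```

On a service with noticeable per-query overhead, a single round trip like this may be cheaper than one query per column, though that would need measuring against the real tables.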

algorithm
performance
SQL

We have the AthenaSource, so we need to make an RDSSource that hits an AWS RDS database.

api
aws_service_broker
data_source
architecture

It might be fun to think about a Zeek application prototype that uses SageWorks to quickly build an application on top of AWS ML services.

application