pyrs
Using pyrs with monkeytype for type inference
Hello, I would like to show the results I obtained using pyrs together with monkeytype, and the process I followed.
These are the first commands I used:
cd typing
echo 'layout python3' > .envrc
direnv allow
git clone https://github.com/chonyy/fpgrowth_py.git
git clone https://github.com/Instagram/MonkeyType.git
git clone https://github.com/konchunas/pyrs.git
cd MonkeyType
python3 -m pip install -e .
cd ../fpgrowth_py
monkeytype run run.py
monkeytype list-modules # optional step that shows what `apply` accepts
monkeytype apply fpgrowth_py.utils
Here `git clone https://github.com/chonyy/fpgrowth_py.git` fetches the project being converted, but it could be any project.
After that I ran
monkeytype apply fpgrowth_py.fpgrowth
but I ran into a problem:
  File "/home/flip111/typing/fpgrowth_py/fpgrowth_py/utils.py", line 7, in Node
    def __init__(self, itemName: str, frequency: int, parentNode: Optional[Node]) -> None:
NameError: name 'Node' is not defined
I worked around this by temporarily removing the `: Optional[Node]` annotation, running monkeytype again, and then putting the annotation back.
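The NameError comes from Python evaluating the annotation while the body of `class Node` is still executing, so the name `Node` is not bound yet. A workaround that does not require removing the annotation is to quote the forward reference. The following is a minimal sketch with a trimmed-down `Node` class (the attribute names are taken from the diff further down; the real class in fpgrowth_py has more to it, and the sample values are made up):

```python
from typing import Optional


class Node:
    # Quoting "Node" turns the annotation into a forward reference, so it is
    # not looked up while the class body is still being executed.
    def __init__(self, itemName: str, frequency: int, parentNode: Optional["Node"]) -> None:
        self.itemName = itemName
        self.count = frequency
        self.parent = parentNode


# Hypothetical usage: a root node and one child pointing back at it.
root = Node("Null", 1, None)
child = Node("milk", 1, root)
```

Putting `from __future__ import annotations` at the very top of utils.py should have the same effect without quoting. I have not checked how pyrs handles either form.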
After that it was time for pyrs
python3 -m pyrs fpgrowth_py/fpgrowth_py/fpgrowth.py > fpgrowth_py/fpgrowth_py/fpgrowth.rs
python3 -m pyrs fpgrowth_py/fpgrowth_py/utils.py > fpgrowth_py/fpgrowth_py/utils.rs
rustfmt fpgrowth_py/fpgrowth_py/fpgrowth.rs
rustfmt fpgrowth_py/fpgrowth_py/utils.rs
`rustfmt` then complains:
» rustfmt fpgrowth_py/fpgrowth_py/utils.rs
error: unexpected closing delimiter: `}`
--> /home/flip111/typing/fpgrowth_py/fpgrowth_py/utils.rs:48:1
|
34 | fn getFromFile<T0, RT>(fname: T0) -> RT {
| - this opening brace...
...
46 | }
| - ...matches this closing brace
47 | return (itemSetList, frequency);
48 | }
| ^ unexpected closing delimiter
This happens because the Python source code
def getFromFile(fname):
    itemSetList = []
    frequency = []
    with open(fname, 'r') as file:
        csv_reader = reader(file)
        for line in csv_reader:
            line = list(filter(None, line))
            itemSetList.append(line)
            frequency.append(1)
    return itemSetList, frequency
got translated into the following (I indented it to make this post easier to read):
fn getFromFile<T0, RT>(fname: T0) -> RT {
    let mut itemSetList = vec![];
    let mut frequency = vec![];
    // with!(open(fname, "r") as file) //unsupported
    {
        let csv_reader = reader(file);
    }
    for line in csv_reader {
        line = line.into_iter().filter(None).collect::<Vec<_>>();
        itemSetList.push(line);
        frequency.push(1);
    }
}
return (itemSetList, frequency);
}
The output contains the unsupported `with` language construct together with a file open. Also, for some reason an extra closing brace got introduced after `let csv_reader = reader(file);`. I manually fixed this into:
fn getFromFile<T0, RT>(fname: T0) -> RT {
    let mut itemSetList = vec![];
    let mut frequency = vec![];
    // with!(open(fname, "r") as file) //unsupported
    if let Ok(file) = std::fs::File::open(fname) {
        let csv_reader = reader(file);
        for line in csv_reader {
            line = line.into_iter().filter(None).collect::<Vec<_>>();
            itemSetList.push(line);
            frequency.push(1);
        }
    }
    return (itemSetList, frequency);
}
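To have something to compare a hand-fixed port against, it can help to pin down what the original function returns for a small input. The sketch below restates the `getFromFile` logic in self-contained form (snake_case names, the `demo` helper and the sample CSV contents are my own assumptions, not part of fpgrowth_py):

```python
import csv
import os
import tempfile


def get_from_file(fname):
    """Same logic as getFromFile above: one item list and a count of 1 per CSV row."""
    item_set_list = []
    frequency = []
    with open(fname, "r", newline="") as file:
        for line in csv.reader(file):
            # filter(None, ...) drops empty fields, e.g. those left by stray commas
            item_set_list.append(list(filter(None, line)))
            frequency.append(1)
    return item_set_list, frequency


def demo():
    # Hypothetical sample data, written to a temporary directory for the check
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "sample.csv")
        with open(path, "w", newline="") as f:
            f.write("milk,bread,\nbeer,,diapers\n")
        return get_from_file(path)
```

One difference worth noting: when the file cannot be opened, the Python version raises `FileNotFoundError`, while the `if let Ok(file)` in the manual Rust fix skips the loop and returns two empty lists.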
I had previously run this process without the monkeytype step, so I was able to diff the two versions of the resulting Rust source code. Here is a diff of the `pygrowth.rs` file, which shows that with monkeytype a lot more concrete types end up in the Rust output:
9,11c9,11
< itemName: ST0,
< count: ST1,
< parent: ST2,
---
> itemName: &str,
> count: i32,
> parent: Option<Node>,
17c17
< fn __init__<T0, T1, T2>(&self, itemName: T0, frequency: T1, parentNode: T2) {
---
> fn __init__(&self, itemName: &str, frequency: i32, parentNode: Option<Node>) {
24c24
< fn increment<T0>(&self, frequency: T0) {
---
> fn increment(&self, frequency: i32) {
54c54,58
< fn constructTree<T0, T1, T2, RT>(itemSetList: T0, frequency: T1, minSup: T2) -> RT {
---
> fn constructTree(
> itemSetList: Vec<Union<Any, Vec<&str>>>,
> frequency: Vec<Union<Any, i32>>,
> minSup: f32,
> ) -> Union<(None, None), (Node, HashMap<&str, Vec<Union<i32, Node>>>)> {
92c96,103
< fn updateHeaderTable<T0, T1, T2>(item: T0, targetNode: T1, headerTable: T2) {
---
> fn updateHeaderTable(
> item: &str,
> targetNode: Node,
> headerTable: HashMap<
> &str,
> Union<Vec<Option<i32>>, Vec<Option<Union<i32, Node>>>, Vec<Union<i32, Node>>>,
> >,
> ) {
103c114,122
< fn updateTree<T0, T1, T2, T3, RT>(item: T0, treeNode: T1, headerTable: T2, frequency: T3) -> RT {
---
> fn updateTree(
> item: &str,
> treeNode: Node,
> headerTable: HashMap<
> &str,
> Union<Vec<Option<i32>>, Vec<Option<Union<i32, Node>>>, Vec<Union<i32, Node>>>,
> >,
> frequency: i32,
> ) -> Node {
113c132
< fn ascendFPtree<T0, T1>(node: T0, prefixPath: T1) {
---
> fn ascendFPtree(node: Node, prefixPath: Vec<Union<Any, &str>>) {
119c138,141
< fn findPrefixPath<T0, T1, RT>(basePat: T0, headerTable: T1) -> RT {
---
> fn findPrefixPath(
> basePat: &str,
> headerTable: HashMap<&str, Vec<Union<i32, Node>>>,
> ) -> Union<(Vec<Any>, Vec<Any>), (Vec<Vec<&str>>, Vec<i32>)> {
134c156,161
< fn mineTree<T0, T1, T2, T3>(headerTable: T0, minSup: T1, preFix: T2, freqItemList: T3) {
---
> fn mineTree(
> headerTable: HashMap<&str, Vec<Union<i32, Node>>>,
> minSup: f32,
> preFix: Set<&str>,
> freqItemList: Vec<Union<Set<&str>, Any>>,
> ) {
151c178
< fn powerset<T0, RT>(s: T0) -> RT {
---
> fn powerset(s: Set<&str>) -> chain {
159c186
< fn getSupport<T0, T1, RT>(testSet: T0, itemSetList: T1) -> RT {
---
> fn getSupport(testSet: Union<Set<&str>, (&str)>, itemSetList: Vec<Vec<&str>>) -> i32 {
168c195,199
< fn associationRule<T0, T1, T2, RT>(freqItemSet: T0, itemSetList: T1, minConf: T2) -> RT {
---
> fn associationRule(
> freqItemSet: Vec<Set<&str>>,
> itemSetList: Vec<Vec<&str>>,
> minConf: f32,
> ) -> Vec<Vec<Union<Set<&str>, f32>>> {
182c213
< fn getFrequencyFromList<T0, RT>(itemSetList: T0) -> RT {
---
> fn getFrequencyFromList(itemSetList: Vec<Vec<&str>>) -> Vec<i32> {
After this I copied the two new source files into a new project:
cargo new fpgrowth_rs
cp fpgrowth_py/fpgrowth_py/fpgrowth.rs fpgrowth_rs/src
cp fpgrowth_py/fpgrowth_py/utils.rs fpgrowth_rs/src
cd fpgrowth_rs
I added a module declaration for fpgrowth to src/main.rs:
mod fpgrowth;
fn main() {
    println!("Hello, world!");
}
I then tried to fix the source files with clippy
cargo clippy --fix --allow-dirty
Clippy reported the following errors
» cargo clippy --fix --allow-dirty
Checking fpgrowth_rs v0.1.0 (/home/flip111/typing/fpgrowth_rs)
error[E0433]: failed to resolve: use of undeclared crate or module `fpgrowth_py`
--> src/fpgrowth.rs:6:5
|
6 | use fpgrowth_py::utils::*;
| ^^^^^^^^^^^ use of undeclared crate or module `fpgrowth_py`
|
help: there is a crate or module with a similar name
|
6 | use fpgrowth::utils::*;
| ~~~~~~~~
error[E0432]: unresolved imports `collections::defaultdict`, `collections::OrderedDict`
--> src/fpgrowth.rs:4:19
|
4 | use collections::{defaultdict, OrderedDict};
| ^^^^^^^^^^^ ^^^^^^^^^^^ no `OrderedDict` in `collections`
| |
| no `defaultdict` in `collections`
error[E0432]: unresolved import `csv`
--> src/fpgrowth.rs:5:5
|
5 | use csv::reader;
| ^^^ use of undeclared crate or module `csv`
error[E0432]: unresolved import `itertools`
--> src/fpgrowth.rs:7:5
|
7 | use itertools::{chain, combinations};
| ^^^^^^^^^ use of undeclared crate or module `itertools`
error[E0432]: unresolved import `optparse`
--> src/fpgrowth.rs:8:5
|
8 | use optparse::OptionParser;
| ^^^^^^^^ use of undeclared crate or module `optparse`
error[E0412]: cannot find type `Set` in this scope
--> src/fpgrowth.rs:14:11
|
14 | ) -> (Vec<Set<&str>>, Vec<Vec<Union<Set<&str>, f32>>>) {
| ^^^ not found in this scope
error[E0412]: cannot find type `Union` in this scope
--> src/fpgrowth.rs:14:31
|
14 | ) -> (Vec<Set<&str>>, Vec<Vec<Union<Set<&str>, f32>>>) {
| ^^^^^ not found in this scope
|
help: consider importing one of these items
|
1 | use crate::fpgrowth::collections::btree_set::Union;
|
1 | use crate::fpgrowth::collections::hash_set::Union;
|
1 | use std::collections::btree_set::Union;
|
1 | use std::collections::hash_set::Union;
|
error[E0412]: cannot find type `Set` in this scope
--> src/fpgrowth.rs:14:37
|
14 | ) -> (Vec<Set<&str>>, Vec<Vec<Union<Set<&str>, f32>>>) {
| ^^^ not found in this scope
error[E0425]: cannot find function `getFrequencyFromList` in this scope
--> src/fpgrowth.rs:15:21
|
15 | let frequency = getFrequencyFromList(itemSetList);
| ^^^^^^^^^^^^^^^^^^^^ not found in this scope
error[E0425]: cannot find function `constructTree` in this scope
--> src/fpgrowth.rs:17:33
|
17 | let (fpTree, headerTable) = constructTree(itemSetList, frequency, minSup);
| ^^^^^^^^^^^^^ not found in this scope
error[E0425]: cannot find function `mineTree` in this scope
--> src/fpgrowth.rs:22:9
|
22 | mineTree(headerTable, minSup, set(), freqItems);
| ^^^^^^^^ not found in this scope
error[E0425]: cannot find function `set` in this scope
--> src/fpgrowth.rs:22:39
|
22 | mineTree(headerTable, minSup, set(), freqItems);
| ^^^ not found in this scope
error[E0425]: cannot find function `associationRule` in this scope
--> src/fpgrowth.rs:23:21
|
23 | let rules = associationRule(freqItems, itemSetList, minConf);
| ^^^^^^^^^^^^^^^ not found in this scope
error[E0425]: cannot find function `getFromFile` in this scope
--> src/fpgrowth.rs:28:36
|
28 | let (itemSetList, frequency) = getFromFile(fname);
| ^^^^^^^^^^^ not found in this scope
error[E0425]: cannot find function `constructTree` in this scope
--> src/fpgrowth.rs:30:33
|
30 | let (fpTree, headerTable) = constructTree(itemSetList, frequency, minSup);
| ^^^^^^^^^^^^^ not found in this scope
error[E0425]: cannot find function `mineTree` in this scope
--> src/fpgrowth.rs:35:9
|
35 | mineTree(headerTable, minSup, set(), freqItems);
| ^^^^^^^^ not found in this scope
error[E0425]: cannot find function `set` in this scope
--> src/fpgrowth.rs:35:39
|
35 | mineTree(headerTable, minSup, set(), freqItems);
| ^^^ not found in this scope
error[E0425]: cannot find function `associationRule` in this scope
--> src/fpgrowth.rs:36:21
|
36 | let rules = associationRule(freqItems, itemSetList, minConf);
| ^^^^^^^^^^^^^^^ not found in this scope
warning: unused import: `std::collections::HashMap`
--> src/fpgrowth.rs:1:5
|
1 | use std::collections::HashMap;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(unused_imports)]` on by default
warning: unnecessary parentheses around assigned value
--> src/fpgrowth.rs:16:18
|
16 | let minSup = (itemSetList.len() * minSupRatio);
| ^ ^
|
= note: `#[warn(unused_parens)]` on by default
help: remove these parentheses
|
16 - let minSup = (itemSetList.len() * minSupRatio);
16 + let minSup = itemSetList.len() * minSupRatio;
|
warning: unnecessary parentheses around assigned value
--> src/fpgrowth.rs:29:18
|
29 | let minSup = (itemSetList.len() * minSupRatio);
| ^ ^
|
help: remove these parentheses
|
29 - let minSup = (itemSetList.len() * minSupRatio);
29 + let minSup = itemSetList.len() * minSupRatio;
|
error[E0277]: cannot multiply `usize` by `f32`
--> src/fpgrowth.rs:16:37
|
16 | let minSup = (itemSetList.len() * minSupRatio);
| ^ no implementation for `usize * f32`
|
= help: the trait `std::ops::Mul<f32>` is not implemented for `usize`
error[E0308]: mismatched types
--> src/fpgrowth.rs:31:23
|
27 | fn fpgrowthFromFile<T0, T1, T2, RT>(fname: T0, minSupRatio: T1, minConf: T2) -> RT {
| -- this type parameter
...
31 | if fpTree == None {
| _______________________^
32 | | println!("{:?} ", "No frequent item set");
33 | | } else {
| |_____^ expected type parameter `RT`, found `()`
|
= note: expected type parameter `RT`
found unit type `()`
error[E0308]: mismatched types
--> src/fpgrowth.rs:37:16
|
27 | fn fpgrowthFromFile<T0, T1, T2, RT>(fname: T0, minSupRatio: T1, minConf: T2) -> RT {
| -- this type parameter -- expected `RT` because of return type
...
37 | return (freqItems, rules);
| ^^^^^^^^^^^^^^^^^^ expected type parameter `RT`, found tuple
|
= note: expected type parameter `RT`
found tuple `(std::vec::Vec<_>, _)`
Some errors have detailed explanations: E0277, E0308, E0412, E0425, E0432, E0433.
For more information about an error, try `rustc --explain E0277`.
warning: `fpgrowth_rs` (bin "fpgrowth_rs" test) generated 3 warnings
error: could not compile `fpgrowth_rs` due to 21 previous errors; 3 warnings emitted
warning: build failed, waiting for other jobs to finish...
warning: `fpgrowth_rs` (bin "fpgrowth_rs") generated 3 warnings (3 duplicates)
error: build failed
I have yet to inspect these errors and figure out whether they are best fixed before or after running pyrs.
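One of these errors is easy to trace back to the Python side. E0277 (`cannot multiply usize by f32`) is reported for `itemSetList.len() * minSupRatio`, an expression that is fine in Python because an `int` multiplied by a `float` is silently promoted to `float`. A small illustration with made-up values (the data and the ratio are assumptions, not taken from the project):

```python
# Hypothetical inputs, only to show the numeric promotion
item_set_list = [["milk", "bread"], ["beer"], ["milk"]]
min_sup_ratio = 0.5

# len() returns an int; multiplying it by a float yields a float in Python
min_sup = len(item_set_list) * min_sup_ratio
```

Rust has no such implicit conversion, so the translated line needs an explicit cast, for example `itemSetList.len() as f32 * minSupRatio`. Whether that is better fixed in the Python source or in the generated Rust is the open question above.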
Conclusion:
- The experiment was nice to do.
- monkeytype seems promising for inferring types.
- The c2rust-refactor tool might be helpful here as well, but I did not try it: https://github.com/immunant/c2rust/tree/master/c2rust-refactor https://c2rust.com/manual/c2rust-refactor/commands.html
- There is still manual work to do: Python modules like `itertools` are not yet translated to their Rust equivalents.
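The `itertools` point also shows up in the diff: `powerset` got the return type `chain`, which is not a Rust type. Presumably monkeytype recorded the runtime class of the returned object. The sketch below uses the usual itertools recipe for such a function; I am assuming the body in fpgrowth_py looks roughly like this, and the sample set is made up:

```python
from itertools import chain, combinations


def powerset(s):
    """All non-empty proper subsets of s, produced lazily."""
    items = list(s)
    return chain.from_iterable(combinations(items, r) for r in range(1, len(items)))


result = powerset({"a", "b", "c"})
# The returned object is an itertools.chain instance, not a list or a set
return_type_name = type(result).__name__
```

So a trace-based tool sees `chain` as the return type, and the Rust side still needs a hand-written replacement, for example a function returning a `Vec` of subsets.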