
Simple Go implementation of the Porter Stemmer algorithm with powerful features.

gwizo


Package gwizo implements the Porter Stemmer algorithm: Porter, M. "An algorithm for suffix stripping." Program 14.3 (1980): 130-137. Martin Porter, the algorithm's inventor, maintains a web page about the algorithm at http://www.tartarus.org/~martin/PorterStemmer/

Installation

To install, simply run in a terminal:

go get github.com/kampsy/gwizo

Stem

Stem returns the stem of a word.

package main

import (
  "fmt"
  "github.com/kampsy/gwizo"
)

func main() {
  stem := gwizo.Stem("abilities")
  fmt.Printf("Stem: %s\n", stem)
}
$ go run main.go

Stem: able

Vowels, Consonants and Measure

gwizo.Parse returns a Token, which has two fields: VowCon, the vowel-consonant pattern of the word, and Measure, the value m in Porter's form [C](VC){m}[V].

package main

import (
  "fmt"
  "strings"

  "github.com/kampsy/gwizo"
)

func main() {
  word := "abilities"
  token := gwizo.Parse(word)

  // VowCon
  fmt.Printf("%s has Pattern %s \n", word, token.VowCon)

  // Measure value m in [C](VC){m}[V]
  fmt.Printf("%s has Measure value %d \n", word, token.Measure)

  // Number of Vowels
  v := strings.Count(token.VowCon, "v")
  fmt.Printf("%s Has %d Vowels \n", word, v)

  // Number of Consonants
  c := strings.Count(token.VowCon, "c")
  fmt.Printf("%s Has %d Consonants\n", word, c)
}
$ go run main.go

abilities has Pattern vcvcvcvvc
abilities has Measure value 4
abilities Has 5 Vowels
abilities Has 4 Consonants
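
To make the pattern and the measure concrete, here is a minimal standalone sketch that derives both using only the standard library. It is not gwizo's implementation: the helper names isConsonant, pattern, and measure are hypothetical, and it assumes lowercase ASCII input. It follows Porter's definitions, where a consonant is any letter other than a, e, i, o, u and other than a y that follows a consonant, and the measure m is the number of VC groups in [C](VC){m}[V].

```go
package main

import (
	"fmt"
	"strings"
)

// isConsonant reports whether the letter at index i of word is a consonant
// under Porter's definition. Hypothetical helper, not part of gwizo.
func isConsonant(word string, i int) bool {
	switch word[i] {
	case 'a', 'e', 'i', 'o', 'u':
		return false
	case 'y':
		// A leading 'y' is a consonant; elsewhere 'y' is a vowel
		// exactly when the previous letter is a consonant.
		if i == 0 {
			return true
		}
		return !isConsonant(word, i-1)
	}
	return true
}

// pattern maps each letter of a lowercase ASCII word to 'v' or 'c'.
func pattern(word string) string {
	var b strings.Builder
	for i := 0; i < len(word); i++ {
		if isConsonant(word, i) {
			b.WriteByte('c')
		} else {
			b.WriteByte('v')
		}
	}
	return b.String()
}

// measure counts the places where a vowel is directly followed by a
// consonant, which equals m in [C](VC){m}[V].
func measure(pat string) int {
	m := 0
	for i := 1; i < len(pat); i++ {
		if pat[i-1] == 'v' && pat[i] == 'c' {
			m++
		}
	}
	return m
}

func main() {
	for _, w := range []string{"tree", "trouble", "abilities"} {
		p := pattern(w)
		fmt.Printf("%s %s %d\n", w, p, measure(p))
	}
}
```

For "abilities" this sketch yields the pattern vcvcvcvvc and a measure of 4, matching the output shown above.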

File Stem Performance

Stem every line of input.txt, write the results to stem.txt, and report the elapsed time.

package main

import (
  "bufio"
  "bytes"
  "fmt"
  "os"
  "time"

  "github.com/kampsy/gwizo"
)

func main() {
  curr := time.Now()
  writeOut()
  elaps := time.Since(curr)
  fmt.Println("============================")
  fmt.Println("Done After:", elaps)
  fmt.Println("============================")
}

// writeOut stems each line of input.txt and writes the results to stem.txt.
func writeOut() {
  re, err := os.ReadFile("input.txt")
  if err != nil {
    fmt.Println(err)
    return
  }

  // Scan the file contents one line (one word) at a time.
  scanner := bufio.NewScanner(bytes.NewReader(re))
  out, err := os.Create("stem.txt")
  if err != nil {
    fmt.Println(err)
    return
  }
  defer out.Close()
  for scanner.Scan() {
    txt := scanner.Text()
    stem := gwizo.Stem(txt)
    fmt.Fprintf(out, "%s\n", stem)
    fmt.Println(txt, "--->", stem)
  }
  if err := scanner.Err(); err != nil {
    fmt.Println(err)
  }
}
$ go run main.go