doogle
doogle copied to clipboard
Web search of the people, by the people, for the people with Go.
doogle
Web search of the people, by the people, for the people with Go.
doogle is a Proof of Concept software of decentralized search engine based on gRPC written in Go.
This is just a PoC, and so I am not serious about making doogle secure, scalable, usable in production, etc.
algorithms behind doogle
1. Distributed Hash Table based on S/Kademlia
Baumgart, Ingmar, and Sebastian Mies. "S/kademlia: A practicable approach towards secure key-based routing." Parallel and Distributed Systems, 2007 International Conference on. IEEE, 2007.
2. local estimation of PageRank with WorldNode
Parreira, Josiane Xavier, et al. "Efficient and decentralized pagerank approximation in a peer-to-peer web search network." Proceedings of the 32nd international conference on Very large data bases. VLDB Endowment, 2006.
development
build and test
❯ go get -u -d github.com/mathetake/doogle
❯ cd $GOPATH/src/github.com/mathetake/doogle
❯ go build .
❯ go test -v -race ./...
start node
❯ ./doogle --help
Usage of ./doogle:
-c int
crawler's channel capacity
-d int
difficulty for cryptographic puzzle
-p string
port for node
-w int
number of crawler's worker
❯ ./doogle -c 4 -d 1 -p :12312 -w 4
INFO[0000] node created: doogleAddress=ad97676370397f6eb23dc165a34bf74a9c11d243
INFO[0000] crawler is ready
INFO[0000] node listen on port: :12312, num of crawler's worker: 0
INFO[0000] difficulty: 1, crawler's queue capacity: 4
You can connect to the node with, for example, grpcc:
❯ grpcc --proto grpc/doogle.proto --address localhost:12312 --insecure
Connecting to doogle.Doogle on localhost:12312. Available globals:
client - the client connection to Doogle
storeItem (StoreItemRequest, callback) returns Empty
findIndex (FindIndexRequest, callback) returns FindIndexReply
findNode (FindNodeRequest, callback) returns NodeInfos
pingWithCertificate (NodeCertificate, callback) returns NodeCertificate
ping (StringMessage, callback) returns StringMessage
pingTo (NodeInfo, callback) returns StringMessage
getIndex (StringMessage, callback) returns GetIndexReply
postUrl (StringMessage, callback) returns StringMessage
printReply - function to easily print a unary call reply (alias: pr)
streamReply - function to easily print stream call replies (alias: sr)
createMetadata - convert JS objects into grpc metadata instances (alias: cm)
printMetadata - function to easily print a unary call's metadata (alias: pm)
and then call Ping
:
Doogle@localhost:12312> client.ping({ message: 'ping' }, printReply)
EventEmitter {}
Doogle@localhost:12312>
{
"message": "pong"
}
start node using docker
docker run -p 12312:8080 -it doogle bash -c "./doogle -d 2 -p :8080"
modify interface
install protc, then run:
protoc -I grpc/ grpc/doogle.proto --go_out=plugins=grpc:grpc
References
see my survey: Towards decentralized information retrieval: research papers
Also there are two articles in Japanese:
Author
LICENSE
MIT