Skyfall
A Ruby gem for streaming data from the Bluesky/AtProto firehose 🦋
[!NOTE] ATProto Ruby gems collection: skyfall | blue_factory | minisky | didkit
What does it do
Skyfall is a Ruby library for connecting to the "firehose" of the Bluesky social network, i.e. a websocket which streams all new posts and everything else happening on the Bluesky network in real time. The code connects to the websocket endpoint, decodes the messages, which are encoded in binary formats like DAG-CBOR, and returns the data as Ruby objects, which you can filter and save to a database (e.g. in order to create a custom feed).
Installation
gem install skyfall
Usage
Start a connection to the firehose by creating a Skyfall::Stream object, passing the server hostname and endpoint name:
require 'skyfall'
sky = Skyfall::Stream.new('bsky.network', :subscribe_repos)
Add event listeners to handle incoming messages and get notified of errors:
sky.on_connect { puts "Connected" }
sky.on_disconnect { puts "Disconnected" }
sky.on_message { |m| p m }
sky.on_error { |e| puts "ERROR: #{e}" }
When you're ready, open the connection by calling connect:
sky.connect
Processing messages
Each message passed to on_message is an instance of a subclass of WebsocketMessage, depending on the message type. The supported message types are:
CommitMessage (#commit) - represents a change in a user's repo; most messages are of this type
HandleMessage (#handle) - when a different handle is assigned to a user's DID
TombstoneMessage (#tombstone) - when an account is deleted
InfoMessage (#info) - a protocol error message, e.g. about an invalid cursor parameter
UnknownMessage is used for other unrecognized message types
All message objects have the following properties:
type (symbol) - the message type identifier, e.g. :commit
seq (integer) - a sequential index of the message
repo or did (string) - DID of the repository (user account)
time (Time) - timestamp of the described action
All properties except type may be nil for some message types that aren't related to a specific user, like #info.
Commit messages additionally have:
commit - CID of the commit
prev - CID of the previous commit in that repo
operations - list of operations (usually one)
Handle messages additionally have:
handle - the new handle assigned to the DID
Info messages additionally have:
name - identifier of the message/error
message - a human-readable description
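As a rough sketch of how a handler might branch on the message types listed above, the snippet below uses a plain Struct as a stand-in for Skyfall's message classes, so it runs without a network connection. `FakeMessage`, `describe_message` and the sample values are made up for illustration; only the property names come from the lists above.

```ruby
# Stand-in for Skyfall message objects (an assumption for this sketch, not the
# gem's own classes). It models only the properties listed above.
FakeMessage = Struct.new(:type, :seq, :repo, :time, :operations, :handle, :name, :message,
                         keyword_init: true)

# Hypothetical helper: returns a one-line summary, branching on the type symbol.
def describe_message(msg)
  case msg.type
  when :commit
    "seq #{msg.seq}: commit in #{msg.repo} with #{msg.operations.length} operation(s)"
  when :handle
    "seq #{msg.seq}: #{msg.repo} now uses the handle #{msg.handle}"
  when :tombstone
    "seq #{msg.seq}: #{msg.repo} was deleted"
  when :info
    # Info messages are not tied to a user, so seq, repo and time may be nil.
    "info #{msg.name}: #{msg.message}"
  else
    "unhandled message type #{msg.type.inspect}"
  end
end

# Sample values below are invented for the demonstration.
info = FakeMessage.new(type: :info, name: 'SampleError', message: 'sample description')
puts describe_message(info)
```

With a real stream, the same helper could presumably be called from the listener, e.g. `sky.on_message { |m| puts describe_message(m) }`.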
Commit operations
Operations are objects of type Operation and have the following properties:
repo or did (string) - DID of the repository (user account)
collection (string) - name of the relevant collection in the repository, e.g. app.bsky.feed.post for posts
type (symbol) - short name of the collection, e.g. :bsky_post
rkey (string) - identifier of a record in a collection
path (string) - the path part of the at:// URI - collection name + ID (rkey) of the item
uri (string) - the complete at:// URI
action (symbol) - :create, :update or :delete
cid - CID of the operation/record (nil for delete operations)
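To illustrate how repo, collection, rkey, path and uri relate to each other, here is a small sketch assuming the usual at://DID/collection/rkey layout. The `at_uri` helper is hypothetical and not part of Skyfall, and the DID and rkey values are invented.

```ruby
# Hypothetical helper (not a Skyfall method): rebuilds the path and the full
# at:// URI from the three pieces that an operation exposes.
def at_uri(repo, collection, rkey)
  path = "#{collection}/#{rkey}" # what the path property is described as holding
  "at://#{repo}/#{path}"         # what the uri property is described as holding
end

# The DID and rkey here are made-up sample values.
puts at_uri('did:plc:example', 'app.bsky.feed.post', '3kabc')
```

In practice you would read `op.uri` directly; the sketch only shows what that string is made of.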
Create and update operations will also have an attached record (JSON object) with the details of the record, such as the text of a post. The record data is currently available as a Ruby hash via the raw_record property (custom types will be added in the future).
So for example, in order to filter only "create post" operations and print their details, you can do something like this:
sky.on_message do |m|
  next if m.type != :commit

  m.operations.each do |op|
    next unless op.action == :create && op.type == :bsky_post

    puts "#{op.repo}:"
    puts op.raw_record['text']
    puts
  end
end
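Going one step further towards a custom feed, the sketch below keeps an in-memory set of post URIs whose text matches a pattern, and drops them again when a delete operation arrives. `KeywordFeed`, `FakeOp` and `FakeMsg` are hypothetical names used only here; the Structs stand in for Skyfall's objects so the code runs offline, and a real service would use a database instead of a Set.

```ruby
require 'set'

# Stand-ins for Skyfall's operation and message objects (assumptions for this
# sketch), modelling only the properties that the filter reads.
FakeOp  = Struct.new(:action, :type, :uri, :raw_record, keyword_init: true)
FakeMsg = Struct.new(:type, :operations, keyword_init: true)

# Hypothetical in-memory feed index: remembers URIs of matching posts.
class KeywordFeed
  attr_reader :uris

  def initialize(pattern)
    @pattern = pattern
    @uris = Set.new
  end

  # Accepts any object with the message properties described above.
  def process(msg)
    return unless msg.type == :commit

    msg.operations.each do |op|
      next unless op.type == :bsky_post

      case op.action
      when :create
        text = op.raw_record['text'].to_s
        @uris << op.uri if text.match?(@pattern)
      when :delete
        # Only the URI is needed to forget a post.
        @uris.delete(op.uri)
      end
    end
  end
end

feed = KeywordFeed.new(/ruby/i)
feed.process(FakeMsg.new(type: :commit, operations: [
  FakeOp.new(action: :create, type: :bsky_post,
             uri: 'at://did:plc:example/app.bsky.feed.post/1',
             raw_record: { 'text' => 'Learning Ruby today' })
]))
p feed.uris.to_a
```

Hooked up to a live stream, this would presumably be `sky.on_message { |m| feed.process(m) }`.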
For more examples, see the example folder or the bluesky-feeds-rb project, which implements a feed generator service.
Custom lexicons
A note on custom lexicons: the Skyfall::Operation objects have two properties that tell you the kind of record they're about: #collection, which is a string containing the official name of the collection/lexicon, e.g. "app.bsky.feed.post"; and #type, which is a symbol meant to save you some typing, e.g. :bsky_post.
When Skyfall receives a message about a record type that's not on its list of known types, whether in the app.bsky namespace or not, the operation type will be :unknown, while the collection will be the original string. So if an app called e.g. "Skygram" appears with a zz.skygram.* namespace that lets you share photos on ATProto, its operations will have the type :unknown and collection names like zz.skygram.feed.photo. You can then check the collection field for record types known to you and process them in some appropriate way, even if Skyfall doesn't recognize the record type.
However, do not first check whether such operations have a type equal to :unknown; ignore the type and check only the collection string. The reason is that a future version of Skyfall might start recognizing those records and add a new type value for them, e.g. :skygram_photo, and then they won't match your condition anymore.
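A minimal sketch of that advice, using the made-up "Skygram" collection from the text. `FakeOperation` is a Struct standing in for Skyfall::Operation, and `skygram_photo?` is a hypothetical helper.

```ruby
# Stand-in for Skyfall::Operation (an assumption for this sketch), with only
# the properties relevant to the check.
FakeOperation = Struct.new(:collection, :type, :action, keyword_init: true)

# Hypothetical collection name taken from the "Skygram" example in the text.
SKYGRAM_PHOTO = 'zz.skygram.feed.photo'

# Matches on the collection string only and deliberately ignores op.type, so
# the check keeps working if a later release assigns these records a type.
def skygram_photo?(op)
  op.collection == SKYGRAM_PHOTO
end

today  = FakeOperation.new(collection: SKYGRAM_PHOTO, type: :unknown, action: :create)
future = FakeOperation.new(collection: SKYGRAM_PHOTO, type: :skygram_photo, action: :create)

puts skygram_photo?(today)  # prints true
puts skygram_photo?(future) # prints true
```

A condition like `op.type == :unknown && op.collection == SKYGRAM_PHOTO` would match `today` but not `future`, which is the breakage described above.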
Credits
Copyright © 2023 Kuba Suder (@mackuba.eu).
The code is available under the terms of the zlib license (permissive, similar to MIT).
Bug reports and pull requests are welcome 😎