Implement WASAPI for importing WARCs from other sources
https://github.com/unt-libraries/py-wasapi-client
The Web Archiving Systems API (WASAPI) is a standard means for "shipping" WARCs (and other metadata) to and fro.
WAIL should be able to import wasapi_client, per https://github.com/unt-libraries/py-wasapi-client/issues/19
Caveat: I think WAIL has largely been tested as a Py2 build due to previous limitations in PyInstaller, but this may no longer be a restriction (py-wasapi-client requires Py3.4+).
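A minimal sketch of how the import could be guarded so that a Py2 build keeps working without WASAPI support. The helper name `load_wasapi_client` is hypothetical; only the module name `wasapi_client` and the 3.4 floor come from the notes above.

```python
import importlib
import sys

# Minimum interpreter version stated for py-wasapi-client.
MIN_PY = (3, 4)


def load_wasapi_client(version_info=sys.version_info):
    """Return the wasapi_client module, or None if it cannot be used.

    Hypothetical helper: callers would disable the WASAPI import UI
    when this returns None.
    """
    if tuple(version_info[:2]) < MIN_PY:
        return None
    try:
        return importlib.import_module("wasapi_client")
    except ImportError:
        # Package not installed (or not bundled into the build).
        return None
```

Passing `version_info` explicitly makes the version check testable without switching interpreters.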
Archive-It implements WASAPI. Other places?
Examples:
- https://docs.google.com/presentation/d/1lAjeNmnnJb_lLYofqR-ZlqcqxKZ_ithQ57vCPWdPFt4/edit#slide=id.g1e4a647d12_0_0
- https://github.com/archivesunleashed/auk implements WARC import from Archive-It.
More info:
- https://github.com/WASAPI-Community/data-transfer-apis/tree/master/ait-specification

Archive-It requires auth before any data is returned.
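As a rough sketch of what an authenticated query might look like using only the standard library: the endpoint URL, the use of HTTP Basic auth, and the helper names `build_webdata_request` and `warc_locations` are assumptions for illustration and should be checked against the specification linked above. The `files`, `filename`, and `locations` keys follow my reading of the WASAPI response format.

```python
import base64
import urllib.parse
import urllib.request

# Assumed Archive-It endpoint; verify against the ait-specification.
AIT_WEBDATA = "https://partner.archive-it.org/wasapi/v1/webdata"


def build_webdata_request(user, password, base_url=AIT_WEBDATA, **params):
    """Build (but do not send) a webdata query carrying Basic auth."""
    query = urllib.parse.urlencode(params)
    url = base_url + ("?" + query if query else "")
    raw = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        url, headers={"Authorization": "Basic " + token}
    )


def warc_locations(page):
    """Yield (filename, first location) pairs from one decoded JSON page."""
    for entry in page.get("files", []):
        locations = entry.get("locations") or []
        if locations:
            yield entry["filename"], locations[0]
```

The request would then be sent with `urllib.request.urlopen(req)` and the body decoded with `json.load` before being handed to `warc_locations`; pagination and retries are left out here.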
Crawl data (e.g., configs) could also be imported.
More details on Webrecorder's new WASAPI endpoint at https://github.com/oduwsdl/ipwb/issues/524.
I have two integrations in a separate repo and added native GUI elements in c2de004 rather than relying on a browser-based interface. Here is the separate, hacky webview example for posterity (adapted from a Stack Overflow example):
```python
import wx
import wx.html2


class WebView(wx.Dialog):
    def __init__(self, *args, **kwds):
        wx.Dialog.__init__(self, *args, **kwds)
        sizer = wx.BoxSizer(wx.VERTICAL)
        # Embed the platform's native browser widget in the dialog.
        self.browser = wx.html2.WebView.New(self)
        sizer.Add(self.browser, 1, wx.EXPAND, 10)
        self.SetSizer(sizer)
        self.SetSize((1000, 1000))


if __name__ == '__main__':
    app = wx.App()
    wv = WebView(None, -1)
    # Point the embedded browser at the locally running service.
    wv.browser.LoadURL("http://0.0.0.0:23119/")
    wv.Show()
    app.MainLoop()
```