CodExt

Encode/decode anything.

CodExt is a (Python2-3 compatible) library that extends the native codecs library (namely for adding new custom encodings and character mappings) and provides 120+ new codecs, hence its name combining CODecs EXTension. It also features a guess mode for decoding multiple layers of encoding and CLI tools for convenience.
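To illustrate what "extending the native codecs library" can mean, here is a minimal sketch that registers a custom codec using only the standard `codecs` module. It is not CodExt's own registration mechanism; the codec name `demo_reverse` and the helper names are hypothetical and exist only for this example.

```python
import codecs

# Hypothetical codec name, chosen for this sketch only (not a CodExt codec).
CODEC_NAME = "demo_reverse"


def _reverse(text, errors="strict"):
    # A codec function returns (output, number of input items consumed).
    return text[::-1], len(text)


def _search(name):
    # The codec registry calls this with the normalized encoding name.
    if name == CODEC_NAME:
        return codecs.CodecInfo(encode=_reverse, decode=_reverse, name=CODEC_NAME)
    return None


codecs.register(_search)

encoded = codecs.encode("Test string", CODEC_NAME)
decoded = codecs.decode(encoded, CODEC_NAME)
print(encoded)
print(decoded)
```

Once a search function is registered, the codec is reachable through the usual `codecs.encode(...)` and `codecs.decode(...)` calls.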

$ pip install codext

Want to contribute a new codec? Check the documentation first, then PR your new codec.
Want to contribute a new macro? PR your updated version of `macros.json`.

:mag: Demonstrations

Using CodExt from the command line

Using base tools from the command line

Using the unbase command line tool

:computer: Usage (main CLI tool)

$ codext -i test.txt encode dna-1
GTGAGCGGGTATGTGA

$ echo -en "test" | codext encode morse
- . ... -

$ echo -en "test" | codext encode braille
⠞⠑⠎⠞

$ echo -en "test" | codext encode base100
👫👜👪👫

Chaining codecs

$ echo -en "Test string" | codext encode reverse
gnirts tseT

$ echo -en "Test string" | codext encode reverse morse
--. -. .. .-. - ... / - ... . -

$ echo -en "Test string" | codext encode reverse morse dna-2
AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC

$ echo -en "Test string" | codext encode reverse morse dna-2 octal
101107124103101107124103101107124107101107101101101107124103101107124107101107101101101107124107101107124107101107101101101107124107101107124103101107124107101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124124101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124107101107101101101107124103

$ echo -en "AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC" | codext -d dna-2 morse reverse
test string
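The chaining shown above can be sketched in Python as a fold over a list of codec names. This is only an illustration built on standard-library codecs (`hex`, `base64`); the helpers `encode_chain` and `decode_chain` are hypothetical and are not part of CodExt.

```python
import codecs
from functools import reduce


def encode_chain(data, *names):
    # Apply each codec in order, feeding the output of one into the next.
    return reduce(codecs.encode, names, data)


def decode_chain(data, *names):
    # Undo the layers in reverse order of encoding.
    return reduce(codecs.decode, reversed(names), data)


layered = encode_chain(b"Test string", "hex", "base64")
restored = decode_chain(layered, "hex", "base64")
print(layered)
print(restored)
```

Note that `decode_chain` takes the names in encoding order and reverses them itself, whereas the command line example above lists the codecs in reversed order explicitly.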

Using macros

$ codext add-macro my-encoding-chain gzip base63 lzma base64

$ codext list macros
example-macro, my-encoding-chain

$ echo -en "Test string" | codext encode my-encoding-chain
CQQFAF0AAIAAABuTgySPa7WaZC5Sunt6FS0ko71BdrYE8zHqg91qaqadZIR2LafUzpeYDBalvE///ug4AA==

$ codext remove-macro my-encoding-chain

$ codext list macros
example-macro
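Conceptually, a macro is a name bound to a chain of codecs. The sketch below models that idea with a plain dictionary and standard-library codecs (`zlib`, `base64`); the function names and the JSON layout are assumptions made for illustration and do not describe the actual format of `macros.json`.

```python
import codecs
import json

# Hypothetical in-memory macro store: macro name -> ordered list of codec names.
macros = {}


def add_macro(name, *chain):
    macros[name] = list(chain)


def remove_macro(name):
    macros.pop(name, None)


def encode(data, name):
    # Expand a macro into its chain; fall back to treating the name as a codec.
    for codec in macros.get(name, [name]):
        data = codecs.encode(data, codec)
    return data


add_macro("my-encoding-chain", "zlib", "base64")
packed = encode(b"Test string", "my-encoding-chain")
serialized = json.dumps(macros)  # one possible way to persist the macros
print(packed)
print(serialized)
```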

:computer: Usage (base CLI tool)

$ echo "Test string !" | base122
*.7!ft9�-f9Â

$ echo "Test string !" | base91 
"ONK;WDZM%Z%xE7L

$ echo "Test string !" | base91 | base85
B2P|BJ6A+nO(j|-cttl%

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr
QVx5tvgjvCAkXaMSuKoQmCnjeCV1YyyR3WErUUErFf

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | base58-flickr -d | base36 -d | base85 -d | base91 -d
Test string !

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | unbase -m 3
Test string !

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | unbase -f Test
Test string !
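The two `unbase` calls above suggest a search over candidate decodings, bounded by a depth (assumed here to be what `-m 3` does) or stopped when a pattern is found (assumed for `-f Test`). The following is a rough sketch of such a search using only the standard `base64` module; it is not CodExt's guessing algorithm, and `guess_decode` is a hypothetical name.

```python
import base64
from collections import deque

# Candidate decoders for this sketch; each raises ValueError on invalid input.
DECODERS = {
    "base16": base64.b16decode,
    "base32": base64.b32decode,
    "base64": lambda d: base64.b64decode(d, validate=True),
}


def guess_decode(data, stop, max_depth=3):
    # Breadth-first search over decoder chains, up to max_depth layers.
    queue = deque([(data, [])])
    while queue:
        current, path = queue.popleft()
        if stop(current):
            return current, path
        if len(path) >= max_depth:
            continue
        for name, decoder in DECODERS.items():
            try:
                decoded = decoder(current)
            except ValueError:
                continue  # this decoder does not apply to the current data
            if decoded:
                queue.append((decoded, path + [name]))
    return None


payload = base64.b16encode(base64.b32encode(base64.b64encode(b"Test string !")))
result = guess_decode(payload, lambda d: b"Test" in d)
print(result)
```

A real guess mode would also need some scoring of intermediate results to prune implausible branches; this sketch simply tries every decoder that does not raise an error.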

:computer: Usage (Python)

Getting the list of available codecs:

>>> import codecs
>>> import codext

>>> codext.list()
['ascii85', 'base85', 'base100', 'base122', ..., 'tomtom', 'dna', 'html', 'markdown', 'url', 'resistor', 'sms', 'whitespace', 'whitespace-after-before']

>>> codext.encode("this is a test", "base58-bitcoin")
'jo91waLQA1NNeBmZKUF'

>>> codext.encode("this is a test", "base58-ripple")
'jo9rA2LQwr44eBmZK7E'

>>> codext.encode("this is a test", "base58-url")
'JN91Wzkpa1nnDbLyjtf'

>>> codecs.encode("this is a test", "base100")
'👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫'

>>> codecs.decode("👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫", "base100")
'this is a test'

>>> for i in range(8):
        print(codext.encode("this is a test", "dna-%d" % (i + 1)))
GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA
CTCACGGACGGCCTATAGAACGGCCTATAGAACGACAGAACTCACGCCCTATCTCA
ACAGATTGATTAACGCGTGGATTAACGCGTGGATGAGTGGACAGATAAACGCACAG
AGACATTCATTAAGCGCTCCATTAAGCGCTCCATCACTCCAGACATAAAGCGAGAC
TCTGTAAGTAATTCGCGAGGTAATTCGCGAGGTAGTGAGGTCTGTATTTCGCTCTG
TGTCTAACTAATTGCGCACCTAATTGCGCACCTACTCACCTGTCTATTTGCGTGTC
GAGTGCCTGCCGGATATCTTGCCGGATATCTTGCTGTCTTGAGTGCGGGATAGAGT
CACTCGGTCGGCCATATGTTCGGCCATATGTTCGTCTGTTCACTCGCCCATACACT
>>> codext.decode("GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA", "dna-1")
'this is a test'
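The `dna-1` output above is consistent with a mapping of bit pairs to nucleotides (00→A, 01→G, 10→C, 11→T). The sketch below implements that mapping, which is inferred from the sample output rather than taken from CodExt's source; the function names are hypothetical and the other seven rules are not covered.

```python
# Bit-pair mapping inferred from the dna-1 sample output (an assumption).
DNA1 = {"00": "A", "01": "G", "10": "C", "11": "T"}
DNA1_REV = {v: k for k, v in DNA1.items()}


def dna1_encode(text):
    # Turn each byte into 8 bits, then map every pair of bits to a letter.
    bits = "".join(format(b, "08b") for b in text.encode("latin-1"))
    return "".join(DNA1[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna1_decode(seq):
    # Map letters back to bit pairs, then regroup the bits into bytes.
    bits = "".join(DNA1_REV[c] for c in seq)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("latin-1")


print(dna1_encode("test"))
print(dna1_decode(dna1_encode("this is a test")))
```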

>>> codecs.encode("this is a test", "morse")
'- .... .. ... / .. ... / .- / - . ... -'

>>> codecs.decode("- .... .. ... / .. ... / .- / - . ... -", "morse")
'this is a test'
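For readers curious about the output format, the following sketch reproduces the convention visible above: letters separated by a space and words separated by " / ". It only handles the letters a to z and is not CodExt's morse codec; the names `morse_encode` and `morse_decode` are hypothetical.

```python
# International Morse code for the letters a-z only (digits and punctuation omitted).
MORSE = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".", "f": "..-.",
    "g": "--.", "h": "....", "i": "..", "j": ".---", "k": "-.-", "l": ".-..",
    "m": "--", "n": "-.", "o": "---", "p": ".--.", "q": "--.-", "r": ".-.",
    "s": "...", "t": "-", "u": "..-", "v": "...-", "w": ".--", "x": "-..-",
    "y": "-.--", "z": "--..",
}
MORSE_REV = {v: k for k, v in MORSE.items()}


def morse_encode(text):
    # Case is dropped; letters are joined with " " and words with " / ".
    words = text.lower().split(" ")
    return " / ".join(" ".join(MORSE[c] for c in word) for word in words)


def morse_decode(code):
    words = code.split(" / ")
    return " ".join("".join(MORSE_REV[s] for s in w.split(" ")) for w in words)


print(morse_encode("this is a test"))
print(morse_decode("- .... .. ... / .. ... / .- / - . ... -"))
```

Because case is dropped on encoding, decoding always yields lowercase text, which matches the `test string` result of the chained command line example.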

>>> with open("morse.txt", 'w', encoding="morse") as f:
	f.write("this is a test")
14

>>> with open("morse.txt",encoding="morse") as f:
	f.read()
'this is a test'

>>> codext.decode("""
      =            
              X         
   :            
      x         
  n  
    r 
        y   
      Y            
              y        
     p    
         a       
 `          
            n            
          |    
  a          
o    
       h        
          `            
          g               
           o 
   z      """, "whitespace-after+before")
'CSC{not_so_invisible}'

>>> print(codext.encode("An example test string", "baudot-tape"))
***.**
   . *
***.* 
*  .  
   .* 
*  .* 
   . *
** .* 
***.**
** .**
   .* 
*  .  
* *. *
   .* 
* *.  
* *. *
*  .  
* *.  
* *. *
***.  
  *.* 
***.* 
 * .*

:page_with_curl: List of codecs

BaseXX

[X] base1: useless, but for the sake of completeness
[X] base2: simple conversion to binary (with a variant with a reversed alphabet)
[X] base3: conversion to ternary (with a variant with a reversed alphabet)
[X] base4: conversion to quaternary (with a variant with a reversed alphabet)
[X] base8: simple conversion to octal (with a variant with a reversed alphabet)
[X] base10: simple conversion to decimal
[X] base11: conversion to digits plus the letter "a"
[X] base16: simple conversion to hexadecimal (with a variant holding an alphabet with digits and letters inverted)
[X] base26: conversion to alphabet letters
[X] base32: classical conversion according to the RFC4648 with all its variants (zbase32, extended hexadecimal, geohash, Crockford)
[X] base36: Base36 conversion to letters and digits (with a variant inverting both groups)
[X] base45: Base45 DRAFT algorithm (with a variant inverting letters and digits)
[X] base58: multiple versions of Base58 (bitcoin, flickr, ripple)
[X] base62: Base62 conversion to lower- and uppercase letters and digits (with a variant with letters and digits inverted)
[X] base63: similar to base62 with the "_" added
[X] base64: classical conversion according to RFC4648 with its variant URL (or file) (it also holds a variant with letters and digits inverted)
[X] base67: custom conversion using some more special characters (also with a variant with letters and digits inverted)
[X] base85: all variants of Base85 (Ascii85, z85, Adobe, (x)btoa, RFC1924, XML)
[X] base91: Base91 custom conversion
[X] base100 (or emoji): Base100 custom conversion
[X] base122: Base122 custom conversion
[X] base-genericN: see base encodings ; supports any possible base

This category also provides ascii85, adobe, [x]btoa and zeromq as variants of the base85 codec.

Binary

[X] baudot: supports CCITT-1, CCITT-2, EU/FR, ITA1, ITA2, MTK-2 (Python3 only), UK, ...
[X] baudot-spaced: variant of baudot ; groups of 5 bits are whitespace-separated
[X] baudot-tape: variant of baudot ; outputs a string that looks like a perforated tape
[X] bcd: Binary Coded Decimal, encodes characters from their (zero-left-padded) ordinals
[X] bcd-extended0: variant of bcd ; encodes characters from their (zero-left-padded) ordinals using prefix bits 0000
[X] bcd-extended1: variant of bcd ; encodes characters from their (zero-left-padded) ordinals using prefix bits 1111
[X] excess3: uses Excess-3 (aka Stibitz code) binary encoding to convert characters from their ordinals
[X] gray: aka reflected binary code
[X] manchester: XORes each bit of the input with 01
[X] manchester-inverted: variant of manchester ; XORes each bit of the input with 10
[X] rotateN: rotates characters by the specified number of bits (N belongs to [1, 7] ; Python 3 only)
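As an illustration of the manchester entries above, here is a minimal sketch in which every input bit is XORed with the pattern 01 (or 10 for the inverted variant), so each bit expands to two bits. Returning a string of "0" and "1" characters is an assumption made for readability; CodExt's actual output format is not shown here, and the function name is hypothetical.

```python
def manchester(data, inverted=False):
    # XOR every input bit with both bits of the pattern: 01 (or 10 if inverted).
    pattern = "10" if inverted else "01"
    bits = "".join(format(byte, "08b") for byte in data)
    out = []
    for bit in bits:
        out.append("".join(str(int(bit) ^ int(p)) for p in pattern))
    return "".join(out)


# b"A" is 01000001 in binary; each bit becomes a pair of bits.
print(manchester(b"A"))
print(manchester(b"A", inverted=True))
```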

Common

[X] a1z26: keeps words whitespace-separated and uses a custom character separator
[X] cases: set of case-related encodings (including camel-, kebab-, lower-, pascal-, upper-, snake- and swap-case, slugify, capitalize, title)
[X] dummy: set of simple encodings (including integer, replace, reverse, word-reverse, substitute and strip-spaces)
[X] octal: dummy octal conversion (converts to 3-digits groups)
[X] octal-spaced: variant of octal ; dummy octal conversion, handling whitespace separators
[X] ordinal: dummy character ordinals conversion (converts to 3-digits groups)
[X] ordinal-spaced: variant of ordinal ; dummy character ordinals conversion, handling whitespace separators

Compression

[X] gzip: standard Gzip compression/decompression
[X] lz77: compresses the given data with the algorithm of Lempel and Ziv of 1977
[X] lz78: compresses the given data with the algorithm of Lempel and Ziv of 1978
[X] pkzip_deflate: standard Zip-deflate compression/decompression
[X] pkzip_bzip2: standard BZip2 compression/decompression
[X] pkzip_lzma: standard LZMA compression/decompression

:warning: Compression functions are of course definitely NOT encoding functions ; they are implemented for leveraging the .encode(...) API from codecs.

Cryptography

[X] affine: aka Affine Cipher
[X] atbash: aka Atbash Cipher
[X] bacon: aka Baconian Cipher
[X] barbie-N: aka Barbie Typewriter (N belongs to [1, 4])
[X] citrix: aka Citrix CTX1 password encoding
[X] railfence: aka Rail Fence Cipher
[X] rotN: aka Caesar cipher (N belongs to [1,25])
[X] scytaleN: encrypts using the number of letters on the rod (N belongs to [1,∞[)
[X] shiftN: shift ordinals (N belongs to [1,255])
[X] xorN: XOR with a single byte (N belongs to [1,255])

:warning: Crypto functions are of course definitely NOT encoding functions ; they are implemented for leveraging the .encode(...) API from codecs.
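To make the rotN and xorN entries concrete, the sketch below shows a Caesar shift over ASCII letters and a single-byte XOR. These are textbook versions written for illustration, not CodExt's implementations; `rot` and `xor` are hypothetical names.

```python
import codecs


def rot(text, n):
    # Shift ASCII letters by n positions, leaving other characters untouched.
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - 97 + n) % 26 + 97))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - 65 + n) % 26 + 65))
        else:
            out.append(ch)
    return "".join(out)


def xor(data, key):
    # XOR every byte with the same single-byte key; applying it twice undoes it.
    return bytes(b ^ key for b in data)


# Sanity check against the standard library's rot13 codec.
print(rot("Test string", 13) == codecs.encode("Test string", "rot13"))
print(xor(xor(b"Test string", 42), 42))
```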

Hashing

[X] blake: includes BLAKE2b and BLAKE2s (Python 3 only ; relies on hashlib)
[X] checksums: includes Adler32 and CRC32 (relies on zlib)
[X] crypt: Unix's crypt hash for passwords (Python 3 and Unix only ; relies on crypt)
[X] md: aka Message Digest ; includes MD4 and MD5 (relies on hashlib)
[X] sha: aka Secure Hash Algorithms ; includes SHA1, 224, 256, 384, 512 (Python2/3) but also SHA3-224, -256, -384 and -512 (Python 3 only ; relies on hashlib)
[X] shake: aka SHAKE hashing (Python 3 only ; relies on hashlib)

:warning: Hash functions are of course definitely NOT encoding functions ; they are implemented for convenience with the .encode(...) API from codecs and are useful for chaining codecs.

Languages

[X] braille: well-known braille language (Python 3 only)
[X] ipsum: aka lorem ipsum
[X] galactic: aka galactic alphabet or Minecraft enchantment language (Python 3 only)
[X] leetspeak: based on minimalistic elite speaking rules
[X] morse: uses whitespace as a separator
[X] navajo: only handles letters (not full words from the Navajo dictionary)
[X] radio: aka NATO or radio phonetic alphabet
[X] southpark: converts letters to Kenny's language from Southpark (whitespace is also handled)
[X] southpark-icase: case insensitive variant of southpark
[X] tap: converts text to tap/knock code, commonly used by prisoners
[X] tomtom: similar to morse, using slashes and backslashes

Others

[X] dna: implements the 8 rules of DNA sequences (N belongs to [1,8])
[X] letter-indices: encodes consonants and/or vowels with their corresponding indices
[X] markdown: unidirectional encoding from Markdown to HTML

Steganography

[X] hexagram: uses Base64 and encodes the result to a charset of I Ching hexagrams (as implemented here)
[X] klopf: aka Klopf code ; Polybius square with trivial alphabetical distribution
[X] resistor: aka resistor color codes
[X] rick: aka Rick cipher (in reference to Rick Astley's song "Never gonna give you up")
[X] sms: also called T9 code ; uses "-" as a separator for encoding, "-" or "_" or whitespace for decoding
[X] whitespace: replaces bits with whitespaces and tabs
[X] whitespace_after_before: variant of whitespace ; encodes characters as new characters with whitespaces before and after according to an equation described in the codec name (e.g. "whitespace+2*after-3*before")

Web

[X] html: implements entities according to this reference
[X] url: aka URL encoding
