Vyxal
Vyxal Corpus
Similar to this issue opened in the Jelly repo and this issue opened in the Husk repo, I decided to run Lynn's method on Vyxal answers (with a few modifications).
SEDE query used to get all Vyxal answers on CGCC. Code used to analyze the data. Final results
Results as of January 29, 2023:
2-graphs:
41 \|
27 (n
27 \_
25 \\
24 +,
24 =;
23 ,)
22
22 ka
22 `
22 ;f
21 *\
20 :£
20 ?(
19 øm
19 `\
19 ;h
19 ++
18 =[
18 `|
18 \/
17 ,
17 2ẇ
17 vL
17 2l
16 *+
16 ÞT
16 12
16 ƛ?
16 ‛
3-graphs:
11 \_*
10 `:q
9 *\|
8 :qp
7 ĠvL
7 +,)
7 \|+
6 \\`
6 \|:
6 #
6 ma
6 2lv
5 ;ȯt
5 »₄τ
5 ++,
5 ⁰nε
5 ⇧\_
5 ð*\
5 \\꘍
5 \++
5 `|
5 ?(:
5 #
5 # m
5 mai
5 ain
5 in
5 n p
5 pr
5 pro
4-graphs:
8 `:qp
5 ⇧\_*
5 #
5 # m
5 # ma
5 mai
5 main
5 ain
5 in p
5 n pr
5 pro
5 prog
5 rogr
5 ogra
5 gram
5 2lvƒ
4 ʀʁɾɽ
4 :q$+
4 *\|+
4 \|:„
4 |:„+
4 :„+p
4 „+p,
4 *ðp,
4 :qpq
3 ‛, j
3 t's
3 I'm
3 I'm
3 1234
Notes:
- All the single characters are ASCII-art related
- `+,`, `,p,`, and `++` are two common operations joined together
- `øm` and `ÞT` are deprecated in favour of 1-byte alternatives
- `ð*` is now `I`
- `v∑` is now `Ṡ`
- `$i` is rarely required anymore as `i` now indexes the number into the list/string
- `(n`, `?(` and `=[` are just generally useful operations
- `\|`, `\\` and `\/` look like useful enough nilads
- `⁰nε` is absolute difference with input and context variable. or string joining. why?
import csv
import collections

digraphs = collections.Counter()
trigraphs = collections.Counter()
quadgraphs = collections.Counter()

with open("QueryResults.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f):
        # Skip the CSV header row
        if row[0] == "Post Link":
            continue
        code = row[1]
        if "<pre><code>" not in code:
            continue
        # Extract the first bit of code
        vyxal = (
            code.partition("<pre><code>")[2]
            .partition("</code></pre>")[0]
            .strip()
        )
        # Decode HTML entities; "&amp;" goes last so "&amp;lt;" ends up as "&lt;"
        vyxal = vyxal.replace("&quot;", '"')
        vyxal = vyxal.replace("&gt;", ">").replace("&lt;", "<")
        vyxal = vyxal.replace("&amp;", "&")
        # Skip programs where any character occurs 10 or more times
        if any(vyxal.count(c) >= 10 for c in vyxal):
            continue
        # Skip long programs
        if len(vyxal) > 100:
            continue
        # Count n-graphs per line so they never span a newline
        for line in vyxal.split("\n"):
            for (a, b) in zip(line, line[1:]):
                digraphs[a, b] += 1
            for (a, b, c) in zip(line, line[1:], line[2:]):
                trigraphs[a, b, c] += 1
            for (a, b, c, d) in zip(line, line[1:], line[2:], line[3:]):
                quadgraphs[a, b, c, d] += 1

# Write the 30 most common n-graphs of each size
with open("most-common.txt", "w", encoding="utf-8") as f:
    f.write("2-graphs:\n")
    for d, n in digraphs.most_common(30):
        f.write("%4d %s\n" % (n, "".join(d)))
    f.write("\n3-graphs:\n")
    for d, n in trigraphs.most_common(30):
        f.write("%4d %s\n" % (n, "".join(d)))
    f.write("\n4-graphs:\n")
    for d, n in quadgraphs.most_common(30):
        f.write("%4d %s\n" % (n, "".join(d)))
Modified corpus that uses the lexer to get least/most common elements. Results:
Least common:
1 X
1 ¢
2 ↑
3 @
3 Ẇ
3 z
3 ė
3 ↓
4 P
4 ⟩
4 ⟨
4 ¼
5 ƈ
5 ḃ
5 ⟇
5 Ż
6 ŀ
6 §
6 ∪
6 ṙ
7 □
7 Ṁ
7 ǒ
7 Ǒ
7 Ȯ
8 }
9 ∇
9 ɖ
9 ⊍
10 ⋎
10 ‟
10 ₆
10 ⋏
10 H
11 !
11 ≥
11 M
11 ⟑
11 ₈
12 ḋ
12 ¶
12 ≤
12 ṫ
12 Ċ
13 ↲
13 ǎ
13 ṁ
13 „
13 ^
13 ∵
14 ċ
14 Ǎ
14 ꜝ
14 ₇
14 q
14 ∴
14 ₅
15 Ǐ
15 Ȧ
16 Ǔ
16 ₄
16 B
17 Ḟ
17 ¤
17 ±
17 ∨
18 ¾
18 F
18 †
18 €
18 Y
19 &
19 ∧
20 ₂
20 ¡
20 ⌐
20 ɽ
20 ġ
20 m
21 ↳
21 β
21 ǔ
21 x
21 ß
22 ↵
22 _
23 o
23 ∞
23 ₁
24 ǐ
24 j
24 ≠
25 ṗ
25 ∷
25 ḣ
26 •
26 r
26 æ
26 √
26 ṡ
27 Z
27 w
28 ż
28 E
28 ⅛
29 V
29 µ
29 ȧ
29 Ė
29 y
29 ₃
30 ẏ
30 ƒ
30 ḭ
31 Ŀ
31 ≬
31 ʀ
31 >
31 ⇩
31 O
31 ×
31 ₍
32 <
33 ∩
33 u
34 ḟ
34 Ġ
34 K
35 a
35 Ḋ
35 T
35 ẋ
35 …
35 Ẋ
35 {
36 ⁽
37 D
37 ‡
38 S
38 g
38 ¬
38 Ḃ
38 ʁ
38 ¦
38 G
39 ⁼
39 W
40 ⌈
40 ≈
40 ₌
41 ]
42 ~
42 ₴
42 I
43 ¯
43 Ṡ
44 Ṗ
44 ¹
44 ℅
45 l
45 ε
45 ²
46 R
46 İ
47 ȯ
48 ⇧
48 Ṫ
49 ₀
49 b
50 ẇ
50 ⁋
50 ↔
50 N
50 £
51 Ẏ
52 Π
53
56 ꘍
57 A
57 τ
57 Ṅ
58 ⌊
59 U
61 "
62 Ḣ
62 %
62 λ
62 ð
63 e
63 c
63 ½
71 i
73 ÷
74 s
74 ¥
75 /
78 C
82 )
87 |
88 d
102 p
104 ɾ
104 ‹
105 t
108 ⁰
109 ṅ
111 [
111 -
118 Ṙ
119 h
123 '
126 ,
126 J
134 =
140 (
142 L
147 ›
161 n
179 f
209 ƛ
217 ∑
224 $
231 ?
246 +
269 *
278 ;
281 :
302 v
Least common digraphs:
1 øḋ
1 øp
1 ∆L
1 ∆ǐ
1 ¨^
1 Þ¾
1 k¹
1 øR
1 ∆q
1 k1
1 Þ:
1 ∆/
1 ∆e
1 ø↳
1 kḭ
1 Þm
1 ∆ṁ
1 ∆p
1 ¨*
1 Þe
1 øḞ
1 ÞḊ
1 ∆T
1 Þ∪
1 ÞḞ
1 Þ∴
1 Þ∵
1 øC
1 ÞI
1 Þż
1 ∆F
1 k½
1 ki
1 kg
1 ød
1 øÞ
1 ¨»
1 ∆₌
1 Þ/
1 ∆ė
1 Þ↑
1 kj
1 Þ*
1 kp
1 Þo
1 øβ
1 ∆Ċ
1 kV
1 kW
1 k<
1 kḂ
1 k[
1 kɽ
1 ø`
1 k§
1 k\
1 øT
1 k…
1 ∆*
1 ∆%
1 ¨U
1 k¦
1 kṠ
1 kṀ
1 kḢ
1 kτ
1 kε
1 ∆ƈ
1 ÞU
1 ∆ṗ
1 Þ!
1 kD
1 Þ…
2 ∆M
2 k□
2 ¨²
2 ÞȮ
2 kð
2 ÞR
2 ÞM
2 øĖ
2 k₁
2 Þḋ
2 øɽ
2 k-
2 øŀ
2 ∆I
2 øṗ
2 øl
2 ∆Ė
2 k^
2 ∆i
2 øM
2 øB
2 Þẇ
2 ∆o
2 ÞZ
2 k4
2 Þṁ
2 k+
2 øb
2 kṡ
2 Þ₀
2 ø∆
2 kL
2 ∆d
2 ∆Ŀ
2 øo
2 øṀ
2 kz
2 ¨…
2 øe
3 ¨V
3 ÞG
3 Þj
3 Þx
3 ÞK
3 Þ⇧
3 Þ□
3 Þg
3 ø^
3 k⁰
3 Þ℅
3 Þǔ
3 k≈
3 Þ×
3 k•
3 ÞṪ
3 k×
3 ∆Ṙ
3 ∆τ
3 ∆Q
3 kF
4 Þ⊍
4 ¨p
4 ø∧
4 ¨2
4 ÞṠ
4 Þp
4 øĊ
4 ¨M
4 k(
4 ÞD
4 kl
5 ¨£
5 ¨=
5 Þṡ
5 ∆f
5 øṘ
5 Þ•
5 kv
5 k6
5 kB
6 øA
6 ÞẊ
6 øṖ
6 k2
6 kh
6 k∨
6 k/
7 ∆ċ
8 ∆Z
8 øṙ
8 kr
8 ÞF
9 kP
9 ∆²
9 ∆K
10 Þu
10 kd
10 kH
11 Þ∞
12 ÞS
13 kA
15 Þf
15 ÞT
19 øm
22 ka
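A per-element count like the one above could be sketched roughly as follows. This is not the Vyxal lexer: `tokenize`, `count_elements`, `least_common`, and the `DIGRAPH_PREFIXES` set are hypothetical names, and the prefix set is only inferred from the digraphs listed above. The sketch also ignores strings, numbers, and comments, which a real lexer would handle.

```python
import collections

# Assumption: these characters start a two-character element.
# Inferred from the digraph list above, not taken from the Vyxal lexer.
DIGRAPH_PREFIXES = set("kø∆Þ¨")


def tokenize(program):
    """Split a program into one- and two-character elements (simplified)."""
    tokens = []
    i = 0
    while i < len(program):
        ch = program[i]
        if ch in DIGRAPH_PREFIXES and i + 1 < len(program):
            # Prefix plus the following character form one element
            tokens.append(program[i:i + 2])
            i += 2
        else:
            tokens.append(ch)
            i += 1
    return tokens


def count_elements(programs):
    """Return (single-character counts, digraph counts) over all programs."""
    singles = collections.Counter()
    digraphs = collections.Counter()
    for program in programs:
        for token in tokenize(program):
            if len(token) == 2:
                digraphs[token] += 1
            else:
                singles[token] += 1
    return singles, digraphs


def least_common(counts, n):
    """Return the n entries with the lowest counts, in ascending order."""
    return sorted(counts.items(), key=lambda kv: kv[1])[:n]
```

Calling `least_common(singles, 30)` would then give a listing in the same ascending order as the results above.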
Hey, this feels automatable! :p
@GingerIndustries Apparently, SEDE doesn't exactly have an API
It looks like `v` and `:` are extremely common characters. So it would also be helpful to make some elements that would shorten common operations on these items.
Judging from the 2023 results, `v` is commonly used in `vL`, so it would be helpful to introduce a 1-byte element that takes a vectorized length. Also, `:£` is quite common as well, so you can add a command that sets the register to a value without using up the item on the stack.
`;` is quite common even though you have flags. So it would be nice to merge certain commands with an end-of-loop structure. `;f`, `=;`, and `;h` are the most common among them.
`;f` (End loop structure and flatten) sounds like a pretty useful thing to me, along with `;h`. It would be very nice to make 1-byte shortenings for these, since these are pretty useful. (I'll do an analysis on these structures, to see whether they happen at the end of a program. This will determine whether making flags would be helpful.)
Not sure about `=;`, do you recall the cases where you had to use `=;`?
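The end-of-program check mentioned above might be sketched like this. `position_counts` is a hypothetical helper, and it matches raw substrings rather than lexed elements, so a `;f` inside a string literal would be counted too. Treat the numbers it gives as a rough first pass only.

```python
import collections


def position_counts(programs, patterns=(";f", ";h", "=;")):
    """For each pattern, count occurrences at the end of a program vs. elsewhere."""
    at_end = collections.Counter()
    elsewhere = collections.Counter()
    for program in programs:
        code = program.rstrip()  # ignore trailing whitespace/newlines
        for pat in patterns:
            total = code.count(pat)
            if total == 0:
                continue
            # At most one occurrence can sit at the very end of the program
            ends = 1 if code.endswith(pat) else 0
            at_end[pat] += ends
            elsewhere[pat] += total - ends
    return at_end, elsewhere
```

If a pattern's `at_end` count dominates, a flag could plausibly cover it; if most occurrences are `elsewhere`, a dedicated 1-byte element seems the better fit.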