simple-yet-powerful-srt-subtitle-parser-cpp
simple-yet-powerful-srt-subtitle-parser-cpp copied to clipboard
unique_copy() error
unique_copy (output.begin(), output.end(), std::back_insert_iteratorstd::string(_justDialogue), [](char a,char b)
=> a = -30 , b = -128
The value comes in like this, and there's an error here
srt
1 00:00:00,000 --> 00:00:06,719I'm going to use a few terms throughout this video that not everyone might understand.
2 00:00:06,719 --> 00:00:10,960 A plane is a mathematical term for a square or rectangle.
3 00:00:10,960 --> 00:00:14,560 This is mostly the walls, ceiling and floor.
4 00:00:14,560 --> 00:00:19,519 Vertices are the corners of said planes, the singular being vertex.
5 00:00:19,519 --> 00:00:23,519 Polygons are another term for surfaces derived from computer science rather than maths.
6 00:00:23,519 --> 00:00:26,519 It's going to be used interchangeably with plane.
7 00:00:26,559 --> 00:00:31,160 To do something recursively means a process must repeat itself over and over until an
8 00:00:31,160 --> 00:00:33,399 end goal is met to solve a problem.
9 00:00:33,399 --> 00:00:37,079 A data type is a way data is classified in programming.
10 00:00:37,079 --> 00:00:43,000 For example, a string, which are words or a series of letters, and int, which is an
11 00:00:43,000 --> 00:00:44,960 integer, a typical number.
12 00:00:44,960 --> 00:00:50,200 There are different kinds of numbers as well, such as float for precise decimal numbers,
13 00:00:50,200 --> 00:00:53,039 and long for, well, long numbers.
14 00:00:53,039 --> 00:01:02,039 Of the total size of the games industry today, around 20% are games within the shooter genre.
15 00:01:02,039 --> 00:01:07,760 Around a fifth of games in this $300 billion industry are shooters, most of them being
16 00:01:07,760 --> 00:01:09,480 first person shooters.
17 00:01:09,480 --> 00:01:14,379 The amount of money generated by and riding on the success of this singular genre in this
18 00:01:14,379 --> 00:01:16,359 industry is stupefying.
19 00:01:16,359 --> 00:01:21,200 The first person shooter as developed early on was a huge departure from every other kind
20 00:01:21,200 --> 00:01:23,560 of game that existed at the time.
21 00:01:23,560 --> 00:01:28,159 Games at the time, PC games especially, were often slow or methodical.
22 00:01:28,159 --> 00:01:34,040 The personal computer platform was known for careful and considered games, turn-based strategy,
23 00:01:34,040 --> 00:01:37,520 grand RPGs with a slowly unfolding world.
24 00:01:37,520 --> 00:01:41,240 Often times these games would be indistinguishable from a spreadsheet.
25 00:01:41,240 --> 00:01:43,600 Action was the realm of console gaming.
26 00:01:43,600 --> 00:01:47,640 Platformers were the most immediate real-time action-packed games available, and besides
27 00:01:47,640 --> 00:01:52,340 notable games like Duke Nukem and Commander Keen, people didn't really play those kind
28 00:01:52,340 --> 00:01:54,319 of action games on PC.
29 00:01:54,319 --> 00:01:57,319 This was until the advent of the first person shooter.
30 00:01:57,319 --> 00:02:01,299 Suddenly people were hit with this visceral representation of violence.
31 00:02:01,299 --> 00:02:03,599 They represented something which films could not.
32 00:02:03,599 --> 00:02:06,439 You inhabit a world through the lens of the character.
33 00:02:06,439 --> 00:02:10,920 You are closer to the action hero than ever before in any medium in history.
34 00:02:10,920 --> 00:02:16,199 What is widely agreed upon as the first first person shooter ever is Maze War, developed
35 00:02:16,199 --> 00:02:20,799 in 1973, that's the same year Britain joined the European Union, Dark Side of the Moon
36 00:02:20,799 --> 00:02:24,199 was released, and the United States announced it would withdraw from Vietnam.
37 00:02:24,199 --> 00:02:28,759 It was developed for NASA computers by Steve Colleade, Greg Thompson and Howard Palmer.
38 00:02:28,759 --> 00:02:31,679 It was constructed with simple wireframe graphics.
39 00:02:31,679 --> 00:02:36,120 People had the idea of adding multiple players using networking, then connecting over the
40 00:02:36,120 --> 00:02:37,120 ARPANET.
41 00:02:37,120 --> 00:02:38,120 Then it took off.
42 00:02:38,120 --> 00:02:40,799 We saw other first person shooter games after that point.
43 00:02:40,800 --> 00:02:46,600 Spasm or Space Sim in 1974, Battlezone for arcades in 1980.
44 00:02:46,600 --> 00:02:51,520 Besides these few examples, for the majority of the decades following its inception, the
45 00:02:51,520 --> 00:02:56,000 first person perspective was known mostly for its association with the role playing
46 00:02:56,000 --> 00:02:58,320 genre, for example games like Ultima.
47 00:02:58,320 --> 00:03:03,520 Now, first person shooting did technically exist, but you were merely shooting projectiles
48 00:03:03,520 --> 00:03:04,520 at your friends.
49 00:03:04,520 --> 00:03:08,439 You weren't inhabiting a character, you weren't the action hero fighting bad guys,
50 00:03:08,439 --> 00:03:11,359 That was until Wolfenstein 3D.
51 00:03:11,359 --> 00:03:17,000 id Software was founded in 1991 by four former soft disk employees, John and Adrian Carmack,
52 00:03:17,000 --> 00:03:21,240 no relation they just happened to have the same name, Tom Hall and John Romero.
53 00:03:21,240 --> 00:03:24,000 This was the same year the Soviet Union fell.
54 00:03:24,000 --> 00:03:26,340 Carmack is going to be more important later on.
55 00:03:26,340 --> 00:03:31,199 They originally began with a Mario clone named Dangerous Dave before the company was officially
56 00:03:31,199 --> 00:03:32,199 founded.
57 00:03:32,199 --> 00:03:35,560 This was mainly to show off the beginnings of John Carmack's technical wizardry, encoding
58 00:03:35,560 --> 00:03:38,560 an efficient 2D side scrolling graphics renderer.
59 00:03:38,560 --> 00:03:43,780 The early 90s, when everything was a dark and edgy statement, the satanic inversion
60 00:03:43,780 --> 00:03:46,539 between PC and console was no exception.
61 00:03:46,539 --> 00:03:49,560 PC graphics using software rendering were terrible.
62 00:03:49,560 --> 00:03:54,900 John Carmack developed his adaptive tile refresh for the PC to compete with the raw computational
63 00:03:54,900 --> 00:03:58,599 power of the Super Nintendo, a true beast.
64 00:03:58,599 --> 00:04:02,240 Adaptive tile refresh meant that slightly more of the game world could be included in
65 00:04:02,240 --> 00:04:04,640 the screen buffer, just outside of view.
66 00:04:04,640 --> 00:04:07,360 This meant they could render smooth 2D scrolling.
67 00:04:07,360 --> 00:04:11,400 It also made the sprite animations independent from screen scrolling.
68 00:04:11,400 --> 00:04:16,120 This little bit of code magic powered their games, including the Commander Keen series.
69 00:04:16,120 --> 00:04:20,520 The Commander Keen series was spread through shareware, with subsequent episodes releasing
70 00:04:20,520 --> 00:04:24,340 over the next year or so for purchase from Apogee, their publisher.
71 00:04:24,340 --> 00:04:29,280 This shareware model would be important because it would be used in their subsequent games.
72 00:04:29,280 --> 00:04:33,200 Speaking of subsequent games…
73 00:04:33,279 --> 00:04:36,879 Wolfenstein 3D began development in 1991.
74 00:04:36,879 --> 00:04:42,199 It would use the raycasting technique earlier employed in id's Catacomb 3D.
75 00:04:42,199 --> 00:04:47,079 Raycasting was a rendering technique necessitated again by the limited processing power of PCs
76 00:04:47,079 --> 00:04:48,079 at the time.
77 00:04:48,079 --> 00:04:51,319 PC master race just can't stop losing.
78 00:04:51,319 --> 00:04:55,800 PCs almost all used software rendering, rather than a dedicated graphics chip.
79 00:04:55,800 --> 00:05:00,879 The shareware model involved getting the game on as many PCs as possible.
80 00:05:00,879 --> 00:05:03,079 Raycasting was the solution to help them do this.
81 00:05:03,079 --> 00:05:07,479 Raycasting allowed their game to run on basically any PC.
82 00:05:07,479 --> 00:05:11,740 Raycasting means you're able to draw only the surfaces which are in the player's field
83 00:05:11,740 --> 00:05:12,740 of view.
84 00:05:12,740 --> 00:05:15,000 This helped massively in saving processing power.
85 00:05:15,000 --> 00:05:16,000 But how does it work?
86 00:05:16,000 --> 00:05:21,519 In effect, a ray is cast, from the player to the geometry, to the nearest object blocking
87 00:05:21,519 --> 00:05:22,519 its path.
88 00:05:22,519 --> 00:05:25,839 In Wolfenstein, none of the levels were truly 3D.
89 00:05:25,839 --> 00:05:28,799 Every level was drawn out on a flat 2D plane.
90 00:05:28,800 --> 00:05:34,120 The program scans horizontally, checking that every pixel on the horizontal axis has something
91 00:05:34,120 --> 00:05:35,120 drawn in it.
92 00:05:35,120 --> 00:05:39,480 If there's nothing drawn in a position, a pixel column will be drawn out.
93 00:05:39,480 --> 00:05:43,699 This is simplified from the process of ray tracing, where this process is done for every
94 00:05:43,699 --> 00:05:46,600 single pixel, rather than every pixel column.
95 00:05:46,600 --> 00:05:51,040 The distance between the viewer, or the camera or player, they all have the same meaning,
96 00:05:51,040 --> 00:05:53,439 and the nearest piece of geometry, is obtained.
97 00:05:53,439 --> 00:05:58,400 The height of the pixel column is calculated using the distance from point of intersection
98 00:05:58,400 --> 00:06:00,880 in the direction the player is facing.
99 00:06:00,880 --> 00:06:04,440 It uses trigonometry to find this point of intersection.
100 00:06:04,440 --> 00:06:09,240 This effectively allowed them to give the illusion of distance, to render a 3D scene.
101 00:06:09,240 --> 00:06:14,620 This makes the process of rendering 3D much easier, as a line, that line being distance
102 00:06:14,620 --> 00:06:20,560 from player to geometry, directly transforms to a line, that being the height of the rendered
103 00:06:20,560 --> 00:06:21,560 column.
104 00:06:21,560 --> 00:06:24,480 This process is done multiple times every single second.
105 00:06:24,560 --> 00:06:29,360 The planes in the scene had been texture mapped, where an image is applied to a 3D surface.
106 00:06:29,360 --> 00:06:34,560 When the columns are drawn, they are really drawing slices of these wall textures at different
107 00:06:34,560 --> 00:06:35,560 sizes.
108 00:06:35,560 --> 00:06:40,319 The height of the column being drawn is smaller when the plane, that being the wall, is further
109 00:06:40,319 --> 00:06:41,319 away from you.
110 00:06:41,319 --> 00:06:45,879 The textures are scaled appropriately to the size of the wall, relative to the player.
111 00:06:45,879 --> 00:06:49,319 This gave the world of Wolfenstein so much believability for the time.
112 00:06:49,319 --> 00:06:52,839 You were no longer just navigating wireframe mazes.
113 00:06:52,839 --> 00:06:57,959 You were an action hero, BJ Blazkowicz, infiltrating a Nazi castle.
114 00:06:57,959 --> 00:07:00,920 The walls were adorned with flags of the German Reich.
115 00:07:00,920 --> 00:07:03,199 You felt closer to the world than ever before.
116 00:07:03,199 --> 00:07:06,240 You were interacting with a true 3D space.
117 00:07:06,240 --> 00:07:08,500 This process was, however, flawed.
118 00:07:08,500 --> 00:07:13,279 In Wolfenstein 3D, there was no verticality at all, no difference in elevation, only the
119 00:07:13,279 --> 00:07:16,800 walls had texture, the ceiling and floor had to be flat colours.
120 00:07:16,800 --> 00:07:20,680 If they wanted texture on the ceiling and floor, they would have had to add horizontal
121 00:07:20,680 --> 00:07:21,680 scanlines.
122 00:07:21,680 --> 00:07:24,079 You were still ultimately navigating a maze.
123 00:07:24,079 --> 00:07:27,879 A colourful maze with Nazis in it, but a maze nonetheless.
124 00:07:27,879 --> 00:07:31,319 Wolfenstein 3D was released in May 1992.
125 00:07:31,319 --> 00:07:34,960 The sequel, Spear of Destiny, was released later in the same year.
126 00:07:34,960 --> 00:07:39,560 While the rest of the id team was working on Spear of Destiny, John Carmack, the ascetic,
127 00:07:39,560 --> 00:07:42,560 high priest of technology, locked himself away to study.
128 00:07:42,560 --> 00:07:47,199 He would brainstorm the revolutionary tech that would power their next massive game,
129 00:07:47,199 --> 00:07:51,480 the next game that the rest of the team would start working on in September 1992.
130 00:07:51,480 --> 00:07:56,079 It would be something inspired by evil dead, brutal and violent.
131 00:07:56,079 --> 00:08:02,040 The name, Green and Pissed, was ultimately passed up for the much snappier Doom.
132 00:08:02,040 --> 00:08:08,000 Doom would launch in 1993, the game would truly be able to transport you into a world.
133 00:08:08,000 --> 00:08:10,620 The levels truly felt like places.
134 00:08:10,660 --> 00:08:16,100 The architecture of Doom consisted of supernatural science facilities, with Giger-esque and hellish
135 00:08:16,100 --> 00:08:17,740 environments as well.
136 00:08:17,740 --> 00:08:23,060 The enemies were a combination of horror and sci-fi with cybernetically enhanced demons.
137 00:08:23,060 --> 00:08:27,959 The architecture, over the top setting and violence was inspired by films such as Evil
138 00:08:27,959 --> 00:08:29,180 Dead and Alien.
139 00:08:29,180 --> 00:08:34,620 The floors could now be angled, they could now have multiple levels with stairs and elevators.
140 00:08:34,620 --> 00:08:38,120 Tools of toxic fluid surrounded these risen platforms.
141 00:08:38,120 --> 00:08:40,220 It was truly 3D.
142 00:08:40,220 --> 00:08:41,540 But it wasn't really.
143 00:08:41,540 --> 00:08:46,180 They were yet to achieve the full 6 degrees of freedom that John Romero wanted.
144 00:08:46,180 --> 00:08:47,899 This wouldn't happen until Quake.
145 00:08:47,899 --> 00:08:51,700 Rumours couldn't be stacked on top of each other, there was no vertical aim, the game
146 00:08:51,700 --> 00:08:54,779 was entirely played on the horizontal axis.
147 00:08:54,779 --> 00:08:58,560 The thing is, vertical aim was actually possible at the time.
148 00:08:58,560 --> 00:09:02,860 They could have limited the enemies' vertical hitboxes to the size of the sprite, but they
149 00:09:02,860 --> 00:09:03,860 didn't.
150 00:09:03,860 --> 00:09:06,620 Not because they couldn't, but to save processing power.
151 00:09:06,620 --> 00:09:09,779 You see, Doom was still using software rendering.
152 00:09:09,819 --> 00:09:13,939 This shareware model relied on getting their games on as many computers as possible, like
153 00:09:13,939 --> 00:09:14,939 I said.
154 00:09:14,939 --> 00:09:18,740 It was essentially the beginning of the free to play game model we have today.
155 00:09:18,740 --> 00:09:22,980 They aimed for the IBM PC, for machines running DOS.
156 00:09:22,980 --> 00:09:27,480 They had to sell their game to university students and wagies who were bored at work
157 00:09:27,480 --> 00:09:29,500 so they could run office tournaments.
158 00:09:29,500 --> 00:09:33,860 They didn't calculate the enemies' vertical hitbox so that they could save memory.
159 00:09:33,860 --> 00:09:38,419 They didn't want to give the enemies' hitboxes a height value, just have another factor to
160 00:09:38,419 --> 00:09:39,419 calculate.
161 00:09:39,459 --> 00:09:43,059 These levels were drawn on a 2D plane, like Wolfenstein.
162 00:09:43,059 --> 00:09:46,099 Just this time, the map creator is quite different.
163 00:09:46,099 --> 00:09:50,659 The ground is divided into sectors, this will be very important later.
164 00:09:50,659 --> 00:09:55,139 Each sector has two associated values, ceiling height and floor height.
165 00:09:55,139 --> 00:09:59,099 Well, it has several associated values, but those are two important ones.
166 00:09:59,099 --> 00:10:03,339 This is also why one room could not be placed above another and why every surface had to
167 00:10:03,339 --> 00:10:06,339 be made out of a flat square or rectangle.
168 00:10:06,340 --> 00:10:10,500 Another reason that vertical aim couldn't have worked is due to how the texture mapping
169 00:10:10,500 --> 00:10:11,500 worked.
170 00:10:11,500 --> 00:10:16,060 One game that did have vertical aim and levels on top of each other, before Quake and not
171 00:10:16,060 --> 00:10:19,420 that long after Doom, was Bungie's Marathon.
172 00:10:19,420 --> 00:10:22,820 And look what happens when you look up and down in that game.
173 00:10:22,820 --> 00:10:25,379 The textures start to distort.
174 00:10:25,379 --> 00:10:29,379 This is because the game, like Doom, uses affine texture mapping.
175 00:10:29,379 --> 00:10:34,780 This, like many of the other methods, was done to save memory on the processor by taking
176 00:10:34,779 --> 00:10:37,139 advantage of CPU caching.
177 00:10:37,139 --> 00:10:43,699 Basically, what happens is that texture coordinates are linearly interpolated, using screen space
178 00:10:43,699 --> 00:10:49,699 distance between vertices, rather than the actual 3D in-engine distance between them.
179 00:10:49,699 --> 00:10:54,000 The distance between points on a plane remains the same when you look up and down.
180 00:10:54,000 --> 00:10:58,379 What this means is that perspective when looking up and down is not accounted for.
181 00:10:58,379 --> 00:11:02,240 You know how pixels on a texture start to warp as you get closer?
182 00:11:02,240 --> 00:11:07,639 It looks like a straight line from far away begins to turn inward as closer pixels get
183 00:11:07,639 --> 00:11:11,399 larger, while more distant pixels get smaller.
184 00:11:11,399 --> 00:11:17,279 This doesn't happen in Doom, because accounting for perspective is taxing on 90s computers.
185 00:11:17,279 --> 00:11:21,539 You know how the game only draws things in columns to save processing time?
186 00:11:21,539 --> 00:11:25,840 They'd have had to do vertical scans as well as horizontal scans.
187 00:11:25,840 --> 00:11:32,080 Newer ports of Doom with newer rendering engines made for new hardware like GZDoom obviously
188 00:11:32,920 --> 00:11:33,920 don't have this limitation.
189 00:11:33,920 --> 00:11:37,960 As such, they use more current texture mapping and don't have this issue.
190 00:11:37,960 --> 00:11:40,720 But all of these concessions weren't enough.
191 00:11:40,720 --> 00:11:45,560 John Carmack's coding brilliance met its most devious enemy yet.
192 00:11:45,560 --> 00:11:46,560 Stairs.
193 00:11:46,560 --> 00:11:51,720 John Romero came out with a really way-out and strange idea on his early incarnation
194 00:11:51,720 --> 00:11:52,720 of E1M2.
195 00:11:52,720 --> 00:11:58,720 Yes, he wanted to mix things up with the earth-shattering invention of stairs.
196 00:11:58,720 --> 00:12:03,600 You see, just raycasting alone wasn't enough to efficiently optimise the game.
197 00:12:03,600 --> 00:12:07,680 Raycasting saves memory by only rendering things which are visible to the player.
198 00:12:07,680 --> 00:12:12,519 However, surfaces on the inside of these stairs were visible to the existing algorithm.
199 00:12:12,519 --> 00:12:15,320 Thus, they were drawn when they shouldn't have been.
200 00:12:15,320 --> 00:12:20,360 You see, for 3D rendering to not waste performance, they need to draw as few surfaces, as few
201 00:12:20,360 --> 00:12:21,840 planes as possible.
202 00:12:21,840 --> 00:12:27,639 This necessitates occlusion culling, or visible surface determination, or backface culling.
203 00:12:27,679 --> 00:12:33,319 Basically, the renderer should only draw what is in the player's field of view.
204 00:12:33,319 --> 00:12:37,080 There need to be absolutely no overdraw whatsoever.
205 00:12:37,080 --> 00:12:43,080 Adding height as a variable, such as with Romero's stairs, requires a much more sophisticated
206 00:12:43,080 --> 00:12:49,000 algorithm than was present in Wolfenstein, and in id's existing rendering engine.
207 00:12:49,000 --> 00:12:53,240 There are many different rendering algorithms out there, it seems that we need to dip into
208 00:12:53,240 --> 00:12:58,680 the hypothetical algorithms to start trawling the literature for some better algorithms.
209 00:12:58,680 --> 00:13:00,440 Let's explore some of the options.
210 00:13:00,440 --> 00:13:04,120 There's the painter's algorithm, named so because, like in a painting, the background
211 00:13:04,120 --> 00:13:07,000 is rendered first with detail layered on top.
212 00:13:07,000 --> 00:13:10,960 Basically, the polygons are sorted by their distance from the viewer, and the more distant
213 00:13:10,960 --> 00:13:14,759 polygons are rendered first, and the closest polygon is rendered last.
214 00:13:14,759 --> 00:13:17,399 It is easily the most simple solution.
215 00:13:17,399 --> 00:13:24,159 It was developed in 1972, the year MASH started, as an easy to implement solution for CAD.
216 00:13:24,159 --> 00:13:29,399 It also has the worst possible case for space complexity, meaning it takes up as much memory
217 00:13:29,399 --> 00:13:31,819 as an algorithm possibly could.
218 00:13:31,819 --> 00:13:34,639 Every single surface in the field of view is drawn.
219 00:13:34,639 --> 00:13:37,120 Obviously, this isn't a good fit.
220 00:13:37,120 --> 00:13:41,399 It's more of an example from the early days of exactly what not to do.
221 00:13:41,399 --> 00:13:43,039 There's also Warnock's algorithm.
222 00:13:43,240 --> 00:13:47,439 John Warnock was the founder of Adobe, and this algorithm originated in his doctoral
223 00:13:47,439 --> 00:13:52,879 thesis in 1969, the year man landed on the moon and In the Court of the Crimson King
224 00:13:52,879 --> 00:13:53,879 was released.
225 00:13:53,879 --> 00:13:57,899 Essentially, it recursively subdivides the screen into four parts.
226 00:13:57,899 --> 00:14:01,839 What this means is it splits the screen into four windows and splits each window into four
227 00:14:01,839 --> 00:14:02,959 smaller windows.
228 00:14:02,959 --> 00:14:08,000 It does this again and again until each window is trivial to render, meaning it has only
229 00:14:08,000 --> 00:14:10,719 one or zero polygons present.
230 00:14:10,720 --> 00:14:14,960 The algorithm also checks if multiple polygons are within one window.
231 00:14:14,960 --> 00:14:19,040 If the closest polygon covers the whole window, then it is drawn.
232 00:14:19,040 --> 00:14:23,440 This is more efficient than Painter's algorithm as it renders front to back, but it's still
233 00:14:23,440 --> 00:14:25,160 not very well suited.
234 00:14:25,160 --> 00:14:30,800 It will eventually keep subdividing to a ridiculous degree, to the point where a window is smaller
235 00:14:30,800 --> 00:14:31,800 than a pixel.
236 00:14:31,800 --> 00:14:34,399 Yeah, this ain't a good fit for a game.
237 00:14:34,399 --> 00:14:38,180 You could do a z-buffer for every pixel you want to draw, check if there's anything in
238 00:14:38,180 --> 00:14:39,180 front of it.
239 00:14:39,180 --> 00:14:40,740 And check on every single pixel?
240 00:14:40,740 --> 00:14:42,740 Yeah, there's no chance in hell.
241 00:14:42,740 --> 00:14:47,700 The final solution does kinda use a z-buffer, but it doesn't do that check on every single
242 00:14:47,700 --> 00:14:50,780 pixel, it finds a much more efficient way to do it.
243 00:14:50,780 --> 00:14:51,780 No.
244 00:14:51,780 --> 00:14:57,820 In order to truly revolutionize not just gaming, but 3D graphics forever, our protagonist,
245 00:14:57,820 --> 00:15:03,340 John Carmack, needs to go to a much more inspired source, something that hadn't actually been
246 00:15:03,340 --> 00:15:04,960 implemented before.
247 00:15:04,960 --> 00:15:06,960 Something he'd just read in a white paper.
248 00:15:06,960 --> 00:15:07,960 Just a concept.
249 00:15:07,960 --> 00:15:13,019 Yes, how common is it in gaming to see people run into optimization issues and seek out
250 00:15:13,019 --> 00:15:17,259 a white paper to solve their problem, because nobody else had done it before?
251 00:15:17,259 --> 00:15:19,139 Yes, that's Carmack for you.
252 00:15:19,139 --> 00:15:23,720 We needed a renderer that would draw objects closest to the player to furthest away until
253 00:15:23,720 --> 00:15:27,580 every pixel was written to, that had no overdraw.
254 00:15:27,580 --> 00:15:32,879 The solution was in a 1980 white paper, as the same year Genesis released the reclaimed
255 00:15:32,879 --> 00:15:36,580 album, Duke, where they really came into their own.
256 00:15:37,960 --> 00:15:45,820 This 1980 white paper by Bruce Nailup was given the humble title, On Visible Surface
257 00:15:45,820 --> 00:15:49,060 Generation by Apriori Tree Structures.
258 00:15:49,060 --> 00:15:55,460 It described a rendering model we know as Binary Space Partitioning, or BSP for short.
259 00:15:55,460 --> 00:15:58,700 This was the method that would change gaming for years.
260 00:15:58,700 --> 00:16:02,300 This wasn't the first time Binary Space Partitioning was alluded to.
261 00:16:02,300 --> 00:16:08,520 A 1969 study by the Air Force of the good ol' US of A alluded to the use of partitioning
262 00:16:08,520 --> 00:16:12,000 3D scenes to solve the visible surface problem.
263 00:16:12,000 --> 00:16:17,340 The study was conducted to determine the viability of 3D for flight simulation.
264 00:16:17,340 --> 00:16:21,380 We can thank the armed forces of the United States for giving us doom.
265 00:16:21,380 --> 00:16:24,900 They explored using a matrix to track which objects are occluded.
266 00:16:24,900 --> 00:16:29,160 These of course wouldn't do so well as the size of the matrix would need to be the square
267 00:16:29,160 --> 00:16:31,760 of the number of objects in a scene.
268 00:16:31,759 --> 00:16:33,620 That wouldn't scale very well.
269 00:16:33,620 --> 00:16:39,200 It wasn't until 1980 that Binary Space Partitioning was properly realized in the white paper that
270 00:16:39,200 --> 00:16:43,860 would reach John Carmack alongside its core tenant, the binary tree.
271 00:16:43,860 --> 00:16:46,740 But what is Binary Space Partitioning anyway?
272 00:16:46,740 --> 00:16:48,779 Well, the name gives you a clue.
273 00:16:48,779 --> 00:16:52,080 Is partitioning space in a 3D environment?
274 00:16:52,080 --> 00:16:54,939 This is done using a BSP tree.
275 00:16:54,939 --> 00:16:57,100 What is a BSP tree you may ask?
276 00:16:57,100 --> 00:17:02,040 In computer science, a tree is a data structure used as a mathematical model for displaying
277 00:17:02,040 --> 00:17:03,560 certain data types.
278 00:17:03,560 --> 00:17:08,720 It's separated into nodes with parent nodes that have child nodes.
279 00:17:08,720 --> 00:17:13,279 BSP uses binary trees, binary essentially meaning two.
280 00:17:13,279 --> 00:17:18,759 A binary tree is a tree where there are two or less child nodes stemming from any given
281 00:17:18,759 --> 00:17:20,559 parent, from any node.
282 00:17:20,559 --> 00:17:23,380 There are never more than two child nodes.
283 00:17:23,380 --> 00:17:28,160 This is as opposed to a non-binary tree, which is a tree that has dyed hair and a gender
284 00:17:28,160 --> 00:17:29,320 studies degree.
285 00:17:29,320 --> 00:17:34,800 The data stored in the nodes of the binary tree are the sub-sectors of the map.
286 00:17:34,800 --> 00:17:39,640 Sub-sectors being smaller parts of those map sectors I spoke about earlier.
287 00:17:39,640 --> 00:17:45,360 Remember, each map is designed on a flat 2D map editor, with each sector having associated
288 00:17:45,360 --> 00:17:46,600 height values.
289 00:17:46,600 --> 00:17:51,880 The genius is that the map is sliced up via binary space partitioning after the map is
290 00:17:51,880 --> 00:17:52,880 built.
291 00:17:52,880 --> 00:17:58,580 Hard work is done when the map is created, rather than by the processor at runtime while
292 00:17:58,580 --> 00:18:00,340 the player is playing the game.
293 00:18:00,340 --> 00:18:06,180 The map is already split, already partitioned when the player loads it, reducing processing
294 00:18:06,180 --> 00:18:07,840 needed at runtime.
295 00:18:07,840 --> 00:18:12,980 To create the binary tree, a root node is established covering the whole map.
296 00:18:12,980 --> 00:18:19,460 After this, the map is recursively subdivided along every plane until only convex sub-sectors
297 00:18:19,460 --> 00:18:20,460 are left.
298 00:18:20,480 --> 00:18:24,120 The sectors are carved into smaller sub-sectors.
299 00:18:24,120 --> 00:18:28,900 The entire map is essentially cut in two along every single wall.
300 00:18:28,900 --> 00:18:33,799 Every time the map is cut in half, the two halves are added as nodes at the bottom of
301 00:18:33,799 --> 00:18:34,840 the tree.
302 00:18:34,840 --> 00:18:40,400 By the end, you're left with a tree where each node at the bottom of the tree represents
303 00:18:40,400 --> 00:18:42,000 a distinct sub-sector.
304 00:18:42,000 --> 00:18:45,539 Remember, this tree is entirely conceptual.
305 00:18:45,539 --> 00:18:47,500 It doesn't actually exist.
306 00:18:47,519 --> 00:18:52,920 So long as the planes don't move – vertical movement is accepted from this because vertical
307 00:18:52,920 --> 00:18:57,960 movement is a separate value – the same BSP tree can be used.
308 00:18:57,960 --> 00:19:03,480 Doom's BSP tree generation was done after levels were complete and would search for
309 00:19:03,480 --> 00:19:09,000 the best possible tree, that being the one that generates the fewest binary tree nodes.
310 00:19:09,000 --> 00:19:13,240 A binary search is performed to determine what sector the player is in.
311 00:19:13,259 --> 00:19:18,660 A binary search is when an array of pre-sorted data is searched through by continually halving
312 00:19:18,660 --> 00:19:19,660 said array.
313 00:19:19,660 --> 00:19:24,740 A search through a binary tree is, by its nature, a binary search, because every time
314 00:19:24,740 --> 00:19:29,079 you go down a node, you're removing half of the possibilities.
315 00:19:29,079 --> 00:19:33,859 After the player's sector is determined using this binary search, the sub-sectors
316 00:19:33,859 --> 00:19:38,279 are then sorted by their distance from the player – closest to furthest.
317 00:19:38,279 --> 00:19:42,279 The tree is iterated through to determine which planes to draw.
318 00:19:42,299 --> 00:19:47,220 The horizontal scanlines from raycasting are still used to track the parts of the screen
319 00:19:47,220 --> 00:19:48,859 that have been drawn over.
320 00:19:48,859 --> 00:19:53,619 This way they are able to render front to back and ensure that there is no overdraw.
321 00:19:53,619 --> 00:19:57,859 When each node is passed over in the iteration, a few things are checked.
322 00:19:57,859 --> 00:20:00,019 Has that area already been painted over?
323 00:20:00,019 --> 00:20:02,339 If so, don't bother drawing it.
324 00:20:02,339 --> 00:20:08,660 When a plane, polygon or wall is drawn, it is akin to a curtain being drawn left to right.
325 00:20:08,720 --> 00:20:14,200 To unveil an area, so to speak, whenever a curtain is seen by the player, it is unveiled
326 00:20:14,200 --> 00:20:16,040 from closest to the furthest.
327 00:20:16,040 --> 00:20:20,759 To be exact, it's the closest 256 walls that are displayed.
328 00:20:20,759 --> 00:20:26,240 Remember how height of the pixel columns drawn on screen depended on distance from the player?
329 00:20:26,240 --> 00:20:32,120 For Doom, this required determining the angle of both ends of every wall, relative to the
330 00:20:32,120 --> 00:20:33,440 player's field of view.
331 00:20:33,440 --> 00:20:38,580 In the early 90s, most processors didn't have dedicated floating-point capability.
332 00:20:38,599 --> 00:20:41,599 This is a float in programming, if you've ever heard of that.
333 00:20:41,599 --> 00:20:45,359 Basically a data type for very precise decimal numbers.
334 00:20:45,359 --> 00:20:51,659 The Doom engine had to use binary angle measurements, which avoid floats, and used a lookup table
335 00:20:51,659 --> 00:20:54,000 to determine the x coordinates.
336 00:20:54,000 --> 00:20:56,960 A lookup table is essentially a cheat sheet.
337 00:20:56,960 --> 00:21:02,359 Instead of the processor doing the maths itself, it just looks up the answer in this lookup table.
338 00:21:02,359 --> 00:21:08,559 They also used these angles for backface culling, with a simple and elegant piece of mathematics
339 00:21:08,559 --> 00:21:14,379 Backface culling basically means the renderer doesn't draw the inside of every polygon.
340 00:21:14,379 --> 00:21:18,099 It only draws the part on the outside that you actually see.
341 00:21:18,099 --> 00:21:23,339 The walls are rendered first as pixel columns from front to back, then the ceilings and
342 00:21:23,339 --> 00:21:25,819 floors using pixel rows.
343 00:21:25,819 --> 00:21:31,500 The objects such as barrels and enemies are rendered from the furthest to the closest.
344 00:21:31,500 --> 00:21:37,599 The ceilings and floors are determined using visplane underscore t, or visplanes.
345 00:21:37,600 --> 00:21:41,900 Plains were determined using height values within each sector.
346 00:21:41,900 --> 00:21:46,940 Visplanes are not constrained to single sectors, and will be continuous provided they all possess
347 00:21:46,940 --> 00:21:50,600 the same height, illumination and textures.
348 00:21:50,600 --> 00:21:53,180 Pixel rows are drawn top to bottom.
349 00:21:53,180 --> 00:21:59,080 One final thing you may wonder about Doom's graphics is why are all the enemies just pictures
350 00:21:59,080 --> 00:22:00,780 facing towards you?
351 00:22:00,780 --> 00:22:04,880 Probably something to do with them being what we call front-facing sprites.
352 00:22:04,880 --> 00:22:08,980 They're rendered last, and like I said, furthest to closest.
353 00:22:08,980 --> 00:22:11,340 That's the opposite order to the geometry.
354 00:22:11,340 --> 00:22:16,340 They are just pictures, taken from the data files and projected onto screen.
355 00:22:16,340 --> 00:22:18,840 Of course, they're a range of pictures.
356 00:22:18,840 --> 00:22:24,020 The one that is drawn depends on the player's location relative to the enemy, and the direction
357 00:22:24,020 --> 00:22:25,500 the enemy is facing.
358 00:22:25,500 --> 00:22:29,040 The enemies do actually have a full 3D hitbox.
359 00:22:29,040 --> 00:22:34,260 The pictures, as most fans know, are actually from real pictures taken of sculptures made
360 00:22:34,259 --> 00:22:35,400 by the artists.
361 00:22:35,400 --> 00:22:41,519 So, John Carmack was faced with a fierce issue in the problem of visible surface determination.
362 00:22:41,519 --> 00:22:46,759 He had to find a solution that was both incredibly fast and very accurate.
363 00:22:46,759 --> 00:22:53,000 BSP doesn't completely solve the visible surface determination problem, but it is one
364 00:22:53,000 --> 00:22:56,900 of the most reliable and efficient methods of optimisation.
365 00:22:56,900 --> 00:22:59,039 It saw massive acceptance.
366 00:22:59,039 --> 00:23:04,339 BSPs were evolved and made their way into Quake's dramatically improved game engine
367 00:23:04,339 --> 00:23:09,819 when they took on Michael Abrash and finally figured out the full six degrees of freedom.
368 00:23:09,819 --> 00:23:15,220 From there, it was in every FPS, and I mean all of them.
369 00:23:15,220 --> 00:23:16,859 Half-Life and Half-Life 2?
370 00:23:16,859 --> 00:23:17,859 Every Source game.
371 00:23:17,859 --> 00:23:19,819 Counter-Strike to Left 4 Dead.
372 00:23:19,819 --> 00:23:21,500 The Halo series used it.
373 00:23:21,500 --> 00:23:24,579 You know the Scarab from Halo 2 is actually a BSP object?
374 00:23:24,579 --> 00:23:27,980 Yes, it's a moving piece of level geometry.
375 00:23:27,980 --> 00:23:33,039 Many would say it's a sign of Carmack's genius that he took an idea from concept to
376 00:23:33,039 --> 00:23:34,839 mainstream solution.
377 00:23:34,839 --> 00:23:41,839 He did all this crazy work in between supercharging Ferraris and becoming a judo master.
378 00:23:41,839 --> 00:23:44,519 One time, he got locked inside a building.
379 00:23:44,519 --> 00:23:49,759 Instead of, say, waiting for security or calling a locksmith, he devised a brilliant solution.
380 00:23:49,759 --> 00:23:54,759 He'd luckily gone to a Renaissance Fair earlier, where he bought a medieval battle axe.
381 00:23:54,759 --> 00:23:58,779 Also, naturally, he smashed down the door with his mighty axe.
382 00:23:58,779 --> 00:24:01,539 He was rich so he could afford to get the door fixed.
383 00:24:01,539 --> 00:24:07,379 He truly is a unique figure in the gaming industry, and you can see why he's so highly
384 00:24:07,379 --> 00:24:08,379 respected.
385 00:24:08,379 --> 00:24:12,019 If you made it this far, comment, thank you John Carmack.
386 00:24:12,019 --> 00:24:17,700 The use of BSP trees has begun to be replaced over the last few years.
387 00:24:17,700 --> 00:24:20,740 Developers instead opt for things like static meshes.
388 00:24:20,740 --> 00:24:25,620 With more powerful hardware now, they can afford some level of overdraw.
389 00:24:25,620 --> 00:24:30,400 Other methods give artists more creative freedom and a much quicker workflow.
390 00:24:30,400 --> 00:24:36,420 BSP often leads to the distinct blocky look that many old games had.
391 00:24:36,420 --> 00:24:41,839 One could certainly argue that these technical limitations are what gave source maps and
392 00:24:41,839 --> 00:24:45,720 early 2000s maps in general their distinct charm.
393 00:24:45,720 --> 00:24:47,079 Their soul.
394 00:24:47,079 --> 00:24:54,079 With stark and distinct architectural choices, some magic is truly lost in busy modern day
395 00:24:54,079 --> 00:24:55,399 maps.
396 00:24:55,399 --> 00:25:01,119 Many new games have actually tried to go back to recreating these older, cleaner, more distinct
397 00:25:01,119 --> 00:25:02,119 visuals.
398 00:25:02,119 --> 00:25:07,639 BSP is still occasionally used today in prototyping levels for games, quickly blocking them out.
399 00:25:07,639 --> 00:25:11,279 It's of course still used in games such as Counter Strike Go.
400 00:25:11,279 --> 00:25:16,000 This was a big video, and naturally it took a bit of research, which I've provided links
401 00:25:16,000 --> 00:25:17,539 to in the description.
402 00:25:17,539 --> 00:25:23,960 If I got anything wrong, please feel free, in fact feel obligated, to call me out in
403 00:25:23,960 --> 00:25:24,960 the comments.
404 00:25:24,960 --> 00:25:31,359 Like, join the Discord server and subscribe with notifications on to join the nerd army
405 00:25:31,359 --> 00:25:33,000 and become a sigma male.
406 00:25:33,000 --> 00:25:34,799 Thanks for watching, goodbye.
407 00:26:03,000 --> 00:26:24,680 Bye.