Commit 68ce11b

Missing paren
1 parent 8e38b0c commit 68ce11b

File tree

1 file changed: +1 -1 lines changed


content/blog/entries/crawling-500-million/contents.lr

Lines changed: 1 addition & 1 deletion
@@ -201,7 +201,7 @@ The final gotcha in the design of our crawler is that we want to crawl every sin
 
 For instance, imagine that each worker is able to handle 5000 simultaneous crawling tasks, and every one of those tasks is tied to a tiny website with a very low rate limit. That means that our entire worker, which is capable of handling hundreds of crawl and analysis jobs per second, is stuck making one request per second until some faster tasks appear in the queue.
 
-In other words, we need to make sure that each worker process isn't jamming itself up with a single source. We have a [scheduling problem](https://en.wikipedia.org/wiki/Scheduling_(computing). We've naively implemented first-come-first-serve and need to switch to a different scheduling strategy.
+In other words, we need to make sure that each worker process isn't jamming itself up with a single source. We have a [scheduling problem](https://en.wikipedia.org/wiki/Scheduling_(computing)). We've naively implemented first-come-first-serve and need to switch to a different scheduling strategy.
 
 There are innumerable ways to address scheduling problems. Since there are only a few dozen sources in our system, we can get away with using a stupid scheduling algorithm: give each source equal capacity in every worker. In other words, if there are 5000 tasks to distribute and 30 sources, we can allocate 166 simultaneous tasks to each source per worker. That's plenty for our purposes. There are obvious drawbacks of this approach in that eventually there will be so many sources that we start starving high rate limit sources of work. We'll cross that bridge when we come to it; it's better to use the simplest possible approach we can get away with instead of spending all of our time on solving hypothetical future problems.

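The equal-capacity allocation described in the changed paragraph can be sketched roughly as follows. This is a minimal illustration, not the crawler's actual implementation; the function and parameter names are hypothetical.

```python
from collections import deque

def allocate_tasks(pending_by_source, worker_capacity):
    """Give every source an equal share of a worker's task slots.

    pending_by_source: dict mapping source name -> deque of pending tasks.
    worker_capacity: total simultaneous tasks one worker can run (e.g. 5000).
    Returns the list of tasks to start now; a source with a short backlog
    simply leaves its unused slots empty.
    """
    if not pending_by_source:
        return []
    # Equal split across all known sources: 5000 // 30 == 166 per source.
    per_source = worker_capacity // len(pending_by_source)
    scheduled = []
    for source, queue in pending_by_source.items():
        for _ in range(min(per_source, len(queue))):
            scheduled.append(queue.popleft())
    return scheduled
```

Because the split ignores how full each source's queue is, one slow, rate-limited source can never claim more than its fixed share of a worker, which is exactly the jam the paragraph is trying to avoid.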