Rob Miller
Our EMR scripts currently depend on external hosting (PyPI, RPMs, Scala/SBT, GitHub), which we'd like to bring in-house so we don't have bootstrap failures we can't control.
From https://bugzilla.mozilla.org/show_bug.cgi?id=1373631:
Py4JJavaError Traceback (most recent call last)
in ()
----> 1 serialized_beta_full[1].count()
/usr/lib/spark/python/pyspark/rdd.py in count(self)
1006 3
1007 """
-> 1008 return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
1009...
From https://bugzilla.mozilla.org/show_bug.cgi?id=1373633: Sequence of events:
In Spark: serialized_beta_full[1].saveAsTextFile("s3://net-mozaws-prod-us-west-2-pipeline-analysis/ekr/serialized-beta-full.out")
In Hadoop: hadoop fs -getmerge s3://net-mozaws-prod-us-west-2-pipeline-analysis/ekr/serialized-beta-full.out serialized-beta-full.out
This claims to copy a lot of files, but the result is 0-length.
We want Datadog to send a notification when the number of Sentry exceptions in a day spikes.
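To make "spikes" concrete, here is a minimal sketch of one possible rule: flag a day whose exception count exceeds the trailing mean by a few standard deviations. It is plain Python and does not use the Datadog or Sentry APIs; the function name `is_spike`, the parameters `min_sigma` and `min_count`, and the sample counts are all hypothetical, and the actual alert would presumably be configured as a monitor in Datadog itself.

```python
from statistics import mean, pstdev


def is_spike(history, today, min_sigma=3.0, min_count=10):
    """Return True if today's count is well above the trailing baseline.

    history   -- daily exception counts for the preceding days (assumed input)
    today     -- exception count for the day being checked
    min_sigma -- how many standard deviations above the mean counts as a spike
    min_count -- floor on the threshold so very quiet projects do not alert on noise
    """
    if not history:
        # No baseline yet, so nothing can be called a spike.
        return False
    baseline = mean(history)
    spread = pstdev(history)
    threshold = max(baseline + min_sigma * spread, min_count)
    return today > threshold


# Hypothetical daily counts for the previous week.
daily_counts = [12, 15, 11, 14, 13, 12, 16]
print(is_spike(daily_counts, 90))
print(is_spike(daily_counts, 17))
```

The floor (`min_count`) is a design choice for low-volume projects, where the standard deviation alone would make the rule too sensitive.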
Would be nice to specify a particular product version for which to fetch probes.
Currently, fetching all probes relevant to a particular Glean product involves getting the top-level probes for the product, then extracting the dependencies from the repositories JSON file and separately...
As the probe counts for specific components continue to grow, support for pagination in the probe fetching API will become more valuable.
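As an illustration of what pagination could look like, the sketch below slices an in-memory list of probes by `limit` and `offset` and reports where the next page starts. This is an assumption about one possible design, not the existing probe API: the function name `paginate`, the parameter names, and the shape of the returned dictionary are all hypothetical.

```python
def paginate(probes, limit=100, offset=0):
    """Return one page of probes plus enough metadata to request the next page.

    probes -- full list of probes for a component (assumed to be in memory)
    limit  -- maximum number of probes per page
    offset -- index of the first probe to return
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if offset < 0:
        raise ValueError("offset must be non-negative")

    page = probes[offset:offset + limit]
    end = offset + limit
    # next_offset is None once the last page has been reached.
    next_offset = end if end < len(probes) else None
    return {"probes": page, "total": len(probes), "next_offset": next_offset}


# Hypothetical probe names, fetched two at a time.
all_probes = [f"probe_{i}" for i in range(5)]
offset = 0
while offset is not None:
    result = paginate(all_probes, limit=2, offset=offset)
    print(result["probes"])
    offset = result["next_offset"]
```

Returning `next_offset` alongside the page lets a client loop until it is `None` without having to compute page boundaries itself.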
SooperLooper won't compile against liblo-0.32 (in use by Arch Linux) because the library changed the type of one of the handler function args from void * to lo_message_ *. This change gets...