  • How to fix a terrible on-call system
    • Published by LeadDev on April 08, 2021
  • Women of Tech Meets: Molly - Lead Site Reliability Engineer
    • Published by Elastic on April 17, 2019
  • How Kenna Security Speeds Up Elasticsearch Indexing at Scale (Part 1)
    • Published by Elastic on February 05, 2019

      Everyone wants their Elasticsearch cluster to index and search faster, but optimizing both at scale can take planning. In 2015, Kenna’s cluster held 500 million documents, a million of which were processed every day. At the time, our poorly configured Elasticsearch cluster was the least stable piece of our infrastructure and could barely keep up as our data size grew. Today, our cluster holds 4 billion documents and we process over 200 million of them a day, with ease. Building a cluster to meet all of our indexing and searching demands was not easy. But with a lot of persistence and a few “OH CRAP!” moments, we got it done and learned a hell of a lot along the way.

  • How Kenna Security Speeds Up Search at Scale using Elasticsearch (Part 2)
    • Published by Elastic on February 07, 2019

      In part one of this blog series, I laid out all the techniques my company, Kenna Security, used to speed up indexing while scaling its cluster. In part two, I want to share some of the techniques we used to speed up search while increasing our document count to over four billion.

  • What if I called FLUSHALL on your Redis instance?
    • Published in Honeybadger’s Level-Up Newsletter on December 04, 2018

      We have a “Dashboard” page where clients can load part of ALL their reports (think hundreds), and when clients started to hit that without the cache, Elasticsearch lit up like a Christmas tree. CPU maxed out on all nodes across the board. In the end, it was a mad scramble to open multiple consoles to re-cache the reports.

      After Kenna’s systems were restored, Molly worked with the development team to identify steps they could take to prevent the same thing from happening in the future. They came up with a creative safeguard for new developers who might not realize that clearing the Rails cache is a destructive action: they made all production application consoles read-only by default.