Episode 7: The one with Olga Hall

Notes:

In this episode, we sit down with Olga Hall, Global Head of Amazon Video Availability and Scale, to discuss Chaos Engineering, experimentation, and how to do it at scale. You won’t want to miss this episode as we learn from Olga about availability fairytales and how failure really is inevitable. We also give some inside scoop on what is happening at Chaos Community Day London.

Episode 6: The one with Peter Alvaro

Notes:

In this episode, we sit down with Dr. Peter Alvaro of Disorderly Labs and discuss chaos, failure, academics, and computer science. The topics range from data provenance to fault injection but hinge on how Peter’s LDFI research is impacting the field of chaos engineering. Also, Casey tries to talk Peter into using his research to pull off a bank heist, and of course, James tries to be the voice of reason.

Episode 5: The one with Crystal Hirschorn

Notes:

In this episode, we experience a royal reception with Crystal Hirschorn, the Director of Engineering at Snyk, to talk about chaos engineering and how to put those practices to work in organizations of any size. Crystal shares advice with us on how to get VPs and others on board with chaos engineering, which looks quite different from bringing engineers along on the journey.

Throughout the conversation, you will see how chaos engineering has matured from the early days of Chaos Monkey from Netflix and how it is practiced at some of the largest organizations across the globe. Crystal discusses how resilience engineering can move out of academia and be put into practice in real organizations, and how it helps us understand socio-technical systems and the way they come into play when we write software.

Episode 4: The one with John Allspaw

Notes:

In this episode, we are joined by John Allspaw, the Co-Founder of Adaptive Capacity Labs. John instructs us on how to deal with the bad apples in our organizations, and debates whether or not we can find root cause and human error, whether the person is embodied or not. If you are interested in accident investigations, uncovering truth in complex systems, or using heuristics and correlation to find meaning, then this is the episode for you. Special appearances include Dr. Richard Cook and John’s sister, Sue Allspaw, asking hard-hitting questions that leave John speechless.

Episode 3: The one with Julie Gunderson

Notes:

In this episode, Julie explains how communication fits into your DevOps (and Chaos Engineering) efforts, and she dissects the misuse of ITIL and root cause thinking as challenges to building a learning organization. After hearing her talk “The Psychology of Chaos Engineering” at DevOpsDays Raleigh, we knew she had to come on the show. Julie explains how to approach digital and cloud transformations and delivers some hilarious one-liners, making this a show to remember.

Episode 2: The one with Russ Miles

Notes:

Russ Miles is a prolific writer and speaker on Chaos Engineering, especially well-known as an evangelist of the discipline in his home town of London. James and Casey discuss his chapter in Chaos Engineering: System Resiliency in Practice, which is titled “Open Minds, Open Science, and Open Chaos.” And as if that isn’t a mouthful, we discuss his previously published works and his Chaos Engineering-themed tour of the United States.

Episode 1: The one with Nora Jones

Notes:

Our special guest Nora Jones and our host Casey Rosenthal are the two authors of O’Reilly’s Chaos Engineering: System Resiliency in Practice, the definitive book on the subject. We ask Nora about learning from incidents, how to facilitate a healthy Chaos Engineering program, the path that led to her writing this book with Casey, how her thinking has changed over the years, and her startup Jeli.

Episode 0: The one with Nathan Aschbacher

Notes:

In our series on the book Chaos Engineering: System Resiliency in Practice, we introduce the first contributing author, Nathan Aschbacher, who wrote Chapter 17: “Let’s Get Cyber-Physical.”