incident-lifecycle-model by preed

A lifecycle model for describing incident management

updated at March 10, 2024, 12:03 p.m.

2 +0

31 +0

7 +0

GitHub
oncall-handbook by alicegoldfuss

Tips and tricks for getting through on-call

updated at March 23, 2024, 5:56 a.m.

12 +0

396 +0

43 +0

GitHub
SRE-cheat-sheet by shibumi

A vocabulary collection for SREs

updated at May 3, 2024, 1:26 p.m.

11 +0

183 +0

29 +0

GitHub
run-book-template by SkeltonThatcher

Run Book / Operations Manual template for modern software systems

updated at May 5, 2024, 7:54 p.m.

38 +0

698 +1

345 +1

GitHub
postmortem-templates by dastergon

A collection of postmortem templates

updated at May 10, 2024, 10:11 a.m.

36 +1

1,231 +1

414 +1

GitHub
awesome-sre-tools by SquadcastHub

A curated list of Site Reliability and Production Engineering Tools

updated at May 10, 2024, 5:43 p.m.

37 +0

1,126 +11

159 +3

GitHub
awesome-ci by ligurio

List of Continuous Integration services

updated at May 11, 2024, 1:09 a.m.

130 -1

3,510 +9

257 +1

GitHub
awesome-chaos-engineering by dastergon

A curated list of Chaos Engineering resources.

updated at May 11, 2024, 9:53 a.m.

311 +0

5,812 +16

639 +1

GitHub
post-mortems by danluu

A collection of postmortems. Sorry for the delay in merging PRs!

updated at May 11, 2024, 9:07 p.m.

558 +0

11,106 +7

432 +1

GitHub