awesome-chaos-engineering by dastergon

A curated list of Chaos Engineering resources.

updated at Dec. 1, 2024, 1:10 p.m.

Unknown languages

308 +1

6,008 +5

649 +0

awesome-ci by ligurio

The list of continuous integration services and tools

updated at Dec. 1, 2024, 12:06 p.m.

Unknown languages

133 -1

3,701 +6

262 +1

awesome-sre-tools by SquadcastHub

A curated list of Site Reliability and Production Engineering Tools

updated at Dec. 1, 2024, 1:17 a.m.

Unknown languages

36 +0

1,239 +8

170 +2

post-mortems by danluu

A collection of postmortems. Sorry for the delay in merging PRs!

updated at Nov. 30, 2024, 4:28 p.m.

Unknown languages

558 +0

11,322 +6

436 +0

og-aws by open-guides

📙 Amazon Web Services — a practical guide

updated at Nov. 30, 2024, 4:02 a.m.

Shell

1,208 +0

35,773 +11

3,880 -2

run-book-template by SkeltonThatcher

Run Book / Operations Manual template for modern software systems

updated at Nov. 29, 2024, 7:13 a.m.

Unknown languages

38 +0

706 +1

343 -2

postmortem-templates by dastergon

A collection of postmortem templates

updated at Nov. 27, 2024, 6:05 p.m.

Unknown languages

37 +0

1,318 +3

421 +0

kubernetes-failure-stories by hjacobs

Compilation of public failure/horror stories related to Kubernetes

updated at Nov. 19, 2024, 6:08 a.m.

HTML

468 +0

6,233 +0

309 +0

SRE-cheat-sheet by shibumi

A vocabulary collection for SREs

updated at Nov. 13, 2024, 2:40 p.m.

Unknown languages

11 +0

203 +0

29 +0

oncall-handbook by alicegoldfuss

Tips and tricks for getting through on-call

updated at Oct. 16, 2024, 8:06 p.m.

Unknown languages

10 +0

401 +0

43 +0

incident-lifecycle-model by preed

A lifecycle model for describing incident management

updated at Sept. 30, 2024, 7:35 p.m.

Unknown languages

3 +0

36 +0

6 +0
