Sunday, April 7, 2024

Troubleshooting Workloads on GKE for Site Reliability Engineers

Introduction

Site Reliability Engineers (SREs) have a broad set of responsibilities, and managing incidents is an important part of their role. In this lab you will learn how to use the integrated capabilities of Google Cloud's operations suite, which include logging, monitoring, and rich out-of-the-box dashboards.

Lab

Task 1. Navigating Google Kubernetes Engine (GKE) resource pages

  • Click Navigation menu > Kubernetes Engine > Clusters
  • Click cloud-ops-sandbox
  • Click Nodes
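
The same cluster and node details can also be read from Cloud Shell. The sketch below assumes the gcloud CLI and kubectl are installed and authenticated against the lab project. The ZONE value is a placeholder assumption, because this post does not state where cloud-ops-sandbox runs. With DRY_RUN left at its default of 1, the commands are only printed; set DRY_RUN=0 to execute them.

```shell
# Sketch: command-line counterpart of browsing the cluster and its nodes.
CLUSTER="cloud-ops-sandbox"      # cluster name from the steps above
ZONE="${ZONE:-us-central1-a}"    # assumption: replace with your cluster's zone

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

inspect_cluster() {
  # List the clusters in the active project.
  run gcloud container clusters list
  # Fetch kubeconfig credentials so kubectl can talk to the cluster.
  run gcloud container clusters get-credentials "$CLUSTER" --zone "$ZONE"
  # Show the nodes, the same objects listed on the Nodes tab.
  run kubectl get nodes -o wide
}

inspect_cluster
```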

Task 2. Accessing operational data through GKE Dashboards

  • Open Navigation menu > Kubernetes Engine > Services & Ingress
  • Open the frontend-external endpoint
  • Open Cloud Shell and run:
git clone --depth 1 --branch cloudskillsboost_asm https://github.com/GoogleCloudPlatform/cloud-ops-sandbox.git
cd cloud-ops-sandbox/sre-recipes
  • Go to Navigation menu > Kubernetes Engine > Clusters
  • Open the three-dot menu for the cluster, then click Connect
  • Return to Cloud Shell and run:
./sandboxctl sre-recipes restore "recipe3"
  • Go to Navigation menu > Kubernetes Engine > Services & Ingress
  • Click the frontend-external endpoint again
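
To compare the endpoint before and after running the recipe without a browser, a small probe like the one below may help. It is only a sketch: it assumes kubectl is already connected to the cluster, that frontend-external lives in the default namespace, and that the Service exposes an external IP on port 80. The helper names (frontend_ip, probe, classify) are made up for this example.

```shell
# Sketch: look up the frontend-external endpoint and probe it from Cloud Shell.
SERVICE="frontend-external"   # Service name from the steps above
NAMESPACE="default"           # assumption: the sandbox deploys into "default"

# Read the external IP that the load balancer assigned to the Service.
frontend_ip() {
  kubectl get service "$SERVICE" -n "$NAMESPACE" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Print only the HTTP status code returned by the endpoint.
probe() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "http://$1/"
}

# Treat 2xx and 3xx as healthy, anything else (including curl's 000) as failing.
classify() {
  case "$1" in
    2??|3??) echo "healthy" ;;
    *)       echo "failing" ;;
  esac
}

# Only run the live check when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  ip="$(frontend_ip)"
  if [ -z "$ip" ]; then
    echo "no external IP found for $SERVICE"
  else
    code="$(probe "$ip")"
    echo "$SERVICE at $ip returned HTTP $code: $(classify "$code")"
  fi
fi
```

Running it once before and once after the sandboxctl command gives a quick record of whether the frontend's response changed.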

Task 3. Proactive monitoring with logs-based metrics

  • Open Navigation menu > Logging > Logs Explorer
  • In Query results, click +Create metric
  • Enter the following values:
Metric Type: Counter
Log metric name: Error_Rate_SLI
Filter Selection: (Copy and paste the filter below)
resource.labels.cluster_name="cloud-ops-sandbox" AND resource.labels.namespace_name="default" AND resource.type="k8s_container" AND labels.k8s-pod/app="recommendationservice" AND severity>=ERROR
  • Click Create Metric
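
The same counter metric can also be created from Cloud Shell. The sketch below reuses the metric name and filter from the steps above and assumes the gcloud CLI is authenticated against the lab project; the description text is an illustrative assumption. With DRY_RUN left at its default of 1, the commands are only printed; set DRY_RUN=0 to execute them.

```shell
# Sketch: preview the filter, then create the logs-based counter metric.
METRIC="Error_Rate_SLI"
# Filter copied verbatim from the console steps above.
FILTER='resource.labels.cluster_name="cloud-ops-sandbox" AND resource.labels.namespace_name="default" AND resource.type="k8s_container" AND labels.k8s-pod/app="recommendationservice" AND severity>=ERROR'

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_error_metric() {
  # Preview a few matching entries to confirm the filter selects the right logs.
  run gcloud logging read "$FILTER" --limit=5
  # Create a logs-based metric that counts entries matching the filter.
  run gcloud logging metrics create "$METRIC" \
    --description="ERROR and higher log entries from recommendationservice" \
    --log-filter="$FILTER"
}

create_error_metric
```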

Task 4. Creating a SLO

  • Click Navigation menu > Monitoring > Services
  • Select recommendationservice
  • Click Create SLO
  • Select Continue

Task 5. Define an alert on the SLO

  • Open Navigation menu > Monitoring > Services
  • Click recommendationservice
  • Click CREATE SLO ALERT

Closing

Friends of the Learning & Doing blog, that concludes this walkthrough of Troubleshooting Workloads on GKE for Site Reliability Engineers. I hope it is useful. See you in the next post.
