Monitoring

Stackdriver

It is the tool provided by GCP for monitoring, logging and debugging
It gives insides to application health, performance and availability
Stackdriver Logging lets define metrics based on logs. These metrics can be displayed on dashboards
Stackdirver Error Reporting: tracks errors in applications and it can notify us when new errors are detected
Stackdriver Trace: report on application latency and sampling
Stackdriver Debugger: it connects application production data to the source code for inspecting application state
Dynamically discovers cloud resources and application services
We can have deep visibility into our applications in minutes
Provides us access to powerful data an analytics tools
Stackdriver offers services for:
- Monitoring
- Logging
- Error Reporting
- Tracing
- Debugging
Supports several third party integrations

Is at the base os SRE
It dynamically configures monitoring after resources are deployed
Allows us to monitor platform, systems an application metrics
Workspace:
- Is a root entity that holds monitoring and configuration information in Stackdriver Monitoring
- We can have as many workspaces as we want, GCP projects can’t be monitored by more than one workspace
- The first monitored project is the Hosting Project and needs to be specified at the workspace creation
Stackdriver allows us to create custom dashboards and charts based on the monitored data
Alerting policies: we can create alerting policies based on monitored data. We can create notifications based on these alerting policies
Update checks: test the availability of the public services
Monitoring agent: used to access system resources and application services. It can be installed in compute resources and application services

Allows us to store, search, analyze and alert on log data on events from GCP and AWS
Logging includes storage for logs, an user interface Logs Viewer and an API to manage logs programmatically
Logs are retained for 30 days, we can export logs into Cloud Storage, BigQuery and Cloud Pub/Sub
Logging Agent: can be installed on VM instances for gathering logs

It is a distributed tracing system that collects latency data from application systems and generates in-depth latency reports
Can collect data from App Engine, Google HTTP(S) load balances and applications implementing Cloud Trace SDKs

Let’s us inspect the state of a running application in real-time without stopping it or slowing it down significantly
It can capture debug snapshots, call stack and local variables of a running application