How to Install GCP Ops Agent
Introduction to Google Cloud Monitoring
Google Cloud Monitoring is part of Google Cloud’s Operations Suite, providing visibility into the performance, uptime, and overall health of your cloud infrastructure, applications, and services. It collects and analyzes metrics, logs, and traces, enabling you to detect, troubleshoot, and optimize your cloud environment.

Key Features of Google Cloud Monitoring
- Real-Time Observability: Monitor CPU, memory, disk, and network usage across Compute Engine instances.
- Custom Dashboards: Visualize key performance metrics in one place.
- Alerting: Set up automated alerts to stay ahead of performance issues.
- Log Aggregation: Collect and analyze logs for troubleshooting.
- Hybrid and Multi-Cloud: Monitor AWS, Azure, and on-prem resources.
What is the Ops Agent?
The Ops Agent is a unified agent designed to collect logs and metrics from Compute Engine instances, consolidating the functionality of the legacy monitoring and logging agents.
| Feature | Ops Agent | Legacy Agents |
|---|---|---|
| Unified Logs & Metrics | Yes | No |
| Efficiency | High | Moderate |
| Custom Configuration | Yes | Limited |

How to Install GCP Ops Agent on Compute Instance
Step 1: Preparing for Installation
Ensure that your user account has the following IAM roles:
- roles/logging.admin
- roles/monitoring.admin
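As a rough sketch, the roles above could be granted from the command line with `gcloud projects add-iam-policy-binding`. The helper below only prints the commands so they can be reviewed first. The project ID `my-project`, the account `alice@example.com`, and the function name `print_role_bindings` are hypothetical placeholders, not values from this guide.

```shell
#!/usr/bin/env bash
# Hypothetical placeholders: substitute your own project ID and account.
PROJECT_ID="my-project"
MEMBER="user:alice@example.com"

# Print one `gcloud projects add-iam-policy-binding` command per role.
# Printing instead of executing lets you review the commands before running them.
print_role_bindings() {
  local project="$1" member="$2" role
  shift 2
  for role in "$@"; do
    printf 'gcloud projects add-iam-policy-binding %s --member=%s --role=%s\n' \
      "$project" "$member" "$role"
  done
}

print_role_bindings "$PROJECT_ID" "$MEMBER" roles/logging.admin roles/monitoring.admin
```

The agent itself authenticates as the VM's attached service account, which generally needs roles/logging.logWriter and roles/monitoring.metricWriter. The same function can print those bindings if you pass a `serviceAccount:` member instead.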
Step 2: Connect to the VM Instance
1. In the Google Cloud console, go to Compute Engine → VM instances.
2. Click SSH next to the instance you want to install the Ops Agent on.
Step 3: Install the Ops Agent
For Debian/Ubuntu:
curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash add-google-cloud-ops-agent-repo.sh --also-install
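For RHEL/CentOS (add the package repository with the same script first; otherwise yum cannot find the package):
curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash add-google-cloud-ops-agent-repo.sh
sudo yum install -y google-cloud-ops-agent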
Step 4: Verify Installation
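sudo systemctl status "google-cloud-ops-agent*"
The logging and metrics sub-services should show as active (running). The main google-cloud-ops-agent unit may report active (exited), which is normal.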
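The agent runs its logging and metrics pipelines as separate systemd units, so checking each one can pinpoint which half is failing. This is a minimal sketch: the two unit names are an assumption based on current Linux agent packages, and `report_units` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Unit names are an assumption based on current Linux agent packages.
OPS_AGENT_UNITS="google-cloud-ops-agent-fluent-bit google-cloud-ops-agent-opentelemetry-collector"

# Report each sub-service and return non-zero if any of them is not active.
report_units() {
  local unit failed=0
  for unit in $OPS_AGENT_UNITS; do
    if systemctl is-active --quiet "$unit"; then
      echo "$unit: running"
    else
      echo "$unit: NOT running"
      failed=1
    fi
  done
  return "$failed"
}

report_units || echo "At least one Ops Agent sub-service is not running" >&2
```

Because the function returns a non-zero status when any unit is down, it can also gate a provisioning script or a health probe.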
Configuring the Ops Agent
Edit the configuration file to define log and metric collection:
sudo nano /etc/google-cloud-ops-agent/config.yaml
Example Configuration
logging:
  receivers:
    syslog:
      type: files
      include_paths: [/var/log/syslog]
  service:
    pipelines:
      default_pipeline:
        receivers: [syslog]
metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 60s
  service:
    pipelines:
      default_pipeline:
        receivers: [hostmetrics]
Restart the agent:
sudo systemctl restart google-cloud-ops-agent
Confirm Data in Cloud Monitoring:
- Go to Google Cloud Console > Monitoring > Metrics Explorer.
- Search for VM instance metrics (e.g., CPU, memory).
Installing Ops Agent on AWS EC2 instance
Follow these steps to install the Ops Agent on AWS EC2 instances:
Step 1: Connect to the Instance
Log in to the AWS console, select an EC2 instance, and SSH into it.
Step 2: Install the Agent
curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash add-google-cloud-ops-agent-repo.sh --also-install
Comprehensive Monitoring Options in Google Cloud:
Beyond Ops Agent, Google Cloud Monitoring offers multiple tools and approaches to monitor resources, collect logs, analyze data, and set alerts. These options ensure full observability across cloud and hybrid environments.
1. Metrics Monitoring:
Metrics are essential performance indicators collected from Google Cloud services, third-party applications, and VM instances.
- System Metrics: CPU, memory, disk, network traffic.
- Application Metrics: Custom application performance data.
- GKE (Kubernetes) Metrics: Pod CPU usage, memory consumption, and cluster health.
- Cloud Function Metrics: Execution times, error counts, and invocation rates.
How to Access:
- Google Cloud Console > Monitoring > Metrics Explorer
- Select a resource (VM, database, load balancer), then choose metrics to visualize.
Key Benefits:
- Real-time Data: Monitor metrics in near real-time.
- Custom Metrics: Define and publish custom application metrics using APIs.
- Granular Filters: Use filters and labels to monitor specific instances or regions.
2. Logging with Cloud Logging:
Cloud Logging is integrated with Google Cloud Monitoring to collect, analyze, and manage logs from infrastructure, applications, and services.
Log Sources:
- GCE VMs: System logs, application logs.
- Cloud Run: Service access logs.
- Kubernetes: Pod and node logs.
- Network Services: VPC flow logs, firewall logs.
- Security Logs: IAM activity logs, audit logs.
How to Use:
- Google Cloud Console > Logging > Logs Explorer
- Query logs by severity, resource, or timestamp.
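The same kind of query can be written as a filter string and reused in Logs Explorer or on the command line. The sketch below builds a filter for a resource type and minimum severity and prints a matching `gcloud logging read` command. `build_log_filter` is a hypothetical helper name, and the limit and freshness values are arbitrary.

```shell
#!/usr/bin/env bash
# Build a Logging query language filter.
#   $1 = monitored resource type (e.g. gce_instance)
#   $2 = minimum severity (e.g. WARNING, ERROR)
build_log_filter() {
  printf 'resource.type="%s" AND severity>=%s' "$1" "$2"
}

FILTER="$(build_log_filter gce_instance ERROR)"
echo "Filter: $FILTER"
# The same filter can be passed to the CLI; --freshness limits how far back to look.
echo "gcloud logging read '$FILTER' --limit=20 --freshness=1h"
```

Pasting the printed filter into the Logs Explorer query box should return the same entries as the CLI command.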
Advanced Logging Options:
- Log-Based Alerts: Trigger alerts when specific patterns or errors appear in logs.
- Log Exclusions: Filter out unnecessary logs to reduce storage costs.
- Retention Policies: Customize how long logs are stored (default: 30 days for the _Default bucket).
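As an illustration of the retention option, the sketch below prints the `gcloud logging buckets update` command for the _Default bucket after a basic sanity check on the value. The 1 to 3650 day range is an assumption about the currently allowed limits, and `retention_cmd` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the command that sets retention on the _Default log bucket.
# The 1-3650 day range is an assumption about the currently allowed values.
retention_cmd() {
  local days="$1"
  case "$days" in
    ''|*[!0-9]*) echo "retention must be a whole number of days" >&2; return 1 ;;
  esac
  if [ "$days" -lt 1 ] || [ "$days" -gt 3650 ]; then
    echo "retention must be between 1 and 3650 days" >&2
    return 1
  fi
  printf 'gcloud logging buckets update _Default --location=global --retention-days=%s\n' "$days"
}

retention_cmd 90
```

Longer retention usually increases storage charges, so it is worth pairing a change like this with log exclusions for noisy sources.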
3. Alerting Policies:
Alerts notify you when systems or services operate outside expected parameters.
Types of Alerts:
- Metric-based Alerts: Trigger on CPU, memory, or disk spikes.
- Uptime Checks: Monitor service availability by probing URLs or IPs.
- Log-based Alerts: Detect errors, security breaches, or unusual activity.
- Incident Notifications: Automatically escalate to PagerDuty, Slack, or email.
Configuration:
- Google Cloud Console > Monitoring > Alerting
- Define conditions (thresholds) and notification channels.
Examples:
- Notify when CPU exceeds 80% for 10 minutes.
- Alert on HTTP 500 errors in load balancers.
- Detect downtime for key APIs.
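The first example above (CPU over 80% for 10 minutes) could be expressed as an alerting policy file along these lines. This is a sketch under assumptions: the display names and file location are illustrative, the policy has no notification channels attached, and depending on your gcloud version the `policies` commands may sit under the alpha or beta track.

```shell
#!/usr/bin/env bash
# Write an alerting policy for "CPU above 80% for 10 minutes" to a JSON file.
# The file location and display names are illustrative choices.
POLICY_FILE="${POLICY_FILE:-${TMPDIR:-/tmp}/cpu-alert-policy.json}"

# The quoted 'EOF' keeps the JSON literal (no shell expansion inside).
cat > "$POLICY_FILE" <<'EOF'
{
  "displayName": "High CPU utilization",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "CPU above 80% for 10 minutes",
      "conditionThreshold": {
        "filter": "resource.type = \"gce_instance\" AND metric.type = \"compute.googleapis.com/instance/cpu/utilization\"",
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.8,
        "duration": "600s",
        "aggregations": [
          { "alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_MEAN" }
        ]
      }
    }
  ]
}
EOF

echo "Wrote $POLICY_FILE"
# Print the command instead of running it so the file can be reviewed first.
echo "gcloud alpha monitoring policies create --policy-from-file=$POLICY_FILE"
```

The CPU utilization metric is reported as a fraction, which is why the threshold is 0.8 rather than 80. To receive notifications, add notification channels to the policy in the console or in the file.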
4. Dashboards for Visualization:
Dashboards provide a visual representation of your infrastructure health and performance.
Dashboard Types:
- Pre-configured Dashboards: For Google Cloud services (GKE, VMs, databases).
- Custom Dashboards: Fully customizable to display metrics relevant to your app.
- Multi-cloud Dashboards: Import AWS and on-prem data for hybrid observability.
Setup:
- Google Cloud Console > Monitoring > Dashboards > Create Dashboard.
- Add widgets, select resources, and apply filters.
Widget Examples:
- Line charts of CPU usage over time.
- Pie charts showing resource distribution.
- Tables listing VMs sorted by memory consumption.
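Dashboards can also be defined as JSON and created from the command line, which makes them easier to keep in version control. The sketch below writes a one-widget dashboard with a line chart of CPU utilization and prints the `gcloud monitoring dashboards create` command for it. The display name, widget title, and file location are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Write a minimal dashboard definition containing one CPU line chart.
# The file location, display name, and widget title are illustrative choices.
DASHBOARD_FILE="${DASHBOARD_FILE:-${TMPDIR:-/tmp}/cpu-dashboard.json}"

# The quoted 'EOF' keeps the JSON literal (no shell expansion inside).
cat > "$DASHBOARD_FILE" <<'EOF'
{
  "displayName": "VM CPU overview",
  "gridLayout": {
    "columns": "1",
    "widgets": [
      {
        "title": "CPU utilization by instance",
        "xyChart": {
          "dataSets": [
            {
              "plotType": "LINE",
              "timeSeriesQuery": {
                "timeSeriesFilter": {
                  "filter": "metric.type=\"compute.googleapis.com/instance/cpu/utilization\" resource.type=\"gce_instance\"",
                  "aggregation": { "alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_MEAN" }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
EOF

echo "Wrote $DASHBOARD_FILE"
# Print the command instead of running it so the file can be reviewed first.
echo "gcloud monitoring dashboards create --config-from-file=$DASHBOARD_FILE"
```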
5. Uptime Checks and Synthetic Monitoring:
Uptime checks actively monitor the availability of your services.
Options:
- HTTP(S) Uptime Check: Monitor web services for 200 OK responses.
- TCP Checks: Ensure VMs or databases respond to network traffic.
- ICMP Ping: Verify server response times.
Configuration:
- Google Cloud Console > Monitoring > Uptime Checks.
- Add target URLs or IP addresses.
Integration with Alerting:
Trigger alerts if uptime checks fail or return slow response times.
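Before configuring an uptime check, it can help to confirm by hand that the endpoint returns 200 OK within a reasonable time. The function below is a local approximation of what an HTTP(S) check tests, not the Cloud Monitoring implementation. `http_ok` and the example URL are hypothetical.

```shell
#!/usr/bin/env bash
# Succeed only when the URL answers with HTTP 200 within the timeout (seconds).
http_ok() {
  local url="$1" timeout_s="${2:-10}" code
  # curl prints only the status code; a connection error or timeout fails the check.
  code="$(curl -s -o /dev/null --max-time "$timeout_s" -w '%{http_code}' "$url")" || return 1
  [ "$code" = "200" ]
}

# Example (hypothetical URL):
#   http_ok "https://example.com/healthz" 5 && echo "endpoint is up"
```

A slow endpoint fails this check as well, because the timeout is enforced by `--max-time`.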
6. Service Monitoring for Microservices:
For microservices and distributed systems, Service Monitoring provides automatic dependency mapping and SLO (Service Level Objective) tracking.
Key Features:
- Distributed Tracing: Track requests across multiple services.
- Latency Analysis: Detect service bottlenecks.
- Error Reporting: Identify and group application errors.
Setup:
- Install OpenTelemetry or use Cloud Trace to instrument services.
7. Profiler and Debugger:
- Cloud Profiler: Analyzes the CPU and memory usage of your production applications, identifying code hot spots.
- Cloud Debugger: Allowed you to inspect the state of production applications without stopping or slowing them down. Google shut this service down in May 2023, so it is not available for new setups.
Usage:
- Attach Profiler to JVM, Python, or Go apps.
Conclusion
Google Cloud Monitoring, paired with the Ops Agent, provides robust monitoring capabilities for your infrastructure, whether in GCP or AWS. Installing and configuring the Ops Agent ensures detailed log collection and performance monitoring across environments. Following this guide helps improve visibility, streamline issue detection, and maintain optimal application performance.
