I’m continuing with these articles on Cloud Run REST API that nobody really needs to read. This time, I’m back with a Go code walkthrough that shows how to deploy and manage services in Cloud Run through its Go API client library.
Cloud Run already offers deployment via mainstream methods like the CLI, web UI, IDEs and Terraform. So this article is dedicated to the <0.1% of Cloud Run users (a.k.a. the ones keeping it real) out there who need to use the Cloud Run API with Go. Let’s begin.
You can find the code examples below as a complete program in this repository.
Agenda
- Checking if a Cloud Run service exists
- Deploying a new Cloud Run service
- Waiting for the service to become “ready”
- Making a service public (IAM)
- Releasing a new revision and traffic splitting
- Deleting a service
The package we will mainly use is google.golang.org/api/run/v1.
go get -u google.golang.org/api/run/v1
We will utilize a small utility function that gives us a regional API endpoint for Cloud Run, as the default endpoint run.googleapis.com does not offer the Knative endpoints for service management:
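// client returns a Cloud Run API client bound to the given regional endpoint.
// Needs "context", "fmt", "google.golang.org/api/option" and "google.golang.org/api/run/v1".
func client(region string) (*run.APIService, error) {
	return run.NewService(context.TODO(),
		option.WithEndpoint(fmt.Sprintf("https://%s-run.googleapis.com", region)))
}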
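The rest of this walkthrough builds the same few strings over and over: the regional endpoint, the namespaced service name and the IAM resource name. A minimal sketch that collects them in one place could look like the following. The helper names (regionalEndpoint, serviceName, iamResource) are my own and not part of the client library; the formats are the ones used in the snippets of this article.

```go
package main

import "fmt"

// regionalEndpoint returns the regional Cloud Run endpoint used for the
// Knative-style service management calls.
func regionalEndpoint(region string) string {
	return fmt.Sprintf("https://%s-run.googleapis.com", region)
}

// serviceName builds the resource name passed to the namespaced calls
// (Get, ReplaceService, Delete). The "namespace" is the project.
func serviceName(project, name string) string {
	return fmt.Sprintf("namespaces/%s/services/%s", project, name)
}

// iamResource builds the resource name passed to the IAM calls on the
// global endpoint.
func iamResource(project, region, name string) string {
	return fmt.Sprintf("projects/%s/locations/%s/services/%s", project, region, name)
}

func main() {
	// Hypothetical project, region and service names, for illustration only.
	fmt.Println(regionalEndpoint("us-central1"))
	fmt.Println(serviceName("my-project", "hello"))
	fmt.Println(iamResource("my-project", "us-central1", "hello"))
}
```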
Checking if a service exists
This involves querying the service and looking for a 404 Not Found status code.
func serviceExists(c *run.APIService, region, project, name string) (bool, error) {
	_, err := c.Namespaces.Services.Get(fmt.Sprintf("namespaces/%s/services/%s", project, name)).Do()
	if err == nil {
		return true, nil
	}
	// not all errors indicate service does not exist, look for 404 status code
	v, ok := err.(*googleapi.Error)
	if !ok {
		return false, fmt.Errorf("failed to query service: %w", err)
	}
	if v.Code == http.StatusNotFound {
		return false, nil
	}
	return false, fmt.Errorf("unexpected status code=%d from get service call: %w", v.Code, err)
}
Deploying a new service
First, we need to initialize a Service object. This is the Go representation of the Knative Service YAML manifest you see on Cloud Console. It can have very few fields (e.g. it doesn’t need namespace, as that is inferred by the API).
svc := &run.Service{
	ApiVersion: "serving.knative.dev/v1",
	Kind:       "Service",
	Metadata: &run.ObjectMeta{
		Name: name,
	},
	Spec: &run.ServiceSpec{
		Template: &run.RevisionTemplate{
			Metadata: &run.ObjectMeta{Name: name + "-v1"},
			Spec: &run.RevisionSpec{
				Containers: []*run.Container{
					{
						Image: "gcr.io/google-samples/hello-app:1.0",
					},
				},
			},
		},
	},
}
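To see how this struct lines up with the manifest, here is a rough sketch that builds the same minimal document out of plain maps and prints it as JSON. It does not use the client library at all; minimalService is a helper name I made up, and the field names are the Knative ones that the struct above mirrors.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// minimalService returns the smallest manifest used in this walkthrough
// as generic maps: a name, a revision name and one container image.
func minimalService(name, revision, image string) map[string]any {
	return map[string]any{
		"apiVersion": "serving.knative.dev/v1",
		"kind":       "Service",
		"metadata":   map[string]any{"name": name},
		"spec": map[string]any{
			"template": map[string]any{
				"metadata": map[string]any{"name": revision},
				"spec": map[string]any{
					"containers": []any{
						map[string]any{"image": image},
					},
				},
			},
		},
	}
}

func main() {
	m := minimalService("hello", "hello-v1", "gcr.io/google-samples/hello-app:1.0")
	out, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```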
Then we make an API call that returns a populated Service object, which is not all that useful because it is still missing many fields like status.url.
Note that this call succeeding does not mean the deployed application works fine; it only indicates that the API has accepted the object. Readiness occurs asynchronously.
_, err = c.Namespaces.Services.Create("namespaces/"+project, svc).Do()
// TODO handle err
Waiting for readiness
Knative services that have all their Revisions “ready” and the routes configured to serve traffic are shown in the Kubernetes API as:
status:
  conditions:
  - lastTransitionTime: "2021-04-06..."
    status: "True"
    type: "Ready"
  - lastTransitionTime: "2021-04-06..."
    status: "True"
    type: "RoutesReady"
- We need to wait for both of these Ready and RoutesReady conditions to be True.
- When something fails, status: False will be set along with a message field containing the error.
- We will have status: Unknown (or a missing condition) while the deployment is in progress.
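The three rules above can be sketched as a small pure function, separate from any API calls. The condition type here is a hypothetical stand-in that models only the fields mentioned in the list; evaluate and allReady are names I picked for illustration.

```go
package main

import "fmt"

// condition is a hypothetical stand-in for the condition entries shown
// in the status block above.
type condition struct {
	Type, Status, Message string
}

// evaluate inspects a single condition type:
//   - "True"  -> done
//   - "False" -> error carrying the message
//   - "Unknown" or missing -> not done yet, no error
func evaluate(conds []condition, want string) (bool, error) {
	for _, c := range conds {
		if c.Type != want {
			continue
		}
		switch c.Status {
		case "True":
			return true, nil
		case "False":
			return false, fmt.Errorf("condition %s failed: %s", want, c.Message)
		}
	}
	return false, nil
}

// allReady requires every listed condition type to be "True".
func allReady(conds []condition, types ...string) (bool, error) {
	for _, t := range types {
		ok, err := evaluate(conds, t)
		if err != nil || !ok {
			return false, err
		}
	}
	return true, nil
}

func main() {
	conds := []condition{
		{Type: "Ready", Status: "True"},
		{Type: "RoutesReady", Status: "Unknown"},
	}
	ok, err := allReady(conds, "Ready", "RoutesReady")
	fmt.Println(ok, err) // still in progress: not ready, no error
}
```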
Let’s write a Go function for this that checks the status every few seconds and gives up when the given context is cancelled or times out:
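// waitForReady polls the service every 5 seconds until the given condition is
// "True", fails on "False", and gives up when ctx is done.
// getService is not shown in this article; it presumably wraps the
// c.Namespaces.Services.Get call used in serviceExists.
func waitForReady(ctx context.Context, c *run.APIService, region, project, name, condition string) error {
	t := time.NewTicker(time.Second * 5)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-t.C:
			svc, err := getService(c, region, project, name)
			if err != nil {
				return fmt.Errorf("failed to query service for readiness: %w", err)
			}
			if svc.Status == nil {
				continue // status not populated yet, keep polling
			}
			// cond (not c) so the loop variable does not shadow the API client
			for _, cond := range svc.Status.Conditions {
				if cond.Type != condition {
					continue
				}
				switch cond.Status {
				case "True":
					return nil
				case "False":
					return fmt.Errorf("service could not become %q (status:%s) (reason:%s) %s",
						condition, cond.Status, cond.Reason, cond.Message)
				}
			}
		}
	}
}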
You can use this method like:
err = waitForReady(ctx, c, region, project, name, "Ready")
// TODO handle err
err = waitForReady(ctx, c, region, project, name, "RoutesReady")
// TODO handle err
Configuring access
To make the application publicly accessible, or accessible only to some service accounts, we use the IAM endpoints that are available on the global Run API endpoint:
gc, err := run.NewService(context.TODO())
// TODO handle err
Giving public access to all visitors looks like this:
_, err = gc.Projects.Locations.Services.SetIamPolicy(
	fmt.Sprintf("projects/%s/locations/%s/services/%s", project, region, name),
	&run.SetIamPolicyRequest{
		Policy: &run.Policy{Bindings: []*run.Binding{{
			Members: []string{"allUsers"},
			Role:    "roles/run.invoker",
		}}},
	},
).Do()
// TODO handle err
It might take a few seconds for the IAM changes to take effect, so don’t be surprised if you immediately query the service URL and get an HTTP 403.
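One thing to keep in mind: SetIamPolicy replaces the whole policy, so the call above would drop any bindings the service already had. For a brand new service that is fine. For an existing one, you would probably want to read the current policy first (the library has a GetIamPolicy call next to SetIamPolicy), merge your member in, and write the result back. The merge step could be sketched like this; binding is a hypothetical stand-in mirroring the Role and Members fields of run.Binding, and addMember is my own helper name.

```go
package main

import "fmt"

// binding is a hypothetical stand-in mirroring the Role and Members
// fields of run.Binding.
type binding struct {
	Role    string
	Members []string
}

// addMember grants role to member, reusing the existing binding for that
// role if there is one and skipping members that are already present.
func addMember(bindings []binding, role, member string) []binding {
	for i := range bindings {
		if bindings[i].Role != role {
			continue
		}
		for _, m := range bindings[i].Members {
			if m == member {
				return bindings // already granted, nothing to do
			}
		}
		bindings[i].Members = append(bindings[i].Members, member)
		return bindings
	}
	return append(bindings, binding{Role: role, Members: []string{member}})
}

func main() {
	// Hypothetical existing policy with one service account as invoker.
	existing := []binding{{
		Role:    "roles/run.invoker",
		Members: []string{"serviceAccount:caller@my-project.iam.gserviceaccount.com"},
	}}
	merged := addMember(existing, "roles/run.invoker", "allUsers")
	fmt.Printf("%+v\n", merged)
}
```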
Releasing a new Revision and splitting traffic
To make an update to the deployment, you just need to update the Service and save it with a different Revision name via spec.template.metadata.name.
Cloud Run creates a new Revision under the covers and starts sending all the traffic to the latest ready revision. But here we will do custom traffic splitting.
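Knative expects the percentages in a traffic block to add up to 100. A quick local check before sending the update can save a round trip; here is a minimal sketch. trafficTarget is a hypothetical stand-in with the two fields of run.TrafficTarget used in this article, and validateSplit is a name I made up.

```go
package main

import "fmt"

// trafficTarget is a hypothetical stand-in for run.TrafficTarget,
// modeling only the fields used in this walkthrough.
type trafficTarget struct {
	RevisionName string
	Percent      int64
}

// validateSplit checks that each percentage is within 0..100 and that
// all of them add up to exactly 100.
func validateSplit(targets []trafficTarget) error {
	var total int64
	for _, t := range targets {
		if t.Percent < 0 || t.Percent > 100 {
			return fmt.Errorf("revision %s has invalid percent %d", t.RevisionName, t.Percent)
		}
		total += t.Percent
	}
	if total != 100 {
		return fmt.Errorf("traffic percentages add up to %d, want 100", total)
	}
	return nil
}

func main() {
	split := []trafficTarget{
		{RevisionName: "hello-v1", Percent: 90},
		{RevisionName: "hello-v2", Percent: 10},
	}
	fmt.Println(validateSplit(split))
}
```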
The caveat here is that we need to retrieve the Service from the API first, modify it in memory and then save it. This gives us the optimistic concurrency control built into the API, which prevents the update call from succeeding if somebody else has updated the object since you queried it. So it is ideal to add retries around this, which I omitted here.
svc, err = getService(c, region, project, name)
// TODO handle err
svc.Spec.Template.Metadata.Name = name + "-v2"
ctr := svc.Spec.Template.Spec.Containers[0]
ctr.Image = "gcr.io/google-samples/hello-app:2.0"
ctr.Env = []*run.EnvVar{{Name: "FOO", Value: "bar"}}
if ctr.Resources == nil {
	// guard against a nil pointer in case the API returned no resources block
	ctr.Resources = &run.ResourceRequirements{}
}
ctr.Resources.Limits = map[string]string{
	"cpu":    "2",
	"memory": "1Gi",
}
// let's split traffic as v1=90% v2=10%
svc.Spec.Traffic = []*run.TrafficTarget{{
	RevisionName: name + "-v1",
	Percent:      90,
}, {
	RevisionName: name + "-v2",
	Percent:      10,
}}
_, err = c.Namespaces.Services.ReplaceService(
	fmt.Sprintf("namespaces/%s/services/%s", project, name), svc).Do()
// TODO handle err
// wait for the service to become ready and start serving the route changes
err = waitForReady(ctx, c, region, project, name, "Ready")
// TODO handle err
err = waitForReady(ctx, c, region, project, name, "RoutesReady")
// TODO handle err
Deleting the service
op, err := c.Namespaces.Services.Delete(
	fmt.Sprintf("namespaces/%s/services/%s", project, name)).Do()
// TODO handle err
Here, you can check for op.Status == "Success" to see if the deletion is accepted. The deletion happens asynchronously, and the Service object will eventually disappear from the API. I’m not implementing that here for brevity.
That’s it! The most up-to-date code will be in the repository, as I probably won’t update this article.