Generate and Sign JWT with Elixir for Google Cloud

Published July 04, 2019 by Toran Billups

If you prefer to skip the story and see the code, you can find everything on GitHub.

Text to Speech

I was recently writing my first text-to-speech implementation with Google Cloud and found something interesting while reading through the documentation.

    curl -H "Authorization: Bearer $(gcloud auth application-default print-access-token)"
  
The command uses the `gcloud auth application-default print-access-token` command to retrieve an authorization token for the request.

This only works if you've installed the Google Cloud SDK, of course, so I went ahead and installed it to see if I could put together a simple "hello world" example. The first working prototype used `System.cmd` to get an access token with the SDK, just like the curl example from the documentation.

    defmodule VoiceWeb.PageController do
      use VoiceWeb, :controller
    
      def create(conn, _params) do
        url = "https://texttospeech.googleapis.com/v1/text:synthesize"
        token = get_token()
        params = get_params()
        %HTTPoison.Response{body: body} = post(url, token, params)
    
        %{"audioContent" => audio} =
          body
          |> Poison.decode!
    
        json(conn, %{audio: audio})
      end
    
      def post(url, token, body) do
        headers = [
          {"Authorization", "Bearer #{token}"},
          {"Content-Type", "application/json; charset=utf-8"}
        ]
    
        HTTPoison.post!(url, body, headers)
      end
    
      def get_token() do
        # gcloud prints the token followed by a newline, so trim it off
        {token, _} = System.cmd("gcloud", ["auth", "application-default", "print-access-token"])
        String.trim(token)
      end
    
      def get_params() do
        body = %{}
    
        body
        |> Poison.encode!
      end
    end
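
One detail in `get_token` is worth a closer look: `System.cmd` returns an `{output, exit_status}` tuple, and the output of a command line tool usually ends with a newline that should not end up in the `Authorization` header. Here is a minimal sketch of that behavior, using `echo` as a stand-in for `gcloud` (an assumption so the snippet runs on a Unix-like system without the SDK installed):

```elixir
# `echo` stands in for `gcloud auth application-default print-access-token`;
# "fake-token" is a placeholder, not a real access token.
{output, exit_status} = System.cmd("echo", ["fake-token"])

# The raw output still carries the trailing newline printed by the command.
IO.inspect(output)

# String.trim/1 drops the surrounding whitespace, leaving just the token.
token = String.trim(output)
IO.inspect({token, exit_status})
```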
  

Reversing with Charles Proxy

While this worked well enough for my proof of concept, I started to think more about deployment and knew it wasn't ideal to require the command line SDK in production. From my experience reversing APIs for mobile development years ago, I remembered that Charles Proxy could provide further insight by showing any outbound network request from my machine.

After I configured SSL Proxying, it was easy to see the SDK was making a POST request with two parameters and the response contained an access token. The first parameter was a static value for `grant_type`, but the other, labeled `assertion`, was more interesting.
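
To make the shape of that request concrete, here is a rough sketch of the form-encoded body using only `URI.encode_query` from the standard library. The `grant_type` value is the one used later in this post; the `assertion` value is a made-up placeholder rather than a real JWT.

```elixir
params = %{
  "grant_type" => "urn:ietf:params:oauth:grant-type:jwt-bearer",
  # Placeholder only: the real value is a signed JWT.
  "assertion" => "header.claims.signature"
}

# Produces an application/x-www-form-urlencoded string for the POST body.
body = URI.encode_query(params)
IO.puts(body)
```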

Making sense of the JWT

It turns out this `assertion` parameter was a JSON Web Token or JWT for short. To properly generate and sign one ourselves we first need to pull apart the JWT shown in Charles Proxy.

Any number of tools can decode this JWT, so choose from the ecosystem you are most comfortable with and take my recommendations with a grain of salt. I'll share a few of the options I had success with, but at the end of the day deconstructing this JWT is more important than any specific technology used in the process.

When privacy is less important, the JWT website offers an easy way to decode it. If you prefer not to share the details of this JWT and you are familiar with the Node.js ecosystem, you might give the jwt decode package a try. And finally, if you are an Elixir purist, you can decode this token with JOSE.
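
If you would rather not pull in a dependency just to look inside the token, the first two segments of a JWT are URL-safe Base64 without padding. The sketch below is a rough illustration using only the standard library. The `JwtPeek` module name and the sample token are hypothetical, and it returns the raw JSON strings without verifying the signature.

```elixir
defmodule JwtPeek do
  # Returns the raw JSON strings for the header and the claims.
  # This only decodes; it does NOT verify the signature.
  def peek(jwt) do
    [header, claims, _signature] = String.split(jwt, ".")
    {decode_segment(header), decode_segment(claims)}
  end

  defp decode_segment(segment), do: Base.url_decode64!(segment, padding: false)
end

# Build a sample token locally so nothing sensitive is involved.
header = ~s({"alg":"RS256","typ":"JWT"})
claims = ~s({"iss":"service-account@example.com"})

jwt =
  Enum.map_join([header, claims, "fake-signature"], ".", fn part ->
    Base.url_encode64(part, padding: false)
  end)

{decoded_header, decoded_claims} = JwtPeek.peek(jwt)
IO.puts(decoded_header)
IO.puts(decoded_claims)
```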

The JWT header includes 3 different key/value pairs that will be required when we generate a token ourselves in just a few moments. The one dynamic value worth mentioning here is `kid`, which you will find in the JSON file from Google under the key `private_key_id` (more details about this JSON file below).

    { alg: 'RS256',
      kid: 'gybvpalazzkk5pudd3yweme8jrh9ai5dnqsvwx',
      typ: 'JWT' }
  

The JWT body includes 6 different key/value pairs that will be required, but luckily only 1 of these is dynamic. The `iss` value can be found in the JSON file from Google under the key `client_email`.

    { aud: 'https://www.googleapis.com/oauth2/v4/token',
      exp: 1561386062,
      iat: 1561382462,
      iss: '[email protected]',
      jti: 'vwxlt9efnmi2hi2ef38jnkp2e3',
      nbf: 1357323219,
  

For a detailed explanation of each key/value, be sure to read the JWT auth documentation from Google. Also, if you plan to follow this integration yourself, you will need to create a service account with Google to generate the JSON file I reference throughout.

Joken

Now that we know the details of this JWT all that remains is the Elixir code to generate and sign it with our private key. To do the heavy lifting I'll be using the Elixir library Joken.

We start with `create` from the `Joken.Signer` module and explicitly set the first argument to `RS256` because earlier we saw the header had a key `alg` with the value `RS256`. We also set the `pem` using our `private_key`, found in the JSON file from Google, to properly sign the token. The third argument adds `kid` to the JWT header.

    def generate_jwt do
      extra_claims = %{
        "iss" => iss(),
        "scope" => "https://www.googleapis.com/auth/cloud-platform",
        "aud" => "https://www.googleapis.com/oauth2/v4/token"
      }
      signer = Joken.Signer.create("RS256", %{"pem" => pem()}, %{"kid" => kid()})
      VoiceWeb.Auth.Token.generate_and_sign!(extra_claims, signer)
    end
  

Note: the `iss`, `pem` and `kid` functions above each return the value of an associated environment variable. The full source is available on GitHub if you want more detail.
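
For a sense of what the signer does with the private key: RS256 is an RSA (PKCS#1 v1.5) signature over a SHA-256 digest of the Base64URL-encoded header and claims. The following is a rough sketch using Erlang's `:public_key` module, not Joken's actual implementation. The `Rs256Sketch` module, the throwaway key generated on the fly, and the hand-written JSON strings are all assumptions for illustration.

```elixir
defmodule Rs256Sketch do
  # Builds "<header>.<claims>.<signature>" from JSON strings and an RSA private key.
  def sign(header_json, claims_json, private_key) do
    signing_input = encode(header_json) <> "." <> encode(claims_json)
    signature = :public_key.sign(signing_input, :sha256, private_key)
    signing_input <> "." <> encode(signature)
  end

  # Checks the signature against the matching RSA public key.
  def valid?(jwt, public_key) do
    [header, claims, signature] = String.split(jwt, ".")
    raw_signature = Base.url_decode64!(signature, padding: false)
    :public_key.verify(header <> "." <> claims, :sha256, raw_signature, public_key)
  end

  defp encode(data), do: Base.url_encode64(data, padding: false)
end

# Throwaway key pair for the demo; a real service account key would come
# from the `private_key` PEM in the JSON file instead.
private_key = :public_key.generate_key({:rsa, 2048, 65_537})

# The RSAPrivateKey record keeps the modulus and public exponent at positions 2 and 3.
public_key = {:RSAPublicKey, elem(private_key, 2), elem(private_key, 3)}

# Placeholder header and claims, written by hand to avoid a JSON dependency.
header = ~s({"alg":"RS256","kid":"example-key-id","typ":"JWT"})
claims = ~s({"iss":"service-account@example.com"})

jwt = Rs256Sketch.sign(header, claims, private_key)
IO.inspect(Rs256Sketch.valid?(jwt, public_key))
```

Changing a single character in the header or claims segment makes `valid?/2` return `false`, which is the property Google relies on when it receives the `assertion`.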

Next, add a `Token` module and implement the `token_config` function to set `default_exp` to 1 hour (3600 seconds). Without this configuration Joken defaults the expiration to 2 hours, and any request with an expiration greater than 1 hour results in an error from Google.

    defmodule VoiceWeb.Auth.Token do
      use Joken.Config
    
      @impl true
      def token_config do
        default_claims(default_exp: 3600)
      end
    end
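
The one hour figure lines up with the token captured earlier: its `exp` and `iat` claims are exactly 3600 seconds apart. Here is a small sketch of that arithmetic, where `max_lifetime` is a hypothetical name and not something defined by Joken or Google:

```elixir
# Values copied from the decoded JWT body shown earlier in this post.
iat = 1_561_382_462
exp = 1_561_386_062

# Hypothetical name for the one hour limit described above, in seconds.
max_lifetime = 60 * 60
IO.inspect(exp - iat == max_lifetime)

# The same two claims computed for a token issued right now.
now = System.system_time(:second)
fresh_claims = %{"iat" => now, "exp" => now + max_lifetime}
IO.inspect(fresh_claims)
```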
  

With a correctly signed JWT, we can make a request to Google for an access token just like the SDK did previously.

    defmodule VoiceWeb.Auth.Google do
      def get_token() do
        jwt = generate_jwt()
    
        params = %{
          assertion: jwt,
          grant_type: "urn:ietf:params:oauth:grant-type:jwt-bearer"
        }
    
        url = "https://www.googleapis.com/oauth2/v4/token"
        response = post(url, params)
        parse_access_token(response)
      end
    
      def post(url, params) do
        headers = [
          {"Content-Type", "application/x-www-form-urlencoded"}
        ]
    
        HTTPoison.post!(url, URI.encode_query(params), headers)
      end
    
      def parse_access_token(%HTTPoison.Response{body: body}) do
        %{"access_token" => token} =
          body
          |> Poison.decode!()
    
        token
      end
    
      def generate_jwt do
        extra_claims = %{
          "iss" => iss(),
          "scope" => "https://www.googleapis.com/auth/cloud-platform",
          "aud" => "https://www.googleapis.com/oauth2/v4/token"
        }
        signer = Joken.Signer.create("RS256", %{"pem" => pem()}, %{"kid" => kid()})
        VoiceWeb.Auth.Token.generate_and_sign!(extra_claims, signer)
      end
    
      def iss do
        System.get_env("GOOGLE_CLIENT_EMAIL")
      end
    
      def kid do
        System.get_env("GOOGLE_PRIVATE_KEY_ID")
      end
    
      def pem do
        System.get_env("GOOGLE_PRIVATE_KEY")
        |> String.replace(~r/\\n/, "\n")
      end
    end
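
The `String.replace` call in `pem` is presumably there because the environment variable holds the key on a single line, with each line break written as the two characters `\` and `n`. A minimal sketch of that conversion, assuming that storage format and using a made-up key body rather than a real private key:

```elixir
# What the environment variable might contain: literal backslash-n sequences.
raw = "-----BEGIN PRIVATE KEY-----\\nMIIfakekeydata\\n-----END PRIVATE KEY-----\\n"

# Same replacement as in `pem/0`: turn each backslash-n pair into a real newline.
pem = String.replace(raw, ~r/\\n/, "\n")

IO.puts(pem)
```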
  

The last step is to remove `System.cmd` from the controller and use this new module to fetch the access token.

    defmodule VoiceWeb.PageController do
      use VoiceWeb, :controller
    
      alias VoiceWeb.Auth.Google
    
      def create(conn, _params) do
        url = "https://texttospeech.googleapis.com/v1/text:synthesize"
        token = Google.get_token()
        params = get_params()
        %HTTPoison.Response{body: body} = post(url, token, params)
    
        %{"audioContent" => audio} =
          body
          |> Poison.decode!
    
        json(conn, %{audio: audio})
      end
    
      def post(url, token, body) do
        headers = [
          {"Authorization", "Bearer #{token}"},
          {"Content-Type", "application/json; charset=utf-8"}
        ]
    
        HTTPoison.post!(url, body, headers)
      end
    
      def get_params() do
        body = %{}
    
        body
        |> Poison.encode!
      end
    end
  

If you are asking yourself why I hand-crafted a solution instead of using a standard client from Google: in my defense, the text-to-speech website didn't list anything for Elixir specifically. Some time later a friend did point me toward an early beta client that seems to be in the wild, for those interested.

You can find the source code from my adventure on GitHub.

