AWS AppSync - can we fetch the schema from S3?

The aws_appsync_graphql_api resource for AWS AppSync has a schema attribute for the GraphQL schema, which can be set either as a multi-line heredoc or by loading a file with the file function. I’m currently using the file function, but because I’m using Terraform Cloud, my GraphQL schema file has to be committed to my Terraform repository rather than to the code repository where it belongs: it’s application code, not configuration.

Is there a way to specify that the schema should be loaded from S3? That’s what I do with my Lambda functions, as aws_lambda_function allows you to specify which aws_s3_bucket_object you want to load your function from, as an alternative to using a straight file reference. However, aws_lambda_function does that through explicit s3_bucket, s3_key and s3_object_version attributes, which are not available in aws_appsync_graphql_api.

Hi @bengiddins,

I’m not familiar with AWS AppSync, but from looking at the aws_appsync_graphql_api docs it looks like the schema is in a text-based format, which means we can read it into a Terraform string using a data source and then assign it to the aws_appsync_graphql_api resource:

data "aws_s3_bucket_object" "schema" {
  bucket = "example"
  key    = "example"
}

resource "aws_appsync_graphql_api" "example" {
  authentication_type = "AWS_IAM"
  name                = "example"

  schema = data.aws_s3_bucket_object.schema.body
}

The aws_s3_bucket_object data source only populates body when the S3 object has a human-readable content type (text/* or application/json), because Terraform strings cannot represent arbitrary binary data (they are sequences of Unicode characters, not sequences of bytes).

The aws_lambda_function resource has built-in support for reading from S3 mainly because the underlying AWS Lambda API also has that built-in support: when you use those options, it is the remote AWS Lambda service requesting the data from S3, not Terraform itself.
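For comparison, here is a minimal sketch of the Lambda case, where the S3 location is passed through to the Lambda service rather than read by Terraform. The bucket, key, function name, handler, runtime and the lambda_role_arn variable are all hypothetical placeholders, not taken from the question:

```hcl
# Hypothetical input: ARN of an execution role defined elsewhere.
variable "lambda_role_arn" {
  type = string
}

# Look up the uploaded package only to pin its version ID
# (populated when bucket versioning is enabled).
data "aws_s3_bucket_object" "package" {
  bucket = "example"
  key    = "lambda/example.zip"
}

resource "aws_lambda_function" "example" {
  function_name = "example"
  role          = var.lambda_role_arn
  handler       = "index.handler"
  runtime       = "nodejs20.x"

  # The Lambda service fetches the package from S3 itself;
  # Terraform only passes these coordinates along.
  s3_bucket         = data.aws_s3_bucket_object.package.bucket
  s3_key            = data.aws_s3_bucket_object.package.key
  s3_object_version = data.aws_s3_bucket_object.package.version_id
}
```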

The underlying AppSync API doesn’t seem to have an equivalent capability, but using Terraform data sources we can approximate capabilities like that by having Terraform itself retrieve needed data and use it to configure some other object. This means that the IAM user/role used by the Terraform AWS provider will need to have access to the S3 object in question, whereas in the Lambda case you’d instead grant the Lambda function’s IAM role access to the S3 object.
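As a rough sketch of that access requirement, the identity Terraform runs as could be granted read access to the schema object with a policy along these lines. The policy name is made up for illustration, the bucket and key reuse the "example" placeholders from the snippet above, and attaching the policy to the actual user or role is left out:

```hcl
# Allow reading the single S3 object that holds the GraphQL schema.
data "aws_iam_policy_document" "read_schema" {
  statement {
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::example/example"]
  }
}

# Hypothetical policy name; attach this to the IAM user/role
# whose credentials the Terraform AWS provider uses.
resource "aws_iam_policy" "read_schema" {
  name   = "terraform-read-appsync-schema"
  policy = data.aws_iam_policy_document.read_schema.json
}
```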

Thanks - great explanation and a likely solution for what I’m trying to achieve.

Yep, worked perfectly.

My build script calls a local Terraform configuration that uploads my Lambda builds to S3. I added the schema in a similar fashion, so it gets uploaded to S3 whenever a change is detected:

resource "aws_s3_bucket_object" "client_schema" {
  bucket       = data.aws_s3_bucket.deployment.bucket
  key          = "appsync/client/schema.graphql"
  source       = "../client-resolver/graphql/schema.graphql"
  content_type = "application/json"
  etag         = filemd5("../client-resolver/graphql/schema.graphql")
}

The configuration in Terraform Cloud picks it up:

data "aws_s3_bucket_object" "graphql_schema" {
  bucket = var.s3_bucket_deployment_bucket
  key    = "appsync/client/schema.graphql"
}

resource "aws_appsync_graphql_api" "client" {
  authentication_type = "AMAZON_COGNITO_USER_POOLS"
  name                = "client"
  #schema              = file("${path.module}/schema.graphql")
  schema = data.aws_s3_bucket_object.graphql_schema.body
}

This eliminates the schema from the configuration repository!