
GraphQL API Standards

Goal

The goal of this document is to ensure CRUK GraphQL APIs can be easily and consistently consumed by any client with basic GraphQL support.

Rationale

To provide the smoothest possible experience, it's important that all our GraphQL APIs follow consistent design guidelines, so that they are easy and intuitive to use. This document establishes the guidelines CRUK engineers should follow to develop APIs consistently.

Consistency allows teams to leverage common code, patterns, documentation, and design decisions.

Description

This guide aims to achieve the following:

  • Define consistent practices and patterns for all API endpoints across CRUK.
  • Adhere as closely as possible to accepted GraphQL best practices in the industry.
  • Make accessing CRUK Services via GraphQL interfaces easy for all our engineers.
  • Allow engineers to leverage the prior work of other teams to implement, test, and document GraphQL endpoints defined consistently.

Summary

Do | Don't
✔️ Use well-structured queries and mutations | ❌ Create unnecessary API endpoints
✔️ Support queries, mutations, and subscriptions as needed | ❌ Rely on the order of API response data
✔️ Safely ignore API response data that is not expected/needed | ❌ Use non-standard GraphQL types
✔️ Use HTTPS | ❌ Redirect HTTP requests to their equivalent HTTPS resource
✔️ Use standard GraphQL error formats | ❌ Expose internal server details in error messages
✔️ Use error codes to categorize errors | ❌ Make breaking changes without versioning
✔️ Return validation errors for invalid input data | ❌ Change the behaviour of a field or mutation unexpectedly
✔️ Handle server errors gracefully | ❌ Remove or change mutations that clients depend on
✔️ Implement retry mechanisms for transient errors | ❌ Modify the structure of the response unexpectedly
✔️ Provide clear upgrade paths for new versions | ❌ Remove fields or types without proper deprecation
✔️ Perform synchronous validation on requests | ❌ Alter field arguments without versioning
✔️ Support explicit versioning | ❌ Change field types without versioning
✔️ Handle partial failures gracefully | ❌ Modify error responses without versioning
✔️ Use canonical identifiers | ❌ Expose internal server details in error messages
✔️ Return an error if a non-supported feature is requested | ❌ Make breaking changes without versioning
✔️ Ensure secure communication over HTTPS | ❌ Change the behaviour of a field or mutation unexpectedly
✔️ Provide meaningful messages for errors | ❌ Remove or change mutations that clients depend on
✔️ Use standard HTTP status codes | ❌ Modify the structure of the response unexpectedly
✔️ Use standard request headers | ❌ Remove fields or types without proper deprecation

Methods

Supported Methods

Operations should use the proper GraphQL methods whenever possible.

Below is a list of methods that CRUK services could support. Not all resources will support all methods, but all resources using the methods below should conform to their usage.

Method | Description | Is Idempotent
Query | Return the current value of an object | True
Mutation | Modify an object, or create a named object, when applicable | False
Subscription | Subscribe to changes in an object | False

Query Structure

To ensure clarity and maintainability in your GraphQL queries, it is essential to follow these best practices:

  • Use Aliases: Rename fields to avoid conflicts and enhance the readability of responses.
  • Leverage Fragments: Reuse query parts to reduce duplication and simplify maintenance.
  • Limit Scope: Request only the necessary fields to avoid over-fetching and potential performance issues.
  • Utilize Variables: Pass dynamic values to improve readability and reusability of your queries and mutations.
  • Handle Errors Gracefully: Provide meaningful error messages to ensure a robust and user-friendly experience.

Example Queries

Fetching Activities

This query fetches a list of activities with their IDs, names, and statuses. It uses an alias to rename the result and a fragment to reuse the activity fields.

query {
  allActivities: activities {
    ...ActivityFields
  }
}

fragment ActivityFields on Activity {
  id
  name
  status
}

Fetching a Single Activity by ID

This query fetches details of a single activity using its ID. It uses a variable for the ID and a fragment to reuse the activity fields.

query GetActivity($id: ID!) {
  activity(id: $id) {
    ...ActivityDetails
  }
}

fragment ActivityDetails on Activity {
  id
  name
  description
  startDate
  endDate
}
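
As a rough illustration of how a client might send a query like this with variables, the sketch below posts the query and a variables object as JSON. The helper name postGraphQL, its parameters, and the endpoint are assumptions made for this example; they are not part of any CRUK library.

```javascript
// Builds the JSON body that GraphQL servers conventionally accept over HTTP.
function buildRequestBody(query, variables = {}) {
  return JSON.stringify({ query, variables });
}

// Hypothetical helper: `endpoint` and `fetchImpl` are illustrative parameters.
// `fetchImpl` defaults to the global fetch (browsers and Node 18+).
async function postGraphQL(endpoint, query, variables, fetchImpl = fetch) {
  const response = await fetchImpl(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildRequestBody(query, variables),
  });
  // The parsed body contains `data` and, when something failed, `errors`.
  return response.json();
}

const GET_ACTIVITY = `
  query GetActivity($id: ID!) {
    activity(id: $id) {
      id
      name
    }
  }
`;
```

A call such as postGraphQL(endpoint, GET_ACTIVITY, { id: "123" }) keeps the query text constant and moves the dynamic value into the variables object, which is the point of the "Utilize Variables" guideline.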

Mutations

When designing and implementing mutations in GraphQL APIs, follow these best practices to ensure consistency, reliability, and ease of use:

  • Use input types for structured mutation arguments.
  • Return relevant data after a mutation.
  • Provide clear error messages for invalid inputs or failed operations.
  • Validate input data thoroughly.
  • Name mutations descriptively (e.g., createActivity, updateActivity).
  • Support partial updates to objects.
  • Implement authorization checks for mutations.
  • Document each mutation comprehensively.

Example Mutations

Creating a New Activity

Creates a new activity with the specified details. It uses variables for the input values.

mutation CreateNewActivity($input: CreateActivityInput!) {
  createActivity(input: $input) {
    id
    name
    status
  }
}

Variables:

{
  "input": {
    "name": "New Activity",
    "description": "Description of the new activity",
    "startDate": "2023-01-01",
    "endDate": "2023-01-02"
  }
}

Updating an Existing Activity

Updates an existing activity's details.

mutation {
  updateActivity(
    id: "123"
    input: { name: "Updated Activity Name", description: "Updated description" }
  ) {
    id
    name
    status
  }
}

Multiple Identifiers

When possible, our entities should have a single identifier. However, an entity may sometimes have multiple IDs that need to be searchable.

In this case, we suggest the following:

If we want to return a specific entity object given a unique identifier, pass this identifier in the query as previously advised.

For example:

query {
  activity(id: "123") {
    name
    status
  }
}

If we instead want to filter our entity objects and return only those that match specific criteria, pass the criteria as field arguments.

For example:

query {
  activities(status: "active") {
    id
    name
  }
}

This will return all objects that match the criteria.

Versioning

All APIs compliant with the CRUK API standards should support explicit versioning. Clients must be able to count on services being stable over time, and services must be able to add features and make changes.

Versioning Formats

Services are versioned by MAJOR version number, as defined by Semantic Versioning. The version is specified in the URL, prefixed with a 'v'. The query itself does not carry the version:

# Sent to a versioned endpoint such as /v1/graphql (illustrative path)
query {
  activities {
    id
    name
  }
}

When to Version

Services should increment their version number in response to any breaking API change. See the following section for a detailed discussion of what constitutes a breaking change. Services may increment their version number for non-breaking changes as well, if desired.

Use a new major version number to signal that support for existing clients will be deprecated in the future. When introducing a new major version, services should provide a clear upgrade path for existing clients and develop a plan for deprecation that is consistent with their business group's policies.

Online documentation of versioned services should indicate the current support status of each previous API version and provide a path to the latest version.

Definition of a Breaking Change

A breaking change is any modification to the API that could potentially disrupt existing clients. This includes, but is not limited to:

  • Removing a Field or Type
  • Changing Field Types
  • Renaming Fields or Types
  • Changing Field Arguments
  • Modifying the Structure of the Response
  • Removing or Changing Mutations
  • Changing the Behaviour of a Field or Mutation
  • Changing Error Responses
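
Two of the cases in the list above, removed fields and changed field types, can be detected mechanically. The following is a minimal sketch that assumes each schema version has been flattened into a plain object mapping "Type.field" to its type. The function name detectBreakingChanges and the snapshot format are invented for illustration.

```javascript
// Compares two schema snapshots, each a plain object mapping "Type.field"
// to its type, for example { "Activity.name": "String!" }.
function detectBreakingChanges(oldSchema, newSchema) {
  const changes = [];
  for (const [field, oldType] of Object.entries(oldSchema)) {
    if (!Object.prototype.hasOwnProperty.call(newSchema, field)) {
      changes.push({ field, kind: "FIELD_REMOVED" });
    } else if (newSchema[field] !== oldType) {
      changes.push({ field, kind: "TYPE_CHANGED", from: oldType, to: newSchema[field] });
    }
  }
  // Fields that exist only in newSchema are additive and are not reported.
  return changes;
}

const v1 = { "Activity.id": "ID!", "Activity.name": "String!", "Activity.status": "String" };
const v2 = { "Activity.id": "ID!", "Activity.name": "String", "Activity.description": "String" };

console.log(detectBreakingChanges(v1, v2));
```

For these snapshots the sketch reports a type change on Activity.name and the removal of Activity.status, while the new Activity.description field is ignored. A renamed field shows up as a removal, and behavioural changes cannot be detected this way at all, so a check like this complements review rather than replacing it.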

Errors and Faults

Handling errors and faults in CRUK GraphQL APIs is crucial for providing a robust and user-friendly experience. Here are some guidelines for managing errors and faults:

Error Handling

Standard Error Format

All errors should be returned in a standard GraphQL error format. This includes a message field and optionally locations, path, and extensions fields.

Example:

{
  "errors": [
    {
      "message": "Field 'email' is required.",
      "locations": [{ "line": 2, "column": 3 }],
      "path": ["createUser", "email"]
    }
  ]
}

Error Codes

Use error codes to categorize errors. This helps clients to programmatically handle different types of errors.

Example:

{
  "errors": [
    {
      "message": "Unauthorized access.",
      "extensions": {
        "code": "UNAUTHORIZED"
      }
    }
  ]
}
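
On the client side, a code in extensions makes it possible to branch on the error category rather than parse the message text. The sketch below shows one way this could look. The action names are hypothetical, and the codes are the ones used in the examples in this document.

```javascript
// Maps each error in a GraphQL response to a client-side action based on
// extensions.code. Action names are illustrative only.
function classifyErrors(response) {
  const errors = response.errors ?? [];
  return errors.map((error) => {
    const code = error.extensions?.code ?? "UNKNOWN";
    switch (code) {
      case "UNAUTHORIZED":
        return { code, action: "REAUTHENTICATE" };
      case "VALIDATION_ERROR":
        return { code, action: "SHOW_FIELD_MESSAGE" };
      case "INTERNAL_SERVER_ERROR":
        return { code, action: "RETRY_LATER" };
      default:
        // Unknown or missing codes fall back to a generic message.
        return { code, action: "SHOW_GENERIC_MESSAGE" };
    }
  });
}
```

Falling back to a generic action for unknown codes lets a service introduce new codes without breaking existing clients.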

Validation Errors

Return validation errors for invalid input data. Provide clear messages to help clients understand what went wrong.

Example:

{
  "errors": [
    {
      "message": "Invalid email format.",
      "extensions": {
        "code": "VALIDATION_ERROR"
      }
    }
  ]
}

Server Errors

Handle server errors gracefully and provide meaningful messages. Avoid exposing internal server details.

Example:

{
  "errors": [
    {
      "message": "Internal server error. Please try again later.",
      "extensions": {
        "code": "INTERNAL_SERVER_ERROR"
      }
    }
  ]
}
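
One way to avoid leaking internal details is to convert every unexpected exception into a fixed, client-safe error before it leaves the server. The sketch below assumes a logger object with an error method and uses console as a stand-in. Where such a function is wired in depends on the GraphQL server library in use.

```javascript
// Converts an unexpected exception into a client-safe GraphQL error object.
// `logger` is an assumed dependency; console is used as a stand-in.
function toClientError(error, logger = console) {
  // Keep the full detail (message, stack trace) in server-side logs only.
  logger.error("Unhandled resolver error:", error);
  return {
    message: "Internal server error. Please try again later.",
    extensions: { code: "INTERNAL_SERVER_ERROR" },
  };
}
```

The original exception is still logged in full, so nothing is lost for debugging. Only the response sent to the client is reduced to the generic message and code.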

Fault Tolerance

Partial Failures

In cases where part of a query fails, return the successful parts of the response along with error details for the failed parts.

Example:

{
  "data": {
    "activity": {
      "id": "123",
      "name": "Fundraising Event",
      "description": null
    }
  },
  "errors": [
    {
      "message": "Failed to fetch activity's description.",
      "path": ["activity", "description"]
    }
  ]
}
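
A client can use the path of each error to work out which parts of a partial response are missing while still using the rest. The helper below is a small sketch of that idea. Its name and return shape are assumptions made for this example.

```javascript
// Returns the usable data plus a list of dotted field paths that failed.
function summarisePartialResponse(response) {
  const failedPaths = (response.errors ?? [])
    .filter((error) => Array.isArray(error.path))
    .map((error) => error.path.join("."));
  return { data: response.data ?? null, failedPaths };
}

const response = {
  data: { activity: { id: "123", name: "Fundraising Event", description: null } },
  errors: [
    { message: "Failed to fetch activity's description.", path: ["activity", "description"] },
  ],
};

const { data, failedPaths } = summarisePartialResponse(response);
console.log(data.activity.name); // Fundraising Event
console.log(failedPaths); // [ 'activity.description' ]
```

With the failed paths in hand, the client can render the activity's name and show a placeholder only where the description would have appeared.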

Retry Mechanism

Implement a retry mechanism for transient errors. Clients should be able to retry failed operations after a certain period.
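
A common way to implement this on the client is retry with exponential backoff. The sketch below is one possible shape. The option names, the default values, and the isTransient predicate are assumptions for illustration rather than CRUK-mandated settings.

```javascript
// Retries an async operation with exponential backoff.
// All option names and defaults here are illustrative.
async function retryWithBackoff(operation, options = {}) {
  const { maxAttempts = 3, baseDelayMs = 200, isTransient = () => true } = options;
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      // Give up on the last attempt, or when the error is not worth retrying.
      if (attempt === maxAttempts || !isTransient(error)) break;
      // Wait baseDelayMs, then double the wait before each further attempt.
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

Because mutations are not idempotent, automatic retries are safest for queries. Retrying a mutation could apply the same change twice unless the service is designed to detect duplicates.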

Graceful Degradation

Ensure the API can handle high load or partial outages gracefully. Provide fallback responses or degrade functionality to maintain service availability.

By following these guidelines, you can ensure that your GraphQL API handles errors and faults effectively, providing a better experience for clients and users.

Long Running Operations

Handling long-running operations in GraphQL APIs can be challenging because a typical GraphQL request is synchronous: the client waits for the response. Here are some strategies to manage long-running processes effectively:

Polling

Clients can repeatedly query the server at intervals to check the status of a long-running operation. This approach is simple but can lead to increased load on the server and network traffic.

Example:

query {
  longRunningOperationStatus(id: "operation-id") {
    status
    result
  }
}
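
To limit the extra load that polling creates, a client would normally wait between checks and cap the number of attempts. The sketch below assumes the operation reports "PENDING" or "RUNNING" while in progress. Those status values, the option names, and the defaults are all assumptions, since this document does not define them.

```javascript
// Polls `checkStatus` until the operation leaves the in-progress state or
// the attempt budget runs out. Status values and defaults are assumptions.
async function pollUntilDone(checkStatus, options = {}) {
  const { intervalMs = 2000, maxAttempts = 30 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const operation = await checkStatus();
    if (operation.status !== "PENDING" && operation.status !== "RUNNING") {
      return operation;
    }
    if (attempt < maxAttempts) {
      // Pause between checks to limit server load.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Operation still in progress after ${maxAttempts} attempts`);
}
```

Here checkStatus would be a function that runs the status query above and returns the longRunningOperationStatus object from the response.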

Subscriptions

Use GraphQL subscriptions to push updates to the client when the status of a long-running operation changes. This approach is more efficient than polling but typically requires the server to support a persistent connection such as WebSockets.

Example:

subscription {
  longRunningOperationStatus(id: "operation-id") {
    status
    result
  }
}

Async Responses

The server can immediately return a response indicating that the operation has started, along with an operation ID. The client can then use this ID to query the status of the operation.

Example:

mutation {
  startLongRunningOperation(input: { ... }) {
    operationId
    status
  }
}

The client can then query the status:

query {
  longRunningOperationStatus(id: "operation-id") {
    status
    result
  }
}
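
On the server, this pattern amounts to recording the operation, returning its ID immediately, and letting a background worker update the record later. The resolvers below are a minimal sketch under several assumptions: the in-memory Map stands in for a durable store, the status values are invented, and completeOperation is a hypothetical hook called by the worker.

```javascript
const { randomUUID } = require("crypto");

// In-memory stand-in for a durable operation store (an assumption for brevity).
const operations = new Map();

const resolvers = {
  Mutation: {
    // Registers the work and returns immediately with an operation ID.
    startLongRunningOperation: (parent, { input }) => {
      const operation = { operationId: randomUUID(), status: "PENDING", result: null };
      operations.set(operation.operationId, operation);
      // A real service would hand `input` to a background worker here.
      return operation;
    },
  },
  Query: {
    // Lets the client look up progress using the ID returned above.
    longRunningOperationStatus: (parent, { id }) => {
      const operation = operations.get(id);
      if (!operation) throw new Error(`Unknown operation: ${id}`);
      return operation;
    },
  },
};

// Hypothetical hook called by the background worker when the job finishes.
function completeOperation(operationId, result) {
  Object.assign(operations.get(operationId), { status: "SUCCEEDED", result });
}
```

The mutation never waits for the work itself, so the request returns quickly regardless of how long the operation takes.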

Webhooks

The server can call a client-provided webhook URL when the long-running operation completes. This approach offloads the responsibility of checking the status from the client to the server.

Example:

mutation {
  startLongRunningOperation(input: { ... }, webhookUrl: "https://client.example.com/webhook") {
    operationId
    status
  }
}

Each of these strategies has its own trade-offs, and the best approach depends on the specific requirements and constraints of your project.

Security

Authentication

Token-Based Authentication

Use token-based authentication mechanisms such as OAuth2 or JWT to secure API access. Ensure tokens are securely stored and transmitted.

Example:

{
  "Authorization": "Bearer <token>"
}

Session Management

Implement secure session management practices, including session expiration and renewal mechanisms.

Authorization

Role-Based Access Control (RBAC)

Implement RBAC to restrict access to API operations based on user roles and permissions.
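
As a sketch of how RBAC might be enforced in resolvers, the example below maps roles to permissions and wraps a resolver so that it runs only for users holding the required permission. The role names, permission strings, and the shape of context.user are all hypothetical. Real roles and permissions would be defined by the service.

```javascript
// Hypothetical role-to-permission map.
const ROLE_PERMISSIONS = new Map([
  ["viewer", ["activity:read"]],
  ["editor", ["activity:read", "activity:write"]],
]);

// True when any of the user's roles grants the permission.
function hasPermission(user, permission) {
  if (!user) return false;
  return (user.roles ?? []).some((role) =>
    (ROLE_PERMISSIONS.get(role) ?? []).includes(permission),
  );
}

// Wraps a resolver so it runs only when context.user holds the permission.
function requirePermission(permission, resolver) {
  return (parent, args, context, info) => {
    if (!hasPermission(context.user, permission)) {
      throw new Error("Unauthorized");
    }
    return resolver(parent, args, context, info);
  };
}
```

A mutation resolver could then be declared as requirePermission("activity:write", updateActivityResolver), which keeps the access rule next to the operation it protects.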

Field-Level Authorization

Ensure that sensitive fields are only accessible to authorized users. Implement checks at the resolver level to enforce this.

Example:

const resolvers = {
  Query: {
    sensitiveData: (parent, args, context) => {
      if (!context.user || !context.user.hasPermission("VIEW_SENSITIVE_DATA")) {
        throw new Error("Unauthorized");
      }
      return getSensitiveData();
    },
  },
};

Input Validation

Sanitize Inputs

Sanitize all inputs to prevent injection attacks. Use libraries or frameworks that provide built-in sanitization.

Validate Inputs

Perform thorough validation of all input data to ensure it meets expected formats and constraints.

Example:

const { GraphQLNonNull, GraphQLString } = require("graphql");

// UserType, isValidEmail, and saveUser are assumed to be defined elsewhere in the service.
const createUserField = {
  type: UserType,
  args: {
    email: { type: new GraphQLNonNull(GraphQLString) },
    password: { type: new GraphQLNonNull(GraphQLString) },
  },
  resolve: (parent, args) => {
    // Reject malformed input before doing any work.
    if (!isValidEmail(args.email)) {
      throw new Error("Invalid email format");
    }
    // Call the persistence helper, not the field definition itself.
    return saveUser(args);
  },
};

Rate Limiting

Protect APIs from abuse by implementing rate limiting. Limit the number of requests a client can make within a specified time frame.
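
The simplest form of this is a fixed-window counter per client. The sketch below is one possible in-memory version. The limit, the window size, and the use of a client ID as the key are illustrative choices, and the now option exists only so that the clock can be replaced in tests.

```javascript
// Fixed-window rate limiter keyed by client ID. Limit and window size are
// illustrative; `now` is injectable so the behaviour can be tested.
function createRateLimiter(options = {}) {
  const { limit = 100, windowMs = 60000, now = Date.now } = options;
  const windows = new Map(); // clientId -> { windowStart, count }

  return function isAllowed(clientId) {
    const currentTime = now();
    const entry = windows.get(clientId);
    if (!entry || currentTime - entry.windowStart >= windowMs) {
      // First request, or the previous window has expired: start a new one.
      windows.set(clientId, { windowStart: currentTime, count: 1 });
      return true;
    }
    if (entry.count < limit) {
      entry.count += 1;
      return true;
    }
    return false; // over the limit for this window
  };
}
```

An in-memory counter like this only works for a single server instance. A service running several instances would need a shared store, and GraphQL services often also limit by query complexity, since one request can ask for a large amount of data.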

Logging and Monitoring

Log all security-related events, including authentication attempts, authorization failures, and input validation errors.

Continuously monitor logs for suspicious activities and set up alerts for potential security incidents.

Secure Communication

Ensure all communication between clients and servers is encrypted using HTTPS. Do not redirect HTTP requests to their HTTPS equivalent; reject them instead, as stated in the Summary.

If using subscriptions, ensure WebSocket connections are secured with WSS (WebSocket Secure).

Further Guidance

Client Guidance

To ensure the best possible experience for clients talking to a GraphQL service, clients should adhere to the following best practices:

Ignore Rule

For loosely coupled clients where the exact shape of the data is not known before the call, if the server returns something the client wasn't expecting, the client should safely ignore it.

Some services may add fields to responses without changing version numbers. Services that do so should make this clear in their documentation, and clients should ignore unknown fields.
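
For a loosely coupled client, the Ignore Rule can be as simple as copying only the fields the client understands. The helper below is a small sketch. The function name and the "priority" field are invented for the example.

```javascript
// Copies only the fields the client knows about; anything else is dropped silently.
function pickKnownFields(object, knownFields) {
  const result = {};
  for (const field of knownFields) {
    if (Object.prototype.hasOwnProperty.call(object, field)) {
      result[field] = object[field];
    }
  }
  return result;
}

// "priority" stands in for a field the server added after this client shipped.
const activity = pickKnownFields(
  { id: "123", name: "Fundraising Event", priority: "HIGH" },
  ["id", "name", "status"],
);
console.log(activity); // { id: '123', name: 'Fundraising Event' }
```

The unexpected field is dropped without an error, and a known field that happens to be absent is simply left out.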

Silent Fail Rule

Clients should handle errors gracefully and provide meaningful feedback to users. If a client encounters an error, it should:

  • Log the error for debugging purposes.
  • Display a user-friendly message indicating that an error occurred.

Example:

// fetchGraphQL is assumed to resolve to the parsed JSON response body.
fetchGraphQL(query, variables)
  .then((response) => {
    if (response.errors) {
      console.error("GraphQL errors:", response.errors);
      alert(
        "An error occurred while processing your request. Please try again later.",
      );
    } else {
      // Process the response data
    }
  })
  .catch((error) => {
    console.error("Network error:", error);
    alert(
      "A network error occurred. Please check your connection and try again.",
    );
  });

Pre-Flighting

For CRUK GraphQL APIs, services should perform as much synchronous validation as practical on requests. This means that services should prioritize returning errors in a synchronous manner, ensuring that only valid operations proceed to be processed as long-running operations. This approach helps to quickly identify and reject invalid requests, improving the overall efficiency and reliability of the API.

Unsupported Requests

GraphQL API clients may request functionality that is currently unsupported. CRUK GraphQL APIs should respond to valid but unsupported requests in a manner consistent with this section.

Essential Guidance

CRUK GraphQL APIs will often choose to limit functionality that can be performed by clients. For instance, auditing systems allow records to be created but not modified or deleted. Similarly, some APIs will expose collections but require or otherwise limit filtering and ordering criteria, or may not support client-driven pagination.

Feature Allow List

If a GraphQL API does not support one of the following features, it should return an error response when a caller requests that feature:

  • Filtering a collection by a property value
  • Filtering a collection by range
  • Client-driven pagination via offset and limit
  • Sorting by order
  • Subscriptions to real-time updates
  • Batch requests for multiple queries or mutations
  • Aliases for fields or operations
  • Fragments for reusable query parts
  • Inline arguments for fields
  • Custom directives for additional functionality

Error Response

Services should provide an error response if a caller requests an unsupported feature found in the feature allow list. The error response should be a GraphQL error object indicating that the request cannot be fulfilled. Services should include enough detail in the response message for a developer to determine exactly what portion of the request is not supported.

Example:

{
  "errors": [
    {
      "message": "Sorting by 'orderDirection' is not supported on this field.",
      "locations": [{ "line": 2, "column": 3 }],
      "path": ["orderDirection"]
    }
  ]
}
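
A service that accepts an argument in its schema but does not support it could build such errors with a helper like the one sketched below. The function name, the argument list, and the UNSUPPORTED_FEATURE code are assumptions for illustration and are not defined by this document.

```javascript
// Returns GraphQL-style error objects for arguments the field does not support.
// UNSUPPORTED_FEATURE is a hypothetical code.
function findUnsupportedArguments(fieldName, args, supportedArgs) {
  return Object.keys(args)
    .filter((name) => !supportedArgs.includes(name))
    .map((name) => ({
      message: `Argument '${name}' is not supported on field '${fieldName}'.`,
      path: [fieldName],
      extensions: { code: "UNSUPPORTED_FEATURE" },
    }));
}

const errors = findUnsupportedArguments(
  "activities",
  { status: "active", orderDirection: "DESC" },
  ["status"],
);
console.log(JSON.stringify({ errors }, null, 2));
```

Naming both the argument and the field in the message gives a developer enough detail to see exactly which part of the request was rejected.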