GraphQL API Standards
Goal
The goal of this document is to ensure CRUK GraphQL APIs can be easily and consistently consumed by any client with basic GraphQL support.
Rationale
To provide the smoothest possible experience, all our GraphQL APIs should follow consistent design guidelines so that they are easy and intuitive to use. This document establishes the guidelines CRUK engineers should follow to develop APIs consistently.
Consistency allows teams to leverage common code, patterns, documentation, and design decisions.
Description
This guide aims to achieve the following:
- Define consistent practices and patterns for all API endpoints across CRUK.
- Adhere as closely as possible to accepted GraphQL best practices in the industry.
- Make accessing CRUK Services via GraphQL interfaces easy for all our engineers.
- Allow engineers to leverage the prior work of other teams to implement, test, and document GraphQL endpoints defined consistently.
Summary
Do | Don't |
---|---|
✔️ Use well-structured queries and mutations | ❌ Create unnecessary API endpoints |
✔️ Support queries, mutations, and subscriptions as needed | ❌ Rely on the order of API response data |
✔️ Safely ignore API response data that is not expected/needed | ❌ Use non-standard GraphQL types |
✔️ Use HTTPS | ❌ Redirect HTTP requests to their equivalent HTTPS resource |
✔️ Use standard GraphQL error formats | ❌ Expose internal server details in error messages |
✔️ Use error codes to categorize errors | ❌ Make breaking changes without versioning |
✔️ Return validation errors for invalid input data | ❌ Change the behaviour of a field or mutation unexpectedly |
✔️ Handle server errors gracefully | ❌ Remove or change mutations that clients depend on |
✔️ Implement retry mechanisms for transient errors | ❌ Modify the structure of the response unexpectedly |
✔️ Provide clear upgrade paths for new versions | ❌ Remove fields or types without proper deprecation |
✔️ Perform synchronous validation on requests | ❌ Alter field arguments without versioning |
✔️ Support explicit versioning | ❌ Change field types without versioning |
✔️ Handle partial failures gracefully | ❌ Modify error responses without versioning |
✔️ Use canonical identifiers | |
✔️ Return an error if a non-supported feature is requested | |
✔️ Ensure secure communication over HTTPS | |
✔️ Provide meaningful messages for errors | |
✔️ Use standard HTTP status codes | |
✔️ Use standard request headers | |
Methods
Supported Methods
Operations should use the appropriate GraphQL operation type (query, mutation, or subscription), referred to here as methods, whenever possible.
Below is a list of methods that CRUK services could support. Not all resources will support all methods, but all resources using the methods below should conform to their usage.
Method | Description | Is Idempotent |
---|---|---|
Query | Return the current value of an object | True |
Mutation | Modify an object, or create a named object, when applicable | False |
Subscription | Subscribe to changes in an object | False |
Query Structure
To ensure clarity and maintainability in your GraphQL queries, it is essential to follow these best practices:
- Use Aliases: Rename fields to avoid conflicts and enhance the readability of responses.
- Leverage Fragments: Reuse query parts to reduce duplication and simplify maintenance.
- Limit Scope: Request only the necessary fields to avoid over-fetching and potential performance issues.
- Utilize Variables: Pass dynamic values to improve readability and reusability of your queries and mutations.
- Handle Errors Gracefully: Provide meaningful error messages to ensure a robust and user-friendly experience.
Example Queries
Fetching Activities
This query fetches a list of activities with their IDs, names, and statuses. It uses an alias to rename the result and a fragment to reuse the activity fields.
query {
allActivities: activities {
...ActivityFields
}
}
fragment ActivityFields on Activity {
id
name
status
}
Fetching a Single Activity by ID
This query fetches details of a single activity using its ID. It uses a variable for the ID and a fragment to reuse the activity fields.
query GetActivity($id: ID!) {
activity(id: $id) {
...ActivityDetails
}
}
fragment ActivityDetails on Activity {
id
name
description
startDate
endDate
}
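As a rough illustration of the variables guidance, a client might package a query and its variables as shown below. This is a minimal sketch: `buildGraphQLRequest` is a hypothetical helper introduced for this example, not part of any CRUK library, and it assumes the common convention of sending a JSON body with `query` and `variables` keys.

```javascript
// Hypothetical helper: packages a query document and its variables into the
// JSON body conventionally sent to a GraphQL-over-HTTP endpoint.
function buildGraphQLRequest(query, variables = {}) {
  if (typeof query !== "string" || query.trim() === "") {
    throw new TypeError("query must be a non-empty string");
  }
  return JSON.stringify({ query, variables });
}

const GET_ACTIVITY = `
  query GetActivity($id: ID!) {
    activity(id: $id) {
      id
      name
    }
  }
`;

// The ID travels as a variable, so the query text stays constant and reusable.
const body = buildGraphQLRequest(GET_ACTIVITY, { id: "123" });
```

Keeping dynamic values out of the query text means the same document can be reused for every activity ID.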
Mutations
When designing and implementing mutations in GraphQL APIs, follow these best practices to ensure consistency, reliability, and ease of use:
- Use input types for structured mutation arguments.
- Return relevant data after a mutation.
- Provide clear error messages for invalid inputs or failed operations.
- Validate input data thoroughly.
- Name mutations descriptively (e.g., createActivity, updateActivity).
- Support partial updates to objects.
- Implement authorization checks for mutations.
- Document each mutation comprehensively.
Example Mutations
Creating a New Activity
Creates a new activity with the specified details. It uses variables for the input values.
mutation CreateNewActivity($input: CreateActivityInput!) {
createActivity(input: $input) {
id
name
status
}
}
Variables:
{
"input": {
"name": "New Activity",
"description": "Description of the new activity",
"startDate": "2023-01-01",
"endDate": "2023-01-02"
}
}
Updating an Existing Activity
Updates an existing activity's details.
mutation {
updateActivity(
id: "123"
input: { name: "Updated Activity Name", description: "Updated description" }
) {
id
name
status
}
}
Multiple Identifiers
When possible, our entities should have a single identifier. However, an entity may sometimes have multiple IDs that need to be searchable.
In this case, we suggest the following:
If we want to return a specific entity object given a unique identifier, pass this identifier in the query as previously advised.
For example:
query {
activity(id: "123") {
name
status
}
}
If we instead want to filter our entity objects and return only those that match specific criteria, use field arguments to achieve this.
For example:
query {
activities(status: "active") {
id
name
}
}
This will return all objects that match the criteria.
Versioning
All APIs compliant with these CRUK standards should support explicit versioning. Clients must be able to count on services being stable over time, and services must still be able to add features and make changes.
Versioning Formats
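Services are versioned using only the MAJOR component of the scheme defined by Semantic Versioning. The version is specified in the URL, prefixed with a 'v'. For example, assuming the service is exposed at the hypothetical path /v1/graphql, the query itself carries no version information:
POST /v1/graphql
query {
  activities {
    id
    name
  }
}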
When to Version
Services should increment their version number in response to any breaking API change. See the following section for a detailed discussion of what constitutes a breaking change. Services may increment their version number for non-breaking changes as well, if desired.
Use a new major version number to signal that support for existing clients will be deprecated in the future. When introducing a new major version, services should provide a clear upgrade path for existing clients and develop a plan for deprecation that is consistent with their business group's policies.
Online documentation of versioned services should indicate the current support status of each previous API version and provide a path to the latest version.
Definition of a Breaking Change
A breaking change is any modification to the API that could potentially disrupt existing clients. This includes, but is not limited to:
- Removing a Field or Type
- Changing Field Types
- Renaming Fields or Types
- Changing Field Arguments
- Modifying the Structure of the Response
- Removing or Changing Mutations
- Changing the Behaviour of a Field or Mutation
- Changing Error Responses
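To make a few of the items above concrete, the sketch below compares two schema snapshots and reports removed types, removed fields, and changed field types. The snapshot shape (`{ TypeName: { fieldName: "FieldType" } }`) and the function name `listBreakingChanges` are assumptions made for this example; a real check would work from the actual schema and cover more of the list.

```javascript
// Each schema snapshot is modelled as { TypeName: { fieldName: "FieldType" } }.
// This simplified shape is an assumption made for the example.
function listBreakingChanges(oldSchema, newSchema) {
  const changes = [];
  for (const [typeName, oldFields] of Object.entries(oldSchema)) {
    const newFields = newSchema[typeName];
    if (newFields === undefined) {
      changes.push(`Type removed: ${typeName}`);
      continue;
    }
    for (const [fieldName, oldType] of Object.entries(oldFields)) {
      const newType = newFields[fieldName];
      if (newType === undefined) {
        changes.push(`Field removed: ${typeName}.${fieldName}`);
      } else if (newType !== oldType) {
        // Conservative: any type change is flagged, even ones that may be safe.
        changes.push(
          `Field type changed: ${typeName}.${fieldName} (${oldType} -> ${newType})`,
        );
      }
    }
  }
  return changes;
}

const v1 = { Activity: { id: "ID!", name: "String", status: "String" } };
const v2 = { Activity: { id: "ID!", name: "Int", startDate: "String" } };

// Adding startDate is not reported; changing name and removing status are.
const breaking = listBreakingChanges(v1, v2);
```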
Errors and Faults
Handling errors and faults in CRUK GraphQL APIs is crucial for providing a robust and user-friendly experience. Here are some guidelines for managing errors and faults:
Error Handling
Standard Error Format
All errors should be returned in a standard GraphQL error format. This includes a message field and optionally locations, path, and extensions fields.
Example:
{
"errors": [
{
"message": "Field 'email' is required.",
"locations": [{ "line": 2, "column": 3 }],
"path": ["createUser", "email"]
}
]
}
Error Codes
Use error codes to categorize errors. This helps clients to programmatically handle different types of errors.
Example:
{
"errors": [
{
"message": "Unauthorized access.",
"extensions": {
"code": "UNAUTHORIZED"
}
}
]
}
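On the client side, error codes make it possible to branch on the category of an error rather than parsing its message. The following is a minimal sketch; the action names and the generic fallback are assumptions for illustration only.

```javascript
// Maps extensions.code values to a client action. The action names are
// hypothetical and exist only for this example.
const ACTIONS_BY_CODE = new Map([
  ["UNAUTHORIZED", "redirect-to-login"],
  ["VALIDATION_ERROR", "show-field-errors"],
  ["INTERNAL_SERVER_ERROR", "retry-later"],
]);

// Returns one action per error in the response, falling back to a generic
// action when the code is missing or unknown.
function actionsForResponse(response) {
  const errors = response.errors ?? [];
  return errors.map((error) => {
    const code = error.extensions?.code;
    return ACTIONS_BY_CODE.get(code) ?? "show-generic-error";
  });
}
```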
Validation Errors
Return validation errors for invalid input data. Provide clear messages to help clients understand what went wrong.
Example:
{
"errors": [
{
"message": "Invalid email format.",
"extensions": {
"code": "VALIDATION_ERROR"
}
}
]
}
Server Errors
Handle server errors gracefully and provide meaningful messages. Avoid exposing internal server details.
Example:
{
"errors": [
{
"message": "Internal server error. Please try again later.",
"extensions": {
"code": "INTERNAL_SERVER_ERROR"
}
}
]
}
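One way to avoid leaking internal details is to pass every error through a single masking step before it is returned. The sketch below assumes a hypothetical `toPublicError` helper and an assumed list of codes that are safe to show as-is; everything else is replaced with the generic message used above.

```javascript
// Codes considered safe to return unchanged (an assumed list for this example).
const PUBLIC_CODES = new Set(["UNAUTHORIZED", "VALIDATION_ERROR"]);

function toPublicError(error) {
  const code = error.extensions?.code;
  if (PUBLIC_CODES.has(code)) {
    return { message: error.message, extensions: { code } };
  }
  // Anything else is treated as internal: the original message and stack
  // trace are dropped so that no server details reach the client.
  return {
    message: "Internal server error. Please try again later.",
    extensions: { code: "INTERNAL_SERVER_ERROR" },
  };
}
```

The original error should still be logged server-side so that it remains available for debugging.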
Fault Tolerance
Partial Failures
In cases where part of a query fails, return the successful parts of the response along with error details for the failed parts.
Example:
{
  "data": {
    "activity": {
      "id": "123",
      "name": "Fundraising Event",
      "description": null
    }
  },
  "errors": [
    {
      "message": "Failed to fetch activity's description.",
      "path": ["activity", "description"]
    }
  ]
}
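A client can use the `path` of each error to work out which parts of a partial response are missing while still using the rest of `data`. A minimal sketch, with `fieldsWithErrors` as a hypothetical helper name:

```javascript
// Returns the dotted path of every field that failed, e.g. "activity.description".
// Errors without a path (such as request-level errors) are skipped.
function fieldsWithErrors(response) {
  return (response.errors ?? [])
    .filter((error) => Array.isArray(error.path))
    .map((error) => error.path.join("."));
}

const partialResponse = {
  data: { activity: { id: "123", name: "Fundraising Event", description: null } },
  errors: [
    {
      message: "Failed to fetch activity's description.",
      path: ["activity", "description"],
    },
  ],
};

// The client can still render id and name, and show a placeholder for the
// fields listed here.
const failedFields = fieldsWithErrors(partialResponse);
```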
Retry Mechanism
Implement a retry mechanism for transient errors. Clients should be able to retry failed operations after a certain period.
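A common way to do this is retrying with exponential backoff. The sketch below is one possible shape, not a prescribed implementation: `withRetry`, its option names, and the default delays are assumptions. Because mutations are not idempotent, automatic retries are safest for queries unless the mutation is known to be safe to repeat.

```javascript
// Delay before retry number `attempt` (1-based): baseMs, 2x, 4x, ... capped at maxMs.
function backoffDelayMs(attempt, baseMs = 200, maxMs = 5000) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Runs `operation` until it succeeds, the attempts run out, or the error is
// judged non-transient. `sleep` is injectable so the helper can be tested
// without real delays.
async function withRetry(
  operation,
  {
    maxAttempts = 3,
    isTransient = () => true,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = {},
) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= maxAttempts || !isTransient(error)) {
        throw error;
      }
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```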
Graceful Degradation
Ensure the API can handle high load or partial outages gracefully. Provide fallback responses or degrade functionality to maintain service availability.
By following these guidelines, you can ensure that your GraphQL API handles errors and faults effectively, providing a better experience for clients and users.
Long Running Operations
Handling long-running operations in GraphQL APIs can be challenging due to the synchronous nature of typical GraphQL requests. Here are some strategies to manage long-running processes effectively:
Polling
Clients can repeatedly query the server at intervals to check the status of a long-running operation. This approach is simple but can lead to increased load on the server and network traffic.
Example:
query {
longRunningOperationStatus(id: "operation-id") {
status
result
}
}
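A polling client usually needs an interval and an upper bound so that it does not poll forever. The sketch below assumes a hypothetical `fetchStatus` function that runs the query above and resolves to its `longRunningOperationStatus` object, and it assumes that an unfinished operation reports the status value `"RUNNING"`; neither is defined by this document.

```javascript
// Polls until the operation leaves the assumed "RUNNING" state or the poll
// budget is exhausted. `sleep` is injectable so tests can skip real delays.
async function pollUntilDone(
  fetchStatus,
  {
    intervalMs = 1000,
    maxPolls = 30,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = {},
) {
  for (let poll = 0; poll < maxPolls; poll++) {
    const operation = await fetchStatus();
    if (operation.status !== "RUNNING") {
      return operation;
    }
    await sleep(intervalMs);
  }
  throw new Error(`Operation still running after ${maxPolls} polls`);
}
```

Capping the number of polls and choosing a sensible interval limits the extra server load that polling introduces.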
Subscriptions
Use GraphQL subscriptions to push updates to the client when the status of a long-running operation changes. This approach is more efficient than polling but requires the server to support a persistent connection, typically WebSockets.
Example:
subscription {
longRunningOperationStatus(id: "operation-id") {
status
result
}
}
Async Responses
The server can immediately return a response indicating that the operation has started, along with an operation ID. The client can then use this ID to query the status of the operation.
Example:
mutation {
startLongRunningOperation(input: { ... }) {
operationId
status
}
}
The client can then query the status:
query {
longRunningOperationStatus(id: "operation-id") {
status
result
}
}
Webhooks
The server can call a client-provided webhook URL when the long-running operation completes. This approach offloads the responsibility of checking the status from the client to the server.
Example:
mutation {
startLongRunningOperation(input: { ... }, webhookUrl: "https://client.example.com/webhook") {
operationId
status
}
}
Each of these strategies has its own trade-offs, and the best approach depends on the specific requirements and constraints of your project.
Security
Authentication
Token-Based Authentication
Use token-based authentication mechanisms such as OAuth2 or JWT to secure API access. Ensure tokens are securely stored and transmitted.
Example:
{
"Authorization": "Bearer <token>"
}
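In client code, the header is typically built in one place so that the token is never concatenated by hand at each call site. A minimal sketch, with `buildAuthHeaders` as a hypothetical helper and a JSON content type assumed:

```javascript
// Builds the request headers for an authenticated GraphQL call.
// The token is expected to come from secure storage; it is never logged here.
function buildAuthHeaders(token) {
  if (typeof token !== "string" || token.length === 0) {
    throw new TypeError("A non-empty bearer token is required");
  }
  return {
    "Content-Type": "application/json",
    Authorization: `Bearer ${token}`,
  };
}
```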
Session Management
Implement secure session management practices, including session expiration and renewal mechanisms.
Authorization
Role-Based Access Control (RBAC)
Implement RBAC to restrict access to API operations based on user roles and permissions.
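As an illustration, RBAC can be reduced to a lookup from roles to the permissions they grant. The role names and permission strings below are invented for this sketch and are not CRUK-defined values.

```javascript
// Hypothetical role-to-permission mapping.
const ROLE_PERMISSIONS = new Map([
  ["viewer", new Set(["activity:read"])],
  ["editor", new Set(["activity:read", "activity:write"])],
]);

// True if at least one of the user's roles grants the permission.
// Unknown roles grant nothing.
function hasPermission(roles, permission) {
  return roles.some(
    (role) => ROLE_PERMISSIONS.get(role)?.has(permission) ?? false,
  );
}
```

A resolver or middleware could call `hasPermission(context.user.roles, "activity:write")` before running a mutation, assuming the user's roles are available on the request context.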
Field-Level Authorization
Ensure that sensitive fields are only accessible to authorized users. Implement checks at the resolver level to enforce this.
Example:
const resolvers = {
  Query: {
    sensitiveData: (parent, args, context) => {
      // Check the caller's permission before touching the data source.
      if (!context.user || !context.user.hasPermission("VIEW_SENSITIVE_DATA")) {
        throw new Error("Unauthorized");
      }
      return getSensitiveData();
    },
  },
};
Input Validation
Sanitize Inputs
Sanitize all inputs to prevent injection attacks. Use libraries or frameworks that provide built-in sanitization.
Validate Inputs
Perform thorough validation of all input data to ensure it meets expected formats and constraints.
Example:
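const { GraphQLNonNull, GraphQLString } = require("graphql");

// UserType, isValidEmail and createUser are assumed to be defined elsewhere.
const createUserField = {
  type: UserType,
  args: {
    email: { type: new GraphQLNonNull(GraphQLString) },
    password: { type: new GraphQLNonNull(GraphQLString) },
  },
  resolve: (parent, args) => {
    // Reject malformed input before any work is done.
    if (!isValidEmail(args.email)) {
      throw new Error("Invalid email format");
    }
    // The field config is named createUserField so that this call reaches the
    // createUser function rather than the config object itself.
    return createUser(args);
  },
};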
Rate Limiting
Protect APIs from abuse by implementing rate limiting. Limit the number of requests a client can make within a specified time frame.
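A fixed-window counter is one of the simplest ways to do this. The sketch below keeps counts in memory and is for illustration only: `createRateLimiter` and its options are hypothetical, `limit` is assumed to be at least 1, and a service running on several instances would need shared storage instead of a local `Map`.

```javascript
// Fixed-window rate limiter. `now` is injectable so the behaviour can be
// tested with a fake clock.
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const windows = new Map(); // clientId -> { start, count }

  return function allow(clientId) {
    const currentTime = now();
    const window = windows.get(clientId);

    // Start a fresh window on the first request or once the old one expires.
    if (!window || currentTime - window.start >= windowMs) {
      windows.set(clientId, { start: currentTime, count: 1 });
      return true;
    }
    if (window.count < limit) {
      window.count += 1;
      return true;
    }
    return false; // Over the limit for this window.
  };
}
```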
Logging and Monitoring
Log all security-related events, including authentication attempts, authorization failures, and input validation errors.
Continuously monitor logs for suspicious activities and set up alerts for potential security incidents.
Secure Communication
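Ensure all communication between clients and servers is encrypted using HTTPS. Do not redirect HTTP requests to their equivalent HTTPS resource; reject them instead, so that clients do not come to rely on sending data over an unencrypted connection.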
If using subscriptions, ensure WebSocket connections are secured with WSS (WebSocket Secure).
Further Guidance
Client Guidance
To ensure the best possible experience for clients talking to a GraphQL service, clients should adhere to the following best practices:
Ignore Rule
For loosely coupled clients where the exact shape of the data is not known before the call, if the server returns something the client wasn't expecting, the client should safely ignore it.
Some services may add fields to responses without changing version numbers. Services that do so should make this clear in their documentation, and clients should ignore unknown fields.
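For a loosely coupled client, one way to apply the ignore rule is to copy only the fields it knows about and drop the rest. A minimal sketch, with `pickKnownFields` as a hypothetical helper:

```javascript
// Copies only the fields the client understands; anything else the server
// sends is silently ignored. Missing fields are simply left out.
function pickKnownFields(payload, knownFields) {
  const result = {};
  for (const field of knownFields) {
    if (Object.prototype.hasOwnProperty.call(payload, field)) {
      result[field] = payload[field];
    }
  }
  return result;
}

// "priority" is an invented extra field standing in for a later server addition.
const activity = pickKnownFields(
  { id: "123", name: "Fundraising Event", priority: "high" },
  ["id", "name", "status"],
);
```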
Silent Fail Rule
Clients should handle errors gracefully and provide meaningful feedback to users. If a client encounters an error, it should:
- Log the error for debugging purposes.
- Display a user-friendly message indicating that an error occurred.
Example:
fetchGraphQL(query, variables)
  .then((response) => {
    if (response.errors) {
      console.error("GraphQL errors:", response.errors);
      alert(
        "An error occurred while processing your request. Please try again later.",
      );
    } else {
      // Process the response data
    }
  })
  .catch((error) => {
    console.error("Network error:", error);
    alert(
      "A network error occurred. Please check your connection and try again.",
    );
  });
Pre-Flighting
For CRUK GraphQL APIs, services should perform as much synchronous validation as practical on requests. This means that services should prioritize returning errors in a synchronous manner, ensuring that only valid operations proceed to be processed as long-running operations. This approach helps to quickly identify and reject invalid requests, improving the overall efficiency and reliability of the API.
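The sketch below shows the idea: input is validated synchronously, and only valid requests are handed to the long-running machinery. The field names follow the activity examples earlier in this document, while `validateActivityInput`, `startOperation`, and the `enqueue` callback are hypothetical names introduced for illustration.

```javascript
// Returns a list of human-readable problems; an empty list means valid input.
function validateActivityInput(input) {
  const problems = [];
  if (typeof input.name !== "string" || input.name.trim() === "") {
    problems.push("name is required");
  }
  const start = Date.parse(input.startDate);
  const end = Date.parse(input.endDate);
  if (Number.isNaN(start)) problems.push("startDate is not a valid date");
  if (Number.isNaN(end)) problems.push("endDate is not a valid date");
  if (!Number.isNaN(start) && !Number.isNaN(end) && end < start) {
    problems.push("endDate must not be before startDate");
  }
  return problems;
}

// Rejects invalid requests immediately; `enqueue` (assumed to return an
// operation ID) is only called once validation has passed.
function startOperation(input, enqueue) {
  const problems = validateActivityInput(input);
  if (problems.length > 0) {
    return {
      accepted: false,
      errors: problems.map((message) => ({
        message,
        extensions: { code: "VALIDATION_ERROR" },
      })),
    };
  }
  return { accepted: true, operationId: enqueue(input) };
}
```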
Unsupported Requests
GraphQL API clients may request functionality that is currently unsupported. CRUK GraphQL APIs should respond to valid but unsupported requests in a manner consistent with this section.
Essential Guidance
CRUK GraphQL APIs will often choose to limit functionality that can be performed by clients. For instance, auditing systems allow records to be created but not modified or deleted. Similarly, some APIs will expose collections but require or otherwise limit filtering and ordering criteria, or may not support client-driven pagination.
Feature Allow List
If a GraphQL API does not support any of the following features, an error response should be provided if the feature is requested by a caller. The features are:
- Filtering a collection by a property value
- Filtering a collection by range
- Client-driven pagination via offset and limit
- Sorting by order
- Subscriptions to real-time updates
- Batch requests for multiple queries or mutations
- Aliases for fields or operations
- Fragments for reusable query parts
- Inline arguments for fields
- Custom directives for additional functionality
Error Response
Services should provide an error response if a caller requests an unsupported feature found in the feature allow list. The error response should be a GraphQL error object indicating that the request cannot be fulfilled. Services should include enough detail in the response message for a developer to determine exactly what portion of the request is not supported.
Example:
{
  "errors": [
    {
      "message": "Sorting is not supported: the 'orderDirection' argument cannot be used here.",
      "locations": [{ "line": 2, "column": 3 }],
      "path": ["orderDirection"]
    }
  ]
}
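A service could centralise this check so that every unsupported feature produces the same kind of error. In the sketch below, the set of supported features, the `unsupportedFeatureError` helper, and the `UNSUPPORTED_FEATURE` code are all assumptions made for illustration.

```javascript
// Features this hypothetical service supports.
const SUPPORTED_FEATURES = new Set(["filtering", "pagination"]);

// Returns null when the feature is supported, otherwise a GraphQL-style error
// object naming the feature and where in the request it was used.
function unsupportedFeatureError(feature, path) {
  if (SUPPORTED_FEATURES.has(feature)) {
    return null;
  }
  return {
    message: `The '${feature}' feature is not supported on '${path.join(".")}'.`,
    path,
    extensions: { code: "UNSUPPORTED_FEATURE" },
  };
}
```

Naming both the feature and the path gives a developer enough detail to see exactly which part of the request was rejected.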