GraphQL Pagination with Apollo V3

Introduction

When you build a REST or GraphQL API, you can expect the database behind it to hold a great deal of data at some point. And with each new addition, it becomes harder to return all the results in one go.

If Google were to spit out every result related to our searches, we’d keep scrolling forever. Or, in more practical terms, imagine reading a book that has all of its content written on a single page: it would be like looking through a giant scroll.

This is where pagination comes in. Pagination is the act of dividing content into pages. Take a typical book, for example: dividing its contents to fit onto multiple pages is what we mean by pagination. In a web application, the content is the returned data, which is displayed across multiple pages within one web page.

Pagination also includes the logic of preparing and displaying the links to the various pages. In web applications, pagination can be handled client-side or server-side.

GraphQL lets us fetch exactly the fields we need from our data graph. While most of the time this gives us the short response we want, there are cases where the data graph contains tons of data and a list field returns far more items than we need.

With GraphQL, we can paginate and limit the query results to a particular portion. There are different ways to paginate a GraphQL server, and this is where Apollo Client comes in.

The Apollo GraphQL docs explain it best:

“Apollo Client provides flexible cache APIs that help you merge results from a paginated list field, regardless of which pagination strategy your GraphQL server uses. And because you can represent these custom pagination strategies with stateless functions, you can reuse a single function for every list field that uses the same strategy”

So enough about the definitions, let’s look at what we will be building:

A Node.js Apollo GraphQL server with pagination
An Apollo Client in React for sending GraphQL queries to the GraphQL server

Building a Node.js GraphQL Server

Our API folder structure will look like this:

api/
  data/books.json
  resolvers/book.resolver.js
  app.js
  schema.js

Create a new project

Let’s start by creating a new folder for our project. Run the following command to create a directory and then cd into it.

mkdir graphql-pagination

cd graphql-pagination

Use npm (or Yarn) to initialize a new Node.js project.

npm init --yes

Install the dependencies

We need to set up some dependencies for this project. First we’ll install Babel. Babel is a compiler that translates newer JavaScript syntax into an older form that the target runtime understands. Here it lets us write ES module syntax such as import module from 'module' and have Node.js run it as CommonJS code, such as module.exports or var module = require('module').

So we’d do this:

npm i --save @babel/core @babel/node @babel/preset-env babel-plugin-import-graphql
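The install command above pulls in @babel/preset-env, but Babel only applies a preset once a configuration file names it. A minimal sketch of such a file, assuming it is called babel.config.js and sits in the project root (the file name, its location, and the node target are assumptions, not taken from the project repository):

```javascript
// babel.config.js (assumed location: the project root)
module.exports = {
  presets: [
    // Target the Node.js version that runs babel-node.
    ["@babel/preset-env", { targets: { node: "current" } }],
  ],
};
```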

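Then we’ll install the GraphQL dependencies, along with Express (which app.js imports) and nodemon (which the start script uses):

npm i --save graphql apollo-server-express express

npm i --save-dev nodemon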

At this point, we should have a package.json file and a node_modules folder. Let’s continue by creating the file we’ll do our work in.

You can look up the complete server code on GitHub here - https://github.com/antstackio/graphql-pagination/

app.js

This is the entry point of our server. The configuration should look like this:

import { ApolloServer } from "apollo-server-express"
import express from "express"
import { typeDefs, resolvers } from "./schema.js"

const server = new ApolloServer({ typeDefs, resolvers })
const app = express()
server.applyMiddleware({ app })

const port = process.env.PORT || 5000
app.listen({ port }, () => {
  console.log(
    `Server listening at http://localhost:${port}${server.graphqlPath}`
  )
})

Now, to run this server, we’ll add this entry to the scripts section of the package.json file.

"server:start": "nodemon --exec babel-node ./api/app.js"

Nodemon is a utility that saves us from having to restart the server manually each time a change is made. It automatically restarts the server once a change is detected.

With the above set up, we can start the server by running the following on our chosen command line utility:

npm run server:start

schema.js

This file contains the GraphQL schema. The schema describes how we want queries and their results to be shaped. It gives an exact description of the data we can ask for: what fields can we select? What kinds of objects might they return? What fields are available on those sub-objects?

Every GraphQL service defines a set of types which completely describe the set of possible data you can query on that service. Then, when queries come in, they are validated and executed against that schema.

There are several ways to write a schema for pagination, according to the GraphQL documentation. For the scope of this article we’ll use cursor-based pagination. Other techniques include the well-known offset-based pagination, in which the client sends a limit (the number of results to return) and an offset (the number of records to skip).
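To make the limit and offset idea concrete, here is a minimal sketch over an in-memory array. The paginateByOffset helper and the sample data are made up for illustration; a real server would usually pass the same two numbers to its database query instead.

```javascript
// Hypothetical helper: returns one page of `items` using limit and offset.
function paginateByOffset(items, limit, offset) {
  // Skip `offset` records, then take at most `limit` records.
  const page = items.slice(offset, offset + limit);
  return {
    items: page,
    // More records exist if the window ends before the end of the list.
    hasMore: offset + limit < items.length,
  };
}

// Sample data, made up for illustration.
const letters = ["a", "b", "c", "d", "e"];

const firstPage = paginateByOffset(letters, 2, 0);
const lastPage = paginateByOffset(letters, 2, 4);

console.log(firstPage);
console.log(lastPage);
```

The client has to track the offset itself, and every page is addressed by position rather than by item, so records inserted or deleted between requests shift what each page contains.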

Cursor-based pagination

I’ll explain this approach first, using Slack’s user API pagination as an example.

The cursor is the key parameter in this approach. The client receives a value called a cursor with each response. The cursor is a pointer to a specific item, and the client sends it back with the next request. The first request, made without a cursor, maps to a query like this:

SELECT * FROM users
WHERE team_id = %team_id
ORDER BY id DESC
LIMIT %limit

The cursor must be based on a unique, sequential column in the table. The server uses the cursor to fetch the next set of items.

Here’s an example of the response, where next_cursor holds the user id of the next item:

{
    "users": [...],
    "next_cursor": "123456"
}

For example, this cursor could be the id of the first element in the next dataset. Because the results are ordered by id in descending order, the simplified query for the next page is:

select * from users where id <= 123456 order by id desc limit 20;

Furthermore, if the visited page is the last page, then the next_cursor will be empty.

{
    "users": [...],
    "next_cursor": ""
}

The advantage of this approach over offset-based pagination is that, when the cursor column is indexed, the database can seek straight to the cursor and read only the required records (20 records), instead of reading and discarding every skipped row first.

I believe we now have some idea of how the pagination schema should look, so let’s jump into our schema.js file.
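The request and response cycle described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the fetchUsers helper and the sample rows are hypothetical, and an in-memory array stands in for the users table.

```javascript
// Made-up sample rows, already sorted by id in descending order
// (mirrors ORDER BY id DESC in the first SQL query).
const users = [
  { id: 50, name: "eve" },
  { id: 40, name: "dan" },
  { id: 30, name: "cat" },
  { id: 20, name: "bob" },
  { id: 10, name: "amy" },
];

// Hypothetical helper: returns up to `limit` users starting at `cursor`,
// plus the cursor for the next page ("" when there is none).
function fetchUsers(limit, cursor) {
  // Without a cursor, start at the top; otherwise start at the row
  // whose id equals the cursor.
  const start =
    cursor === undefined || cursor === ""
      ? 0
      : users.findIndex((user) => String(user.id) === cursor);
  if (start === -1) {
    throw new Error("cursor not found");
  }
  const page = users.slice(start, start + limit);
  // The next cursor is the id of the first item of the next page.
  const next = users[start + limit];
  return {
    users: page,
    next_cursor: next ? String(next.id) : "",
  };
}

const page1 = fetchUsers(2);
const page2 = fetchUsers(2, page1.next_cursor);
const page3 = fetchUsers(2, page2.next_cursor);

console.log(page1.next_cursor, page2.next_cursor, page3.next_cursor);
```

The empty next_cursor on the third call is the signal that the last page has been reached.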

import { gql } from "apollo-server-express"
import getBooks from "./resolvers/book.resolver"

export const typeDefs = gql`
  type Book {
    title: String
    subtitle: String
    isbn13: String
    price: String
    image: String
    url: String
  }
  type Edge {
    cursor: String
    node: Book
  }
  type PageInfo {
    endCursor: String
    hasNextPage: Boolean
  }
  type Response {
    edges: [Edge]
    pageInfo: PageInfo
  }
  type Query {
    books(first: Int, after: String): Response
  }
  schema {
    query: Query
  }
`

export const resolvers = {
  Query: {
    books: getBooks,
  },
}

We’re using the GraphQL Cursor Connections Specification, and you can learn more in the specification itself. The Connection model provides a standard way of providing cursors and a way of telling the client when more results are available.

You might have noticed that the keywords node, edges, and cursor appear in our schema, which implements the Connections Specification. If we query the server using the format of our pagination schema, we get a response like this:

{
  "data": {
    "books": {
      "pageInfo": {
        "endCursor": "9781680502558",
        "hasNextPage": true
      },
      "edges": [
        {
          "cursor": "1001615902053",
          "node": {
            "title": "The Vue.js Handbook",
            "subtitle": "",
            "price": "$0.00"
          }
        },
        {
          "cursor": "9781484239032",
          "node": {
            "title": "Visual Design of GraphQL Data",
            "subtitle": "A Practical Introduction with Legacy Data and Neo4j",
            "price": "$24.01"
          }
        },
        {
          "cursor": "9781492030713",
          "node": {
            "title": "Learning GraphQL",
            "subtitle": "Declarative Data Fetching for Modern Web Apps",
            "price": "$22.99"
          }
        }
      ]
    }
  }
}

The response tells us it is possible to fetch more data than what was returned. This information is inside pageInfo: endCursor is the cursor of the last record returned, which we pass as the after parameter in our next request, and "hasNextPage": true states that more data is available.

The edges in our response body represent the array of book objects and metadata that match our query. Within each edge we can find a node and a cursor property. The cursor points directly to the item, while the node contains the actual book data.

Now that we know how the pagination schema shapes the results and the format of a query response, let’s look at how queries are handled in the resolvers.
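As an illustration of how a client could use endCursor and hasNextPage together, here is a minimal sketch that keeps requesting pages until none are left. The fetchBooksPage function is a hypothetical, synchronous stand-in for a real (asynchronous) request to the books query, and the catalogue values are made up.

```javascript
// Made-up catalogue; each value stands in for a book's cursor.
const catalogue = ["isbn-1", "isbn-2", "isbn-3", "isbn-4", "isbn-5"];

// Hypothetical stand-in for a request to the server's books query.
// It returns the same shape as the books field in the response above.
function fetchBooksPage({ first, after }) {
  const start = after === undefined ? 0 : catalogue.indexOf(after) + 1;
  const cursors = catalogue.slice(start, start + first);
  return {
    pageInfo: {
      endCursor: cursors[cursors.length - 1],
      hasNextPage: start + first < catalogue.length,
    },
    edges: cursors.map((cursor) => ({
      cursor,
      node: { title: `Book ${cursor}` },
    })),
  };
}

// Collects every node by following endCursor until hasNextPage is false.
function fetchAllBooks(first) {
  const nodes = [];
  let requests = 0;
  let after; // undefined on the first request
  let hasNextPage = true;
  while (hasNextPage) {
    const { pageInfo, edges } = fetchBooksPage({ first, after });
    requests += 1;
    nodes.push(...edges.map((edge) => edge.node));
    after = pageInfo.endCursor;
    hasNextPage = pageInfo.hasNextPage;
  }
  return { nodes, requests };
}

const allBooks = fetchAllBooks(2);
console.log(allBooks.nodes.length, allBooks.requests);
```

The client never computes a position itself: it only echoes back the last cursor it received.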

resolvers/book.resolver.js

When we make requests to our server with the books query, the resolver handles the request. The resolver takes the arguments it is given, uses them to read from our data store (in this case, books.json), and returns the data that matches the query arguments.

type Query {
 books(first: Int, after: String): Response
}

The query arguments are first and after. first specifies the number of books we wish to fetch; after is the cursor, and it specifies the position after which the query should begin picking the required number of books. In simpler terms we can read this as:

Return the first 20 books just after the cursor - 9781484239032
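Written out as an actual request, that sentence could look like the sketch below. The operation name Books and the choice of selected fields are assumptions made for illustration; the JSON body with query and variables keys is the usual shape of a GraphQL request sent over HTTP POST, here meant for the /graphql path that the server logs on startup.

```javascript
// The books query with variables, written as a plain string.
// The operation name "Books" is an arbitrary, illustrative choice.
const BOOKS_QUERY = `
  query Books($first: Int, $after: String) {
    books(first: $first, after: $after) {
      pageInfo {
        endCursor
        hasNextPage
      }
      edges {
        cursor
        node {
          title
          subtitle
          price
        }
      }
    }
  }
`;

// "Return the first 20 books just after the cursor 9781484239032".
const variables = { first: 20, after: "9781484239032" };

// JSON body for a POST request to the GraphQL endpoint.
const requestBody = JSON.stringify({ query: BOOKS_QUERY, variables });
console.log(requestBody);
```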

We had earlier established that a cursor is meant to be unique. Our cursor in this case is the book’s ISBN as stored in books.json.

The International Standard Book Number (ISBN) is a numeric commercial book identifier which is intended to be unique.

Let’s write out the core functionality of our resolver.

let first = 5
if (args.first !== undefined) {
  const min_value = 1
  const max_value = 25
  if (args.first < min_value || args.first > max_value) {
    throw new UserInputError(
      `Invalid limit value (min value: ${min_value}, max: ${max_value})`
    )
  }
  first = args.first
}

Here, we initialise first to 5, which means that even if we do not provide a value for the number of books we wish to query, the resolver will return 5 book records by default. The next lines validate the user input. The resolver accepts a minimum of 1 and a maximum of 25, so if the value provided for first falls outside this range, our resolver throws an error.

let after = 0;
if (args.after !== undefined) {
  // isbn13 is the ISBN field declared on the Book type
  const index = list.findIndex((item) => item.isbn13 === args.after);
  if (index === -1) {
    throw new UserInputError(`Invalid after value: cursor not found.`);
  }

In the next lines of code, we initialise after to zero. Then we validate that the cursor value provided for the after parameter actually matches the cursor of a book in books.json.

  after = index + 1;
  if (after === list.length) {
    throw new UserInputError(
      `Invalid after value: no items after provided cursor.`
    );
  }
}

If the cursor is valid, we go on to check whether there are books after that particular cursor, and if there aren’t, an error is thrown. When the values for first and after are valid, the resolver picks out the books from the list, starting just after the specified cursor. This is how our complete resolver file would look:

// The resolver lives in resolvers/, so the data folder is one level up.
import list from "../data/books.json";
import { UserInputError } from "apollo-server-express";

const getBooks = async (parent, args) => {
  // Page size: defaults to 5 and must be between 1 and 25.
  let first = 5;
  if (args.first !== undefined) {
    const min_value = 1;
    const max_value = 25;
    if (args.first < min_value || args.first > max_value) {
      throw new UserInputError(
        `Invalid limit value (min value: ${min_value}, max: ${max_value})`
      );
    }
    first = args.first;
  }
  // Start index: 0 without a cursor, otherwise the item right after it.
  let after = 0;
  if (args.after !== undefined) {
    // isbn13 is the ISBN field declared on the Book type.
    const index = list.findIndex((item) => item.isbn13 === args.after);
    if (index === -1) {
      throw new UserInputError(`Invalid after value: cursor not found.`);
    }
    after = index + 1;
    if (after === list.length) {
      throw new UserInputError(
        `Invalid after value: no items after provided cursor.`
      );
    }
  }

  const books = list.slice(after, after + first);
  const lastBook = books[books.length - 1];

  return {
    pageInfo: {
      // lastBook is undefined only when books.json is empty.
      endCursor: lastBook ? lastBook.isbn13 : null,
      hasNextPage: after + first < list.length,
    },
    edges: books.map((book) => ({
      cursor: book.isbn13,
      node: book,
    })),
  };
};

export default getBooks;
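To try out the slicing rules without starting the server, the same logic can be run against a small in-memory list. This is a sketch under stated assumptions: pageOfBooks and sampleBooks are hypothetical, a plain Error stands in for UserInputError so the snippet runs without Apollo, the range check on first is left out for brevity, and the records are assumed to carry the isbn13 field from the Book type.

```javascript
// Hypothetical stand-in for books.json: three made-up records.
const sampleBooks = [
  { isbn13: "111", title: "A" },
  { isbn13: "222", title: "B" },
  { isbn13: "333", title: "C" },
];

// Same slicing rules as the resolver, with the list passed in as an argument.
function pageOfBooks(list, { first = 5, after } = {}) {
  let start = 0;
  if (after !== undefined) {
    const index = list.findIndex((item) => item.isbn13 === after);
    if (index === -1) {
      throw new Error("Invalid after value: cursor not found.");
    }
    start = index + 1;
    if (start === list.length) {
      throw new Error("Invalid after value: no items after provided cursor.");
    }
  }
  const books = list.slice(start, start + first);
  return {
    pageInfo: {
      endCursor: books[books.length - 1].isbn13,
      hasNextPage: start + first < list.length,
    },
    edges: books.map((book) => ({ cursor: book.isbn13, node: book })),
  };
}

const firstTwo = pageOfBooks(sampleBooks, { first: 2 });
const rest = pageOfBooks(sampleBooks, {
  first: 2,
  after: firstTwo.pageInfo.endCursor,
});

console.log(firstTwo.pageInfo);
console.log(rest.pageInfo);
```

Passing the cursor of the final record ("333" here) as after throws, which matches the resolver’s choice to treat "nothing after this cursor" as an input error rather than returning an empty page.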

Yay, we are done with the backend and can now move to the frontend side of things. This article is already longer than expected, so we’ll continue with the frontend implementation in part two of this series.

There will be fewer explanations in the section to follow because we will mainly be looking at how to query our server from the frontend.