Mastering GraphQL: How to Enable Arbitrary List Filtering with Sift.js

GraphQL schemas often include list types, and a common requirement is to filter those lists based on input variables. Filtering lets clients retrieve only the data they need, which makes applications more efficient and user-friendly.

There are many libraries and resources available for filtering when using an external datasource that supports querying, such as Prisma in front of a database. However, when writing your own resolvers that return a list of GraphQL objects, it would be beneficial to abstract away the filtering logic and make it reusable across your schema.

Let's consider a simple GraphQL schema for a list of books:

type Book {
  title: String
  price: Float
}

type Query {
  books: [Book]
}

And here is a resolver that returns the books from a simple in-memory array. The data could come from any source.

const books = [
  { title: 'The Great Gatsby', price: 10.99 },
  { title: 'To Kill a Mockingbird', price: 12.99 },
  // more books
];

const resolvers = {
  Query: {
    books: () => books,
  },
};

For our example, let's assume users need to filter books based on the following criteria:

  • What the title starts with

  • Whether the price falls within a range (greater than a minimum, less than a maximum)

How to Define Individual Filters and Logic in GraphQL

One way to implement filters is to define each individually. This involves making changes to the GraphQL schema input types and implementing the filters in the resolver logic.

You could update your schema to include these new input variables, allowing you to express the filters that are allowed and the parameters needed to use them:

input BookFilter {
  titleStartsWith: String
  priceLessThan: Float
  priceGreaterThan: Float
}

type Query {
  books(filter: BookFilter): [Book]
}

An updated resolver could look like this:

const resolvers = {
  Query: {
    books: (_, args) => {
      // `filter` is optional in the schema, so fall back to an empty object.
      const filter = args.filter ?? {};
      return books.filter(book => {
        if (filter.titleStartsWith && !book.title.startsWith(filter.titleStartsWith)) {
          return false;
        }
        // `!= null` skips both omitted (undefined) and explicit null inputs.
        if (filter.priceLessThan != null && book.price >= filter.priceLessThan) {
          return false;
        }
        if (filter.priceGreaterThan != null && book.price <= filter.priceGreaterThan) {
          return false;
        }
        return true;
      });
    },
  },
};

Queries written against this schema are easy to read. You pass a filter argument to the books field and set only the filter inputs you need.
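As a concrete illustration, the sketch below calls the resolver directly with the same arguments a GraphQL server would pass for a filtered query. It assumes the two-book sample data from earlier, and the fallback to an empty filter is an assumption added here so that an unfiltered query still works.

```javascript
// Sample data from earlier in the article.
const books = [
  { title: 'The Great Gatsby', price: 10.99 },
  { title: 'To Kill a Mockingbird', price: 12.99 },
];

const resolvers = {
  Query: {
    // `filter` is optional in the schema, so default it to an empty object.
    books: (_, { filter } = {}) => {
      const f = filter ?? {};
      return books.filter((book) => {
        if (f.titleStartsWith && !book.title.startsWith(f.titleStartsWith)) return false;
        if (f.priceLessThan != null && book.price >= f.priceLessThan) return false;
        if (f.priceGreaterThan != null && book.price <= f.priceGreaterThan) return false;
        return true;
      });
    },
  },
};

// Equivalent to the GraphQL query:
//   query { books(filter: { titleStartsWith: "The", priceLessThan: 11 }) { title price } }
const cheapThe = resolvers.Query.books(null, {
  filter: { titleStartsWith: 'The', priceLessThan: 11 },
});
console.log(cheapThe); // [ { title: 'The Great Gatsby', price: 10.99 } ]

// Omitting the filter returns every book.
const all = resolvers.Query.books(null, {});
console.log(all.length); // 2
```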

Benefits of This Approach

  • Only the filters you want to allow the user to use are supported.

  • This is backed by the GraphQL type validation system, which won't allow filtering outside of what is allowed. The resolver code in the backend itself does not even support filters that aren't allowed.

Drawbacks of This Approach

  • You have to define each filter individually in the GraphQL schema and in the implementation in code.

  • You can't easily share this code between different GraphQL types. If you also had a Video type and wanted to filter it, you would need a new filter input type for videos. (You could generalize to a single shared filter input, but then Book and Video could not offer different filters.)

  • If there is a requirement for a new filter, it needs a code change to add to the input filter type and to update the resolver code to support it.

    • E.g., if you wanted to filter titles that included a substring anywhere, not just at the start, this is a new filter input and a new implementation in your resolvers.

Arbitrary Filtering by Accepting Sift Query Language as Filter Input

An interesting library I found, sift, allows using MongoDB query syntax to easily filter arbitrary lists of data in JavaScript. I think this is really cool and can enable arbitrary filtering in GraphQL. The headless CMS Strapi previously used Sift before moving on to a more custom solution to enable their GraphQL querying!

I was most excited by this because it seemed to be a way to somewhat reproduce the useful automatic filtering that some ORMs and providers have built into their GraphQL services. And it doesn't even matter if the data hasn't come from a certain database.

You could rewrite the above schema to the following:

input SiftQueryInput {
  field: String
  filter: String
}

type Query {
  books(filter: [SiftQueryInput]): [Book]
}

And the resolver to:

const sift = require('sift').default;

const resolvers = {
  Query: {
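    books: (_, args) => {
      // `filter` is optional, so treat a missing argument as "no filters".
      const filters = args.filter ?? [];
      // Merge the inputs into one Sift query object keyed by field name.
      const siftQuery = filters.reduce((acc, { field, filter }) => {
        acc[field] = JSON.parse(filter); // throws on malformed JSON
        return acc;
      }, {});
      return books.filter(sift(siftQuery));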
    },
  },
};

So how does this work? Let's say you want to query all the books whose titles start with 'The'. You could execute this query, which takes the filter as a variable:

query Books($filter: [SiftQueryInput]) {
  books(filter: $filter) {
    title
    price
  }
}

With these variables:

{
  "filter": [
    { "field": "title", "filter": "{\"$regex\": \"^The\"}" }
  ]
}

And as expected, you would get back the list filtered to just 'The Great Gatsby'!

As another example, to find books whose titles contain the letter 'i' and whose price is greater than 10, you would supply the following variables:

{
  "filter": [
    { "field": "title", "filter": "{\"$regex\": \"i\"}" },
    { "field": "price", "filter": "{\"$gt\": 10}" }
  ]
}

And you get back the book 'To Kill a Mockingbird'!
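To make the translation step concrete, here is a small sketch of what the reduce in the resolver builds from those variables before handing the result to Sift. `buildSiftQuery` is a hypothetical helper name, not part of Sift; it only reshapes the input and does no filtering itself.

```javascript
// Hypothetical helper: turns the list of { field, filter } inputs into a
// single MongoDB-style query object keyed by field name.
function buildSiftQuery(filters = []) {
  return filters.reduce((query, { field, filter }) => {
    query[field] = JSON.parse(filter); // each filter arrives as a JSON string
    return query;
  }, {});
}

const variables = {
  filter: [
    { field: 'title', filter: '{"$regex": "i"}' },
    { field: 'price', filter: '{"$gt": 10}' },
  ],
};

const siftQuery = buildSiftQuery(variables.filter);
console.log(siftQuery);
// { title: { '$regex': 'i' }, price: { '$gt': 10 } }
```

Passing this object to sift(...) yields the predicate used in books.filter(...). Note that two inputs naming the same field overwrite one another in this sketch, so a price range would need both operators in a single filter string, for example {"$gt": 10, "$lt": 12}.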

Notice that we did not have to change anything in the query, schema, or resolvers! We were able to express entirely new filters that would have needed new filter inputs in the other approach, just in the variables using the Sift query syntax!

Benefits of This Approach

  • Any filtering logic that Sift supports can now be expressed in your queries. If new requirements come in for different filters, it does not require updating with new input types and resolver logic.

  • The same method of filtering can be used across all your types! Each list field just accepts a list of SiftQueryInput values, and the backend code that applies them to a list of objects is the same regardless of the list's type.

  • This easily supports objects if their shapes change or become nested. The SiftQueryInput.field is of type String because you can access nested properties on the object with a dot syntax.

    • E.g., filtering by including this Sift query is possible: { field: 'author.name.last', filter: JSON.stringify({ $eq: "Orwell" }) }

Drawbacks and Caveats

  • The Sift query is passed as a JSON string, which is error-prone, so careful validation and error handling are required to use this approach.

  • By using a generic SiftQueryInput type to collect the user's filters, you lose the type safety of GraphQL. It has no way to verify that the field exists or that your filter uses it correctly.

  • The data of the list needs to be fully resolved at the point that the filtering resolver runs. It can't access fields further down the query that haven't been resolved yet. But for situations where the data is not coming from a DB with its own querying, maybe from a JSON file or REST API response, this is likely anyway.
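One way to approach the validation mentioned in the first caveat is to check each input against allow-lists before it reaches Sift. The sketch below is only an illustration: `ALLOWED_FIELDS`, `ALLOWED_OPERATORS`, and `parseFilterInput` are hypothetical names, and the operator list is an assumption you would tailor to your own needs. It inspects top-level operators only, so nested operators such as $not or $elemMatch would need a recursive check.

```javascript
// Hypothetical allow-lists; adjust to the fields and operators you want to expose.
const ALLOWED_FIELDS = new Set(['title', 'price']);
const ALLOWED_OPERATORS = new Set(['$eq', '$ne', '$gt', '$gte', '$lt', '$lte', '$in', '$regex']);

// Parses and validates one { field, filter } input, throwing a descriptive
// error instead of letting malformed input reach the filtering step.
function parseFilterInput({ field, filter }) {
  if (!ALLOWED_FIELDS.has(field)) {
    throw new Error(`Filtering on "${field}" is not allowed`);
  }
  let parsed;
  try {
    parsed = JSON.parse(filter);
  } catch {
    throw new Error(`Filter for "${field}" is not valid JSON`);
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new Error(`Filter for "${field}" must be a JSON object`);
  }
  for (const operator of Object.keys(parsed)) {
    if (!ALLOWED_OPERATORS.has(operator)) {
      throw new Error(`Operator "${operator}" is not allowed`);
    }
  }
  return parsed;
}

console.log(parseFilterInput({ field: 'price', filter: '{"$gt": 10}' })); // { '$gt': 10 }
```

In a GraphQL server you would typically surface these as user-facing errors. Rejecting unknown operators also keeps clients away from operators that evaluate code, such as the $where operator that Sift's README describes, though you should check the behavior of the Sift version you use.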

Future Improvements

I think losing GraphQL's type safety is a shame in this case. It might be possible to compile the available Sift query operators into the GraphQL schema at build time, so that the filter input mirrors Sift's actual syntax more closely without relying on strings.
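As a rough sketch of that idea, the schema could expose typed operator inputs, and the resolver could translate them into Sift's $-prefixed operators. Everything named here is hypothetical: the FloatOperators, StringOperators, and BookWhere input types shown in the comment, and the `toSiftQuery` helper. A real build step would generate such input types from the schema rather than have them written by hand.

```javascript
// Hypothetical typed schema this helper would pair with:
//
//   input FloatOperators  { eq: Float, gt: Float, gte: Float, lt: Float, lte: Float }
//   input StringOperators { eq: String, regex: String }
//   input BookWhere       { title: StringOperators, price: FloatOperators }
//   type Query            { books(where: BookWhere): [Book] }

// Translates { price: { gt: 10 } } into the MongoDB-style { price: { $gt: 10 } }.
function toSiftQuery(where) {
  const query = {};
  for (const [field, operators] of Object.entries(where ?? {})) {
    if (operators == null) continue; // field omitted or explicitly null
    query[field] = {};
    for (const [name, value] of Object.entries(operators)) {
      // Skip operators the client left unset; prefix the rest with "$".
      if (value != null) query[field][`$${name}`] = value;
    }
  }
  return query;
}

console.log(toSiftQuery({ title: { regex: '^The' }, price: { gt: 10, lt: 12 } }));
// { title: { '$regex': '^The' }, price: { '$gt': 10, '$lt': 12 } }
```

With inputs shaped like this, GraphQL itself would reject unknown fields and wrongly typed values before the resolver runs. This flat version does not cover nested fields, which would need nested input types.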

Conclusion

Using Sift.js in GraphQL provides a flexible and powerful way to implement arbitrary filtering. It brings the automatic querying benefits typically reserved for ORMs and certain GraphQL vendors to plain JavaScript objects in a list, regardless of their source.

By providing a generic filtering 'engine' in the GraphQL server, with a flexible query language that can be applied to any type, the logic of filtering is shifted to the GraphQL client. This allows for much faster iteration on filters and enables a much larger degree of expression in filters.

I'd love to hear your thoughts and experiences with implementing filtering in GraphQL—share them in the comments below!