Mindset for hacking GraphQL Applications

I’ve tried to summarize a lot of information from HackTricks, YouTube, HTB write-ups, disclosed vulnerabilities, and the GraphQL documentation to come up with succinct notes on GraphQL. This way you don’t need to be an expert to focus on what’s important.

I’m not claiming to be an expert on GraphQL, but I know enough to understand what I’m trying to accomplish in the field (so be kind please 🙂). This is more of a mindset guide than a technical guide.

For more hands-on time, please see this write-up in my GitHub repository, which includes a lightweight app demoing GraphQL.

What should pentesters know about GraphQL?

tl;dr: GraphQL is a query language for APIs. That’s it.

So when trying to distill what GraphQL is for our purposes, it’s important to call out that GraphQL’s purpose is to query the data sources on the back end of a web application more efficiently. Instead of the client sending a bunch of separate API calls to the back end, one carefully crafted GraphQL query can be sent to retrieve the data that the client needs. This, in theory, reduces the number of RESTful calls made to the server, since the requests can be batched together and only the needed fields are “picked”, so the server isn’t returning more information than necessary.
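To make the "many calls become one query" idea concrete, here is a minimal sketch. The REST paths and the GraphQL type and field names (`user`, `posts`, `followers`) are made up for illustration and don't come from any real API.

```python
import json

# Hypothetical REST endpoints a client might have to call one by one.
rest_calls = [
    "GET /api/users/42",
    "GET /api/users/42/posts",
    "GET /api/users/42/followers",
]

# One GraphQL query asking for the same data, picking only the fields needed.
# The schema here (user, posts, followers) is an assumption for this sketch.
graphql_query = """
query {
  user(id: 42) {
    name
    posts { title }
    followers { name }
  }
}
"""

def build_graphql_body(query: str) -> str:
    """Serialize a query into the JSON body commonly POSTed to a GraphQL endpoint."""
    return json.dumps({"query": query.strip()})

body = build_graphql_body(graphql_query)
print(f"{len(rest_calls)} REST calls vs 1 GraphQL request")
print(body)
```

From a pentester's point of view, the takeaway is that one endpoint now fronts data that used to be spread across several routes.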

GraphQL isn’t tied to one kind of data store. There are a variety of drivers out there which make it possible to use GraphQL with different storage mechanisms. Both SQL and NoSQL databases can be queried through GraphQL and, according to some docs, concurrently (neat!), which allows multiple GraphQL “resolvers” on a single server to interact with multiple data sources.

To recap, multiple queries ➡️ one GraphQL query.

What kind of attack vectors apply to services using GraphQL?

This is where things get somewhat uninteresting as far as general pentesting goes. My guess is that this is because, at the time of writing, GraphQL is still relatively new, with an initial release in 2015 and a stable release in 2018. In short, here are a few attack vectors:

Data Disclosure and Enumeration via Introspection
Specific bugs with the package/module implementing GraphQL
Issues with the application itself

Data Disclosure and Enumeration via Introspection 👀

Introspection, as it relates to GraphQL, is a way of asking the GraphQL server to describe its own schema: the types it supports and the fields available on each type. As an attacker, the idea is that if you can query the schema, you can enumerate what else GraphQL will return to you as the end user.

If you’re able to supply a query string that enumerates the schema, consider yourself lucky as a pentester, because now you can figure out how to query the rest of the data behind the service.

Example:

query={__schema{types{name,fields{name}}}}
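As a rough sketch of how you might use that query string, the snippet below URL-encodes it for a GET request and turns an introspection response into a simple type-to-fields map. The endpoint URL and the sample response are made up for illustration, and nothing is sent over the network here.

```python
import json
from urllib.parse import urlencode

# Same introspection query as in the example above.
INTROSPECTION_QUERY = "{__schema{types{name,fields{name}}}}"

def build_introspection_url(endpoint: str) -> str:
    """Append the URL-encoded introspection query as a `query` parameter."""
    return f"{endpoint}?{urlencode({'query': INTROSPECTION_QUERY})}"

def summarize_schema(response_text: str) -> dict:
    """Map each non-builtin type name to the list of its field names."""
    types = json.loads(response_text)["data"]["__schema"]["types"]
    summary = {}
    for t in types:
        if t["name"].startswith("__"):
            continue  # skip GraphQL's own introspection types
        fields = t.get("fields") or []  # scalars report fields as null
        summary[t["name"]] = [f["name"] for f in fields]
    return summary

# Hypothetical endpoint and a hand-written response, for illustration only.
url = build_introspection_url("http://target.example/graphql")
sample_response = json.dumps({"data": {"__schema": {"types": [
    {"name": "Query", "fields": [{"name": "user"}]},
    {"name": "User", "fields": [{"name": "username"}, {"name": "password"}]},
    {"name": "String", "fields": None},
    {"name": "__Schema", "fields": [{"name": "types"}]},
]}}})

print(url)
for type_name, field_names in summarize_schema(sample_response).items():
    print(type_name, field_names)
```

A type with fields like `password` in the summary is exactly the kind of thing you'd want to query next.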

HackTricks already covers this so I won’t beat the topic to death here.

In addition, since this is a glaring security issue, it’s worth noting that some libraries have options to turn off introspection. If introspection is turned off, you can still try error-based enumeration: send malformed or guessed queries and see what the error messages give away. If no errors are returned to you either, your chances for enumeration have gone from slim to none.
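Here is a minimal sketch of what error-based enumeration can look like in practice. It assumes the server returns errors in the usual `errors` array and that its messages include "Did you mean" hints for misspelled fields, which some implementations do. The sample response below is hand-written, and real servers may word their errors differently or suppress them entirely.

```python
import re

def extract_suggestions(error_message: str) -> list:
    """Pull quoted names that follow a 'Did you mean' hint in an error message."""
    _, _, tail = error_message.partition("Did you mean")
    return re.findall(r'"([^"]+)"', tail)

def suggestions_from_response(response: dict) -> set:
    """Collect every suggested name across all errors in a GraphQL response."""
    found = set()
    for err in response.get("errors", []):
        found.update(extract_suggestions(err.get("message", "")))
    return found

# Hand-written response to a guessed query such as {usr{nme}} (an assumption).
sample_response = {
    "errors": [
        {"message": 'Cannot query field "usr" on type "Query". Did you mean "user" or "users"?'},
        {"message": 'Cannot query field "nme" on type "User".'},
    ]
}

print(sorted(suggestions_from_response(sample_response)))
```

Feeding guessed names through a loop like this lets you rebuild pieces of the schema one hint at a time.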

For more information on this technique, there are HackTheBox write-ups for the now-retired ‘Help’ lab.

Specific bugs with the package/module implementing GraphQL 🐛

As long as there are no misconfigurations with the GraphQL implementation itself, and you can’t expose sensitive or privileged information, next up would be to see if the package implementing GraphQL is vulnerable. At the time of this writing, there are only 32 CVEs according to cve.mitre.org, and most of them have to do with services using GraphQL and poor implementation or underlying application issues.

My recommendation: stick to tracking down vulnerabilities that directly impact the tech stack you’re working with. For example, if you’re attacking GraphQL implemented on Express, check for vulnerabilities in packages such as express-graphql or graphql for Node.js.
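If you happen to have the application's package.json (say, in a white-box engagement or from an exposed repository), a small sketch like the one below can list the GraphQL-related packages and versions worth looking up. The manifest contents and version strings are placeholders for illustration, not a claim about any vulnerable release.

```python
import json

def graphql_dependencies(package_json_text: str) -> dict:
    """Return {package: version} for dependencies whose name mentions graphql."""
    manifest = json.loads(package_json_text)
    found = {}
    for section in ("dependencies", "devDependencies"):
        for name, version in manifest.get(section, {}).items():
            if "graphql" in name.lower():
                found[name] = version
    return found

# Placeholder manifest; the version strings are illustrative only.
sample_manifest = json.dumps({
    "dependencies": {
        "express": "^4.17.1",
        "express-graphql": "^0.9.0",
        "graphql": "^14.5.8",
    }
})

for name, version in graphql_dependencies(sample_manifest).items():
    print(f"look up advisories for {name} {version}")
```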

Issues with the application itself ⁉️

This is where the basics come back into play. Forget that GraphQL is the wrapper and think about the application itself.

Ask yourself questions like:

What database is on the back-end? Is it susceptible to injection attacks?
Who should be making this query? Are there any authentication or authorization checks that can be bypassed?
How malleable is the query? Is there poor schema validation where I can request additional information from GraphQL?
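To illustrate the first question, here is a minimal sketch of how a classic injection bug can hide behind a GraphQL resolver. The resolver functions, table, and data are hypothetical, and SQLite stands in for whatever database actually sits on the back end.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def resolve_user_vulnerable(conn, name):
    # Hypothetical resolver: the GraphQL argument is pasted straight into SQL.
    sql = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(sql).fetchall()

def resolve_user_safe(conn, name):
    # Same lookup with a bound parameter, so the argument stays data.
    sql = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(sql, (name,)).fetchall()

conn = make_db()
payload = "' OR '1'='1"  # would arrive as the argument in user(name: "...")

print("vulnerable:", resolve_user_vulnerable(conn, payload))
print("safe:", resolve_user_safe(conn, payload))
```

The GraphQL layer doesn't change the bug. It only changes where the attacker-controlled string enters the application.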

At this point, bring yourself back to OWASP Top 10 territory rather than focusing on GraphQL-specific attacks.