OAuth 2 is one of the most successful security protocols in use today. Even so, in its wide use, the protocol has come up against some of its own limitations. This is a proposal for a transactional authorization protocol XYZ to address the things that OAuth 2 doesn't handle well on its own. Optimizations and decisions that made sense in OAuth 2 don't make as much sense today. We are in a different world, and perhaps it's time we started to look to a different protocol.
But we're not trying to invent something in a vacuum. In particular, OAuth 2 has been extended to cover a wide variety of client applications, deployments, and use cases. While this flexibility and extensibility is one of OAuth 2's strengths, this has unfortunately led to a number of components that are almost--but not quite--solving the same problems in similar ways. PKCE, UMA, CIBA, OBUK, FAPI, and a host of other extensions to OAuth 2 make use of temporary credentials to let the protocol behave in new ways.
This protocol, which I'm calling "XYZ" for the moment, is based on a transactional model. The client of the API declares who it is and what it wants, the AS figures out what information it needs to fulfill that (which might include interacting with a user), and ultimately a token is produced. All along the way, components have the opportunity to bind keys to different parts of the transaction so that attackers can't take over. This intent-based system takes in experience and feedback from other similar projects and protocols, but in a way that pulls together many different aspects.
The XYZ protocol is not intended to be directly compatible with OAuth 2, much in the same way that OAuth 2 was not directly compatible with OAuth 1. However, the concepts and many of the goals should feel familiar to developers used to these existing protocols and their extensions. The XYZ protocol seeks to make minimal use of the front-channel for passing information.
In XYZ, a client that needs to talk to an API first begins an authorization transaction with the authorization server. This request includes information about the client, the user, the API being requested, the ways the client can interact with the user, and any keys the client can prove possession of.
The authorization server processes this request and indicates to the client what its next steps are. If the authorization server requires external input from an authorized user, it will indicate that the client interact with the user or wait for a server-side action, depending on the nature of the client and what the authorization server needs. If the client is told to wait (as in a multi-device or multi-party authorization process), it can poll the transaction endpoint again after the indicated time window. After successful interaction, the client will continue a transaction with the authorization server. This can result in additional responses to continue the transaction, as needed.
The AS might instead determine that no interaction from a user is required, such as is the case in OAuth 2's client credentials and assertion flows, and UMA 2's push-based claims flows. In any of these cases, the AS can determine that the presentation of a particular set of keys and assertions for a certain set of resources is sufficient for issuance of an access token, and the process can continue.
Upon successful authorization, the authorization server will issue an access token for use at the API.
This proposal is far from perfect, and I'm not going to pretend it is. There are a lot of things that are under-defined, and a lot of things that might not make sense in real deployments. While I've been able to pull together a few test implementations, there is still a lot to be learned. I think there might be a grand unified authorization theory to be found, and this is just one early attempt at it, and this attempt is not alone. There have been some fantastic ideas on how to represent resources, actions, and scopes, some fresh thinking on how to bind keys, not to mention all of the existing work on intent-based security protocols in the past.
I've started an implementation of XYZ in Java. This project has both the client and AS portions, but they do not share data internally and only interact through the protocol. It's written in Java Spring Boot with a React front end and almost certainly has errors in it. But at least it will show a bit of the protocol working on its own.
The protocol and this site are not intended to be static. I want people to dig into the ideas collected here, improve them, adapt them, break them, and build them up again. Go read the draft specification text, propose changes to the site, and let's move this conversation forward together.