Interaction

During a transaction, the AS will often require interaction from a user. This user can be the same person running the client software, or it could be another party who's not necessarily available. In the request, the client instance tells the AS how it is able to interact with the user running the software by declaring how it can start an interaction, how it can finish an interaction, and any other interaction-based hints that might help the AS.

Start Modes

The client can indicate one or more modes to start an interaction.

  • The redirect mode is used when the client instance is capable of opening a browser on the same device for the user to interact with the AS, or otherwise directing the user to an arbitrary URL.
  • The app mode is used when the client instance can launch an application from the same device the user is interacting with. The way that application communicates with the AS is not defined by GNAP.
  • The user_code mode is used when the client instance can communicate an out-of-band code to the user for the user to provide to the AS (usually by typing it in).

The client indicates these in the interaction request's start field by listing out the modes in an array. The AS evaluates this array and determines which, if any, of the modes are appropriate for the request.

In the future, other start modes could indicate things like allowing the client to send a request to a digital wallet or sending a message to the user through another communication channel.

Finish Modes

In addition, the client can indicate how it can receive a signal from the AS once interaction is completed. Currently, there are two methods of getting a response:

  • The client instance can set a redirect method including a URL that the AS can send the user back to when done interacting
  • The client instance can set a push method including a URL that the AS can send an HTTP message to when the user is done interacting

The messages from the AS to any of these finish methods will contain an interact_ref parameter that the client instance presents to the AS when it continues the request at the AS.

The client can also poll the continuation endpoint as well, in which case a client wouldn't specify a finish mode. However, a client that's waiting for a response from interaction shouldn't poll before that interaction signal is completed because it won't have the interaction reference returned. A client that just wants to poll should leave the interaction finish field off of its request.

Interaction Reference Hash

In addition to the interaction reference, the AS will calculate a cryptographic hash, which is later verified by the client instance. The hash is created by first concatenating the following values with a newline \n between them:

  • the nonce value sent by the client instance in the initial request's finish section
  • the finish nonce value returned in the initial response
  • the unique interaction reference value interact_ref passed back to the client instance by the finish method
  • the URL of the grant request endpoint, which identifies the AS the client instance is talking to

For example, an input string could look like this:

VJLO6A4CAYLBXHTR0KRO
MBDOFXG4Y5CVJCX821LH
4IFWWIKYBC2PQ6U56NL1
https://server.example.com/tx

There's no trailing newline on this string, but conveying that in an example is really difficult to do.

The AS and client instance then pass this value through a hash function to get the interaction hash. By default, the finish methods use a SHA3 512-bit hash over this data. The client instance can choose a different hash algorithm by sending a hash_method parameter as part of the finish object with a different registered hash value. If the AS can't honor the client instance's requested hash method, it has to fail the request from the start.

UI Hints

The client instance can also give some hints to the AS about how to handle the interaction. Currently, the client can only send a set of preferred UI locales, but in the future other information the client might know could be set here as well.

Example Interaction Combinations

While the different interaction modes can be mixed and matched to fit different needs, as shown on the request demo page, there are a number of common combinations that are worth discussing.

Redirect with Callback

In fully redirect based interaction, the client sends the user to an interactive page at the AS. The AS then sends the user back to the client with an HTTP redirect.

To use this mode, the client's transaction request includes a section that includes a URI the client can receive requests at as well as a unique state value.

{
    "interact": {
        "start": [
            "redirect"
        ],
        "finish": {
            "method": "redirect",
            "uri": "https://client.example.net/return/123455",
            "nonce": "VJLO6A4CAYLBXHTR0KRO"
        }
    }
}

The URI in the finish method must be reachable from the user's system browser, and can be any https URL, an http URL for a local-to-the-device host, or a service-specific URI that the user's browser can open. The nonce must be a unique value for each transaction and cannot be re-used. The nonce value is opaque to the AS and should be random. The client remembers this nonce value and associates it with the current user. For a web server or other web application, this is generally done by placing it in the user's session. Native applications are generally considered single-user in nature, so the nonce value is remembered in the application's current state.

When the AS determines that the client's request needs user interaction, it creates a unique interaction URL and returns it to the client in the transaction response. The AS associates this unique URI with the transaction. The interaction URI is for a user-facing page at the AS.

{
    "interact": {
        "redirect": "https://server.example.com/interact/4CF492MLVMSW9MKMXKHQ",
        "finish": "MBDOFXG4Y5CVJCX821LH"
    },
    "continue": {
        "access_token": {
            "value": "80UPRY5NM33OMUKMKSKU"
        },
        "uri": "https://server.example.com/continue"
    }
}

Much like the callback URI, the interaction URI must be reachable from the user's system browser, and can be any https URL, an http URL for a local-to-the-device host, or a service-specific URI that the user's browser can open.

The client sends the user to the URL completely as-is, without adding any query parameters, fragments, paths, or other components. If the client is a web server, it can send the user to the remote site via an HTTP redirect.

HTTP 302 Found
Location: https://server.example.com/interact/4CF492MLVMSW9MKMXKHQ

If the client is a mobile application, such as an Android app, it can launch the system browser for interaction.

Intent browserIntent = 
  new Intent(Intent.ACTION_VIEW, 
    Uri.parse("https://server.example.com/interact/4CF492MLVMSW9MKMXKHQ"));
startActivity(browserIntent);

Obviously, any method including an embedded security tab such as used by the AppAuth implementation of OAuth 2 is also acceptable. The important thing is to get the current user to the AS, where they can start interacting.

Once at the AS, the AS can ask the user for authentication, and to authorize the application itself. The AS could prompt the user to provide additional claims or proofs however it sees fit, and this interaction is ultimately outside of the protocol.

When the AS has collected whatever input it needs from the user, it sends the user back to the client with an HTTP redirect. First, the AS creates a unique, unguessable interaction handle that the client can use in its subsequent interaction request. The AS also calculates an interaction hash as described above.

The AS creates the URL for the redirect by taking the callback URI presented in the transaction request and appending two parameters:

  • a unique interaction reference, in this example 4IFWWIKYBC2PQ6U56NL1
  • the Base64 URL encoded hash value calculated as described above
HTTP 302 Found
Location: https://client.example.net/return/123455
  ?hash=p28jsq0Y2KK3WS__a42tavNC64ldGTBroywsWxT4md_jZQ1R2HZT8BOWYHcLmObM7XHPAdJzTZMtKBsaraJ64A
  &interact_ref=4IFWWIKYBC2PQ6U56NL1

Note that the AS has to use a proper URL builder in case the client's callback URL contains existing query parameters, is lacking a root path, or has some other anomaly which would make it problematic to simply concatenate the strings.

Once the AS creates the redirect response to the interaction request, the AS deletes or otherwise deactivates the interaction URL to prevent phishing and replay attacks.

If the AS determines that there is an error during the interaction, its response depends on whether or not the incoming interaction URL was valid.

  • If the URL was valid and bound to an active transaction, the AS can return to the client as in a successful transaction. The client needs to send data to the transaction endpoint to determine next steps.
  • If the URL was not valid or not bound to an active transaction, the AS can only display to the user an error. Since the client has no way to inject a redirect URI alongside an invalid interaction request, there is reduced risk of open redirection attack.

The client receives a request from the user's browser through the callback URL. If the client is a web server, this comes in as any other request. If the client is a native application, it usually receives the full URL from the system in whatever function it has registered to receive incoming requests. In any event, the client needs to parse the hash parameter and compare its value to a hash calculated in the same way that the AS created it. Since both the AS and the client instance have access to all forms of input in the hash calculation, this is possible. If these hash values don't match the client returns an error to the user and does not call the AS again.

The client then sends a continue request to the AS including the interaction reference as a parameter.

{
    "interact_ref": "4IFWWIKYBC2PQ6U56NL1"
}

The AS looks up the transaction from a combination of the continuation request and fetches the interaction reference associated with that transaction. The AS compares the presented reference to the stored interaction reference it appended to the client's callback with interact. If they match, the AS continues processing as normal, likely issuing a token.

User Code with Polling

The user-code-based interaction is intended for clients that need to have the user use a secondary device of some kind to interact with the authorization server. Unlike the redirect based interaction, the client does not send the user to the AS directly. Instead, the client presents the user with a code and instructs the user to interact with the server. Meanwhile, the client polls the AS in the background until the transaction is approved.

The client starts this mode by sending a transaction request indicating that the user will interact with the AS through a secondary device.

{
    "interact": {
        "start": [
            "user_code"
        ]
    }
}

When the AS determines that the client's request needs user interaction, it sends an interaction URI in the transaction response. The interaction URI is for a user-facing page at the AS, and this can be a static URI for this mode. The AS also includes a short user-facing code. This code must be random, short-lived, and easily typed by a user. It must be processed in a case-insensitive way, and should use characters that are unambiguous when displayed even at low resolution (e.g., not using both the 0 (zero) and O (letter O) characters as options).

{
    "interact": {
        "user_code": {
            "url": "https://server.example.com/interact/device",
            "code": "A1BC-3DFF"
        }
    },
    "continue": {
        "access_token": {
            "value": "80UPRY5NM33OMUKMKSKU"
        },
        "uri": "https://server.example.com/continue",
        "wait": 30
    }
}

The client presents the user code to the user. The client also indicates to the user that they need to go to the interaction URL. As this is likely to be a static URL, the AS can provide a static link to this page.

Once at the AS, the AS can ask the user for authentication, and to authorize the application itself. The AS could prompt the user to provide additional claims or proofs however it sees fit, and this interaction is ultimately outside of the protocol. At the interaction page, the user is prompted to enter the code from their device. The AS uses this code to look up the associated transaction to determine which actions it needs to take.

When the AS has collected whatever additional input it needs from the user, it displays to the user that they can return to their device and continue operation.

To completely close a session-fixation attack, we could require the AS to present another code to the user and have the user enter that into the client device. This only works on some kinds of devices, though, so it would need to be an option. Additionally, we might want to consider a kind of "secondary application" based interaction that isn't web-based, such as a CIBA login.

Meanwhile, the client can poll the AS using the transaction handle to see if the user has authorized them yet. This request includes proofs of any keys that the client sent during the original transaction request.

HTTP POST /continue
Authorization: GNAP 80UPRY5NM33OMUKMKSKU
Signature-Input: gnap=("host" "authorization" "@request-target");created=1624564850
Signature: gnap=:N0MynxZllMLOTxgl4PmgHZjQ+gwHKGFGeg6wD5FXKtMM25PfcU2eLVMF9rPuZTguUKnEbvrY7spXlJDZ0jrKZQ==:

If the user has yet to approve the transaction, the AS sends back a response telling the client to wait. This response can contain a new continuation access token and URL which replace the client's original ones.

{
    "continue": {
        "access_token": {
            "value": "BI9QNW6V9W3XFJK4R02D"
        },
        "uri": "https://server.example.com/continue",
        "wait": 30
    }
}

The wait parameter tells the client the number of integer seconds it must wait before polling again. The client can continue to poll in this manner until it receives either a token response or the user code times out and the client receives an error response.

Redirect with Polling

For some kinds of clients, it is possible to show the user a visual signal such as a QR code that the user can scan on a secondary device. This allows the client to send the user to an arbitrary URL but not receive a callback in the front channel, so polling is necessary.

{
    "interact": {
        "start": [
            "redirect"
        ]
    }
}

As above, when the AS determines that the client's request needs user interaction, it creates a unique interaction URL and returns it to the client in the response. The AS associates this unique URI with the transaction. The interaction URI is for a user-facing page at the AS. Note that because the client did not send a callback to the AS, the AS does not generate or return a callback_server_nonce parameter.

{
    "interact": {
        "redirect": "https://server.example.com/interact/4CF492MLVMSW9MKMXKHQ"
    },
    "continue": {
        "access_token": {
            "value": "BI9QNW6V9W3XFJK4R02D"
        },
        "uri": "https://server.example.com/continue",
        "wait": 30
    }
}

The client then communicates this arbitrary URL to the user, perhaps using a QR code.

qr code

While the client waits for the user to complete the interaction, it polls the AS using the transaction handle.

HTTP POST /continue
Authorization: GNAP 80UPRY5NM33OMUKMKSKU
Signature-Input: gnap=("host" "authorization" "@request-target");created=1624564850
Signature: gnap=:N0MynxZllMLOTxgl4PmgHZjQ+gwHKGFGeg6wD5FXKtMM25PfcU2eLVMF9rPuZTguUKnEbvrY7spXlJDZ0jrKZQ==:

If the user has yet to approve the transaction, the AS sends back a response telling the client to wait. This response contains a new transaction handle which replaces the client's original one.

{
    "continue": {
        "access_token": {
            "value": "ZI9QNW6V9W3XFJK4R02D"
        },
        "uri": "https://server.example.com/continue",
        "wait": 30
    }
}