Interaction

During a transaction, the AS will often require interaction from a user. This user can be the same person running the client software, or it could be another party who's not necessarily available. In the transaction request, the client tells the AS how it is able to interact with the user. Several modes of interaction are defined, and a client can support multiple forms of interaction simultaneously.

  • The redirect mode is used when the client is capable of opening a browser on the same device for the user to interact with the AS, or otherwise directing the user to an arbitrary URL.
  • The user_code mode is used when the client cannot open a browser for the user but can communicate an out-of-band code for the user to provide to the AS.
  • The didcomm mode is used when the client can relay a DIDComm message to the user's Agent.
  • The dodcomm_query mode is used when the client can relay a query to a DIDComm Agent.

In addition, the client can indicate how it can receive responses from the interaction request. Currently, there are two modes of getting a response:

  • The client can set a callback section including a URL that the AS can send the user back to when done interacting
  • The client can poll the AS using a transaction handle

These can be used independently of the interaction method, but it will likely be most common to use a callback with the redirect method, polling with the device method, and in limited cases polling with the redirect method. These combinations are shown here.

Redirect with Callback

In redirect based interaction, the client sends the user to an interactive page at the AS. The AS then sends the user back to the client with an HTTP redirect.

To use this mode, the client's transaction request includes a section that includes a URI the client can receive requests at as well as a unique state value.

{
    "interact": {
        "redirect": true,
        "callback": {
            "uri": "https://client.foo",
            "nonce": "VJLO6A4CAYLBXHTR0KRO"
        }
    }
}

The URI must be reachable from the user's system browser, and can be any https URL, an http URL for a local-to-the-device host, or a service-specific URI that the user's browser can open. The nonce must be a unique value for each transaction and cannot be re-used. The nonce value is opaque to the AS and should be random. The client remembers this nonce value and associates it with the current user. For a web server or other web application, this is generally done by placing it in the user's session. Native applications are generally considered single-user in nature, so the nonce value is remembered in the application's current state.

When the AS determines that the client's request needs user interaction, it creates a unique interaction URL and returns it to the client in the transaction response. The AS associates this unique URI with the transaction. The interaction URI is for a user-facing page at the AS.

{
    "interaction_url": "https://server.example.com/interact/4CF492MLVMSW9MKMXKHQ",
    "server_nonce": "MBDOFXG4Y5CVJCX821LH",
    "handle": {
        "value": "80UPRY5NM33OMUKMKSKU",
        "type": "bearer"
    }
}

Much like the callback URI, the interaction URI must be reachable from the user's system browser, and can be any https URL, an http URL for a local-to-the-device host, or a service-specific URI that the user's browser can open.

The client sends the user to the URL completely as-is, without adding any query parameters, fragments, paths, or other components. If the client is a web server, it can send the user to the remote site via an HTTP redirect.

HTTP 302 Found
Location: https://server.example.com/interact/4CF492MLVMSW9MKMXKHQ

If the client is a mobile application, such as an Android app, it can launch the system browser for interaction.

Intent browserIntent = 
  new Intent(Intent.ACTION_VIEW, 
    Uri.parse("https://server.example.com/interact/4CF492MLVMSW9MKMXKHQ"));
startActivity(browserIntent);

Obviously, any method including an embedded security tab such as used by the AppAuth implementation of OAuth 2 is also acceptable. The important thing is to get the current user to the AS, where they can start interacting.

Once at the AS, the AS can ask the user for authentication, and to authorize the application itself. The AS could prompt the user to provide additional claims or proofs however it sees fit, and this interaction is ultimately outside of the protocol.

When the AS has collected whatever input it needs from the user, it sends the user back to the client with an HTTP redirect. First, the AS creates a unique, unguessable interaction handle that the client can use in its subsequent interaction request. The AS also calculates a hash to allow the client to confirm the transaction by concatentating the following values with a newline character and computing the SHA3_512 hash:

  • the nonce value sent by the client in the initial transaction request
  • the server_nonce value returned in the initial transaction response
  • the unique interaction handle

The AS creates the URL for the redirect by taking the callback URI presented in the transaction request and appending two parameters:

  • the Base64 URL encoded hash value calculated above
  • a unique interaction handle, in this example 4IFWWIKYBC2PQ6U56NL1
HTTP 302 Found
Location: https://client.foo/
  ?hash=p28jsq0Y2KK3WS__a42tavNC64ldGTBroywsWxT4md_jZQ1R2HZT8BOWYHcLmObM7XHPAdJzTZMtKBsaraJ64A
  &interact=4IFWWIKYBC2PQ6U56NL1

Note that the AS has to use a proper URL builder in case the client's callback URL contains existing query parameters, is lacking a root path, or has some other anomaly which would make it problematic to simply concatenate the strings.

This can't be a plain string concatenation because it would potentially let a malicious client inject a bad callback URI that looks safe. Alternatively we could use URI templates but that seems a lot of complexity to enforce with its own set of problems.

Once the AS creates the redirect response to the interaction request, the AS deletes or otherwise deactivates the interaction URL to prevent phishing and replay attacks.

If the AS determines that there is an error during the interaction, its response depends on whether or not the incoming interaction URL was valid.

  • If the URL was valid and bound to an active transaction, the AS can return to the client as in a successful transaction. The client needs to send data to the transaction endpoint to determine next steps.
  • If the URL was not valid or not bound to an active transaction, the AS can only display to the user an error. Since the client has no way to inject a redirect URI alongside an invalid interaction request, there is reduced risk of open redirection attack.

The client receives a request from the user's browser through the callback URL. If the client is a web server, this comes in as any other request. If the client is a native application, it usually receives the full URL from the system in whatever function it has registered to receive incoming requests. In any event, the client needs to parse the hash parameter and compare its value to a hash calculated by its originally chosen nonce value, the server's returned server_nonce value from the original transaction request, and the value of the interact handle parameter from the callback request. If these hash values don't match the client returns an error to the user and stops the transaction.

The client then sends a transaction continue request to the AS including the interaction handle as a parameter. The client also includes the current value of the transaction handle (here as a bearer key) as well as any proofs for keys presented during the transaction start.

Here we've hashed the interaction handle before sending it. It's still in the air whether this hashing is helpful to security or is an unnecessary step.

{
    "handle": "80UPRY5NM33OMUKMKSKU",
    "interact_handle": "CuD9MrpSXVKvvI6dN1awtNLx-HhZy46hJFDBicG4KoZaCmBofvqPxtm7CDMTsUFuvcmLwi_zUN70cCvalI6ENw"
}

The AS looks up the transaction from the transaction handle and fetches the interaction handle associated with that transaction. The AS compares the presented handle to hash of the stored interaction handle it appended to the client's callback. If they match, the AS continues processing as normal, likely issuing a token.

User Code with Polling

The user-code-based interaction is intended for clients that need to have the user use a secondary device of some kind to interact with the authorization server. Unlike the redirect based interaction, the client does not send the user to the AS directly. Instead, the client presents the user with a code and instructs the user to interact with the server. Meanwhile, the client polls the AS in the background until the transaction is approved.

The client starts this mode by sending a transaction request indicating that the user will interact with the AS through a secondary device.

{
    "interact": {
        "user_code": true
    }
}

When the AS determines that the client's request needs user interaction, it sends an interaction URI in the transaction response. The interaction URI is for a user-facing page at the AS, and this can be a static URI for this mode. The AS also includes a short user-facing code. This code must be random, short-lived, and easily typed by a user. It must be processed in a case-insensitive way, and should use characters that are unambiguous when displayed even at low resolution (e.g., not using both the 0 (zero) and O (letter O) characters as options).

{
    "user_code": {
        "url": "https://server.example.com/interact/device",
        "code": "A1BC-3DFF"
    },
    "handle": {
        "value": "80UPRY5NM33OMUKMKSKU",
        "type": "bearer"
    }
}

The client presents the user code to the user. The client also indicates to the user that they need to go to the interaction URL. As this is likely to be a static URL, the AS can provide a static link to this page.

Once at the AS, the AS can ask the user for authentication, and to authorize the application itself. The AS could prompt the user to provide additional claims or proofs however it sees fit, and this interaction is ultimately outside of the protocol. At the interaction page, the user is prompted to enter the code from their device. The AS uses this code to look up the associated transaction to determine which actions it needs to take.

When the AS has collected whatever additional input it needs from the user, it displays to the user that they can return to their device and continue operation.

To completely close a session-fixation attack, we could require the AS to present another code to the user and have the user enter that into the client device. This only works on some kinds of devices, though, so it would need to be an option. Additionally, we might want to consider a kind of "secondary application" based interaction that isn't web-based, such as a CIBA login.

Meanwhile, the client can poll the AS using the transaction handle to see if the user has authorized them yet. This request includes proofs of any keys that the client sent during the original transaction request.

{
    "handle": "80UPRY5NM33OMUKMKSKU"
}

If the user has yet to approve the transaction, the AS sends back a response telling the client to wait. This response contains a new transaction handle which replaces the client's original one.

{
    "wait": 30,
    "handle": {
        "value": "BI9QNW6V9W3XFJK4R02D",
        "method": "bearer"
    }
}

The wait parameter tells the client the number of integer seconds it must wait before polling again. The client can continue to poll in this manner until it receives either a token response or the user code times out and the client receives an error response.

Redirect with Polling

For some kinds of clients, it is possible to show the user a visual signal such as a QR code that the user can scan on a secondary device. This allows the client to send the user to an arbitrary URL but not receive a callback in the front channel, so polling is necessary.

{
    "interact": {
        "redirect": true
    }
}

As above, when the AS determines that the client's request needs user interaction, it creates a unique interaction URL and returns it to the client in the transaction response. The AS associates this unique URI with the transaction. The interaction URI is for a user-facing page at the AS. Note that because the client did not send a callback to the AS, the AS does not generate or return a server_nonce parameter.

{
    "interaction_url": "https://server.example.com/interact/4CF492MLVMSW9MKMXKHQ",
    "handle": {
        "value": "80UPRY5NM33OMUKMKSKU",
        "type": "bearer"
    }
}

The client then communicates this arbitrary URL to the user, perhaps using a QR code.

qr code

While the client waits for the user to complete the interaction, it polls the AS using the transaction handle.

{
    "handle": "80UPRY5NM33OMUKMKSKU"
}

If the user has yet to approve the transaction, the AS sends back a response telling the client to wait. This response contains a new transaction handle which replaces the client's original one.

{
    "wait": 30,
    "handle": {
        "value": "ZI9QNW6V9W3XFJK4R02D",
        "method": "bearer"
    }
}

It would be interesting to figure out if we can combine both the user code and arbitrary redirect URL methods, giving the user a few options. This should be as simple as allowing more flexibility on the interaction request portion and having the server return all possible options.

DIDComm Messages and Queries

Some clients will be able to pass a message from the AS to another piece of software using a secondary protocol, such as DIDComm.

If the client can pass a complete DIDComm message to the agent, it sends the didcomm flag. If it can pass along a query, it sends the didcomm_query flag.

{
    "interact": {
        "didcomm": true,
        "didcomm_query": true
    }
}

In either case, the agent might send its response directly to the AS through some external means, or the client might get a message back from the agent. This message is to be included in its transaction continuation request.

{
    "handle": "80UPRY5NM33OMUKMKSKU",
    "didcomm": {
        "...": "..."
    }
}

If the client has no response message from the agent, it can poll the transaction as usual.

{
    "handle": "80UPRY5NM33OMUKMKSKU"
}