5.8. Retransmissions, timeouts & response caching
Due to potential network instability, the need to retransmit a message between a Client and a Server may sometimes occur. Detecting retransmissions is especially important if the operation has some observable side effects.
5.8.1. Motivational examples
Imagine a LwM2M Object with a numeric, executable Resource, whose value is incremented every time a LwM2M Server performs Execute on it. Now, consider the following scenario:
the LwM2M Server performs an Execute on this specific Resource,
the LwM2M Client receives the request, bumps the value, and sends a response,
the response is lost due to unfortunate network conditions,
the LwM2M Server attempts to increment the Resource again (sending exactly the same Execute request as before),
the LwM2M Client receives the request.
LwM2M Client obtained both requests, and if it was unable to classify the second one as a retransmission of the first, the resource would be incremented twice, even though it would make much more sense to increment it just once. On the other hand, caching the response, and detecting a retransmission, would improve Client-Server communication integrity by preventing this from happening.
Another scenario could be that the response is computationally expensive (and time-consuming) to generate. In this case caching mechanism would yield measurable performance benefits.
5.8.2. Caching mechanism
Anjay provides a built-in message cache - when the request is received, Anjay checks if there exists an appropriate response to it in the cache already. In case there is one, it is retransmitted. Otherwise Anjay processes the request as usual, in the end placing response in the cache for future use.
Note
Cached response, matching a specific CoAP Request is identified by the following triplet:
CoAP Message Token,
CoAP Message ID,
Server endpoint name (host and port).
Every response in the cache sits there for at most MAX_TRANSMIT_SPAN
as defined in RFC7252, in
Section 4.8.2. Time Values Derived from Transmission Parameters, and after that time
it is automatically removed.
5.8.3. Cache size
The size of the cache is specified at Anjay instantiation time by setting
anjay_configuration_t::msg_cache_size
to a non-zero value (zero disables
any caching). This limits the number of bytes used to store cached responses.
Note
The cache size limit is global for all Servers - i.e. all responses, to all Servers are stored within a single cache.
5.8.4. Limitations
If a response is too big to fit into the cache, it is not cached,
If a response would fit into the cache, but the cache is currently full, responses (starting from the oldest) are dropped from the cache (even if they are still considered valid in terms of mentioned
MAX_TRANSMIT_SPAN
), till the new response fits.
5.8.5. Configuring retransmissions and timeouts
5.8.5.1. Background
To provide custom retransmission policy, affecting CoAP layer across
the library, one needs to set anjay_configuration_t::udp_tx_params
accordingly prior library instantiation with anjay_new()
.
anjay_configuration_t::udp_tx_params
is a avs_coap_udp_tx_params_t
structure, defined as follows:
/** CoAP transmission params object. */
typedef struct {
/** RFC 7252: ACK_TIMEOUT */
avs_time_duration_t ack_timeout;
/** RFC 7252: ACK_RANDOM_FACTOR */
double ack_random_factor;
/** RFC 7252: MAX_RETRANSMIT */
unsigned max_retransmit;
/** RFC 7252: NSTART */
size_t nstart;
} avs_coap_udp_tx_params_t;
It should be noted that without any additional configuration, Anjay uses default values as specified in the Section 4.8 of RFC7252:
Parameter |
Default value |
Corresponding field in |
---|---|---|
|
2 seconds |
|
|
1.5 |
|
|
4 |
|
|
1 |
|
5.8.5.2. Meaning of each parameter, calculations of timeouts and the number of retransmissions
5.8.5.2.1. ACK_RANDOM_FACTOR
Configures the amount of random perturbation to a timeout to a response to
an initial message (ACK_TIMEOUT
, see next subsection). Its value has to
be at least 1.0
. The randomness is mixed in as follows:
generate a random number
r
from a closed range[1.0, ACK_RANDOM_FACTOR]
,multiply the
ACK_TIMEOUT
byr
and use it as initial timeout.
Example
Say the library has ACK_TIMEOUT
set to 16s.
Now, if the ACK_RANDOM_FACTOR
is 1.0
, no random behavior is
introduced, because the library is forced to pick a random number from
a trivial interval [1.0, 1.0]
.
However, if the ACK_RANDOM_FACTOR
is, say, 1.5
, the number picked
may lie in range [1.0, 1.5]
, thus the actual time the library would wait
may vary between [16, 24]
seconds.
5.8.5.2.2. ACK_TIMEOUT
Configures the amount of time the library shall wait for the response to the initial confirmable message (not retransmission).
Example
Say the library wants to send a confirmable message.
If ACK_TIMEOUT
is set to, say, 10 seconds, the library sends the
message and then waits 10 * r
seconds (r
is defined as in the
above discussion about ACK_RANDOM_FACTOR
) for the initial response.
5.8.5.2.3. MAX_RETRANSMIT
Configures the total number of retransmissions the library is allowed to perform before giving up on message delivery.
Example
If MAX_RETRANSMIT
is set to, say, 4, the library would send 1
initial message + up to 4 retransmissions, accounting for up to 5
messages in total.
If MAX_RETRANSMIT
is set to 0, no retransmission would be attempted,
and the library would give up if no response arrived after ACK_TIMEOUT *
r
seconds.
5.8.5.2.4. NSTART
Configures the maximum number of exchanges that may be ongoing at the same time with a given remote CoAP endpoint (i.e., a LwM2M Server).
In Anjay, it is mostly ignored. It is not recommended to set it to any other value than the default of 1.
Higher values may be useful when writing applications using the low-level CoAP APIs.
5.8.5.2.5. Exponential back-off
After waiting for a response for t
seconds , the wait time for the next
retransmission (in the absence of response) would be 2 * t
seconds. In
other words, retransmissions are performed with exponential back-off.
5.8.5.3. Example configuration
As an example, we may configure the library as follows:
avs_coap_udp_tx_params_t udp_tx_params = {
// Wait at least 4 seconds for the initial response.
.ack_timeout = avs_time_duration_from_scalar(4, AVS_TIME_S),
// Do not randomize wait times for simplicity of the discussion,
// thus "at least" in the comment above should be thought of as
// "exactly".
.ack_random_factor = 1.0,
// Allow up to 4 retransmissions.
.max_retransmit = 4,
// leave the NSTART parameter at the default value of 1
.nstart = 1
};
anjay_configuration_t configuration = {
// Some other configuration ...
.udp_tx_params = &udp_tx_params
};
// Create Anjay instance with custom transmission parameters
anjay_t *anjay = anjay_new(&configuration);
The above configuration would result in the following retransmission times to a confirmable message:
Time [s] |
Retry number |
Wait time for the response [s] |
Action by the library |
---|---|---|---|
0 |
0 |
4 |
send initial message |
4 |
1 |
8 |
1st retransmission |
12 |
2 |
16 |
2nd retransmission |
28 |
3 |
32 |
3rd retransmission |
60 |
4 |
64 |
4th (final) retransmission |
124 |
– |
– |
give up |
5.8.6. Other retransmission parameters
While setting anjay_configuration_t::udp_tx_params
parameter
covers most cases, there are also means to configure:
DTLS handshake retransmissions (
anjay_configuration_t::udp_dtls_hs_tx_params
docs),firmware update module retransmissions (by implementing custom
anjay_fw_update_get_coap_tx_params_t
handler docs),additional fields in
anjay_configuration_t
that configure transmission parameters for non-UDP transports.
We recommend to refer to the doxygen documentation for more details.