MPack 1.1.1
A C encoding/decoding library for the MessagePack serialization format.
Using the Expect API

The Expect API is used to imperatively parse data of a fixed (hardcoded) schema. It is most useful when parsing very large MessagePack files, parsing in memory-constrained environments, or generating parsing code from a schema. The API is similar to CMP, but has many helper functions especially for map keys and expected value ranges. Some of these will be covered below.

Check out the Reader API guide first for information on setting up a reader and reading strings.

If you are not writing code for an embedded device or generating parsing code from a schema, you should not follow this guide. You should most likely be using the Node API instead.

A simple example

Suppose we have data that we know will have the following schema:

an array containing three elements
    a UTF-8 string of at most 127 characters
    a UTF-8 string of at most 127 characters
    an array containing up to ten elements
        where all elements are ints

For example, we could have the following bytes in a MessagePack file called example.mp:

93                          # an array containing three elements
    a5 68 65 6c 6c 6f       # "hello"
    a6 77 6f 72 6c 64 21    # "world!"
    94                      # an array containing four elements
        01                  # 1
        02                  # 2
        03                  # 3
        04                  # 4
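The first byte of each element in the dump above encodes both its type and its size. As a minimal sketch, the three "fix" families used in this example can be classified by hand as shown here. The names `tag_kind_t`, `tag_t` and `classify_tag()` are hypothetical helpers for illustration, not part of MPack, and every other MessagePack type is lumped into `TAG_OTHER`.

```c
#include <stdint.h>

/* Hypothetical helper types for illustration; not part of MPack. */
typedef enum { TAG_FIXINT, TAG_FIXSTR, TAG_FIXARRAY, TAG_OTHER } tag_kind_t;

typedef struct {
    tag_kind_t kind;
    uint8_t value; /* the integer, the string length, or the element count */
} tag_t;

/* Classifies one MessagePack tag byte from the "fix" families only. */
static tag_t classify_tag(uint8_t byte) {
    tag_t tag = { TAG_OTHER, 0 };
    if (byte <= 0x7f) {                   /* 0xxxxxxx: positive fixint */
        tag.kind = TAG_FIXINT;
        tag.value = byte;
    } else if ((byte & 0xf0) == 0x90) {   /* 1001xxxx: fixarray */
        tag.kind = TAG_FIXARRAY;
        tag.value = (uint8_t)(byte & 0x0f);
    } else if ((byte & 0xe0) == 0xa0) {   /* 101xxxxx: fixstr */
        tag.kind = TAG_FIXSTR;
        tag.value = (uint8_t)(byte & 0x1f);
    }
    return tag;
}
```

With this sketch, `classify_tag(0x93)` reports an array of three elements and `classify_tag(0xa5)` reports a string of five bytes, matching the annotations on the bytes of example.mp.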

In JSON this would look like this:

[
    "hello",
    "world!",
    [
        1,
        2,
        3,
        4
    ]
]

You can use msgpack-tools with the above JSON to generate example.mp. The code below demonstrates reading this data from a file using the Expect API:

#include <stdio.h>
#include <stdlib.h>
#include "mpack.h"

int main(void) {

    // Initialize a reader from a file
    // (mpack_reader_init_file() is marked deprecated in MPack 1.1.1;
    // mpack_reader_init_filename() is the current name)
    mpack_reader_t reader;
    mpack_reader_init_file(&reader, "example.mp");

    // The top-level array must have exactly three elements
    mpack_expect_array_match(&reader, 3);

    // The first two elements are short strings
    char first[128];
    char second[128];
    mpack_expect_utf8_cstr(&reader, first, sizeof(first));
    mpack_expect_utf8_cstr(&reader, second, sizeof(second));

    // Next we have an array of up to ten ints
    int32_t numbers[10];
    size_t count = mpack_expect_array_max(&reader, sizeof(numbers) / sizeof(numbers[0]));
    for (size_t i = 0; i < count; ++i)
        numbers[i] = mpack_expect_i32(&reader);
    mpack_done_array(&reader);

    // Done reading the top-level array
    mpack_done_array(&reader);

    // Clean up and handle errors
    mpack_error_t error = mpack_reader_destroy(&reader);
    if (error != mpack_ok) {
        fprintf(stderr, "Error %i occurred reading data!\n", (int)error);
        return EXIT_FAILURE;
    }

    // We now know the data was parsed correctly and can safely
    // be used. The strings are null-terminated and valid UTF-8,
    // the array contained at most ten elements, and the numbers
    // are all within the range of an int32_t.
    printf("%s\n", first);
    printf("%s\n", second);
    for (size_t i = 0; i < count; ++i)
        printf("%i ", (int)numbers[i]);
    printf("\n");

    return EXIT_SUCCESS;
}

With the file given above, this example will print:

hello
world!
1 2 3 4

Note that there is only a single error check in this example. In fact each call to the reader is checking for errors and storing any error in the reader. These could be errors from reading data from the file, from invalid or corrupt MessagePack, or from not matching our expected types or ranges. On any call to the reader, if the reader was already in error or an error occurs during the call, a safe value is returned.

For example the mpack_expect_array_max() call above will return zero if the element is not an array, if it has more than ten elements, if the MessagePack data is corrupt, or even if the file does not exist. The mpack_expect_utf8_cstr() calls will also place a null-terminator at the start of the given buffer if any error occurs just in case the data is used without an error check. The error check can be performed later at a more convenient time.
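The "sticky error" behaviour described above can be illustrated with a toy reader. This is only a sketch of the pattern under simplifying assumptions: the `toy_*` names are hypothetical, the reader handles only fixarray and positive fixint tags from an in-memory buffer, and none of it is MPack's actual implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy reader for illustration only; not MPack's code. */
typedef enum { TOY_OK = 0, TOY_ERROR_IO, TOY_ERROR_TYPE, TOY_ERROR_TOO_BIG } toy_error_t;

typedef struct {
    const uint8_t *data;
    size_t size;
    size_t pos;
    toy_error_t error; /* first error seen; never cleared by reads */
} toy_reader_t;

static void toy_init(toy_reader_t *r, const uint8_t *data, size_t size) {
    r->data = data;
    r->size = size;
    r->pos = 0;
    r->error = TOY_OK;
}

/* Records an error unless one is already stored. */
static void toy_flag_error(toy_reader_t *r, toy_error_t error) {
    if (r->error == TOY_OK)
        r->error = error;
}

/* Reads one byte, or returns 0 if the reader is in error or out of data. */
static uint8_t toy_read_byte(toy_reader_t *r) {
    if (r->error != TOY_OK)
        return 0;
    if (r->pos >= r->size) {
        toy_flag_error(r, TOY_ERROR_IO);
        return 0;
    }
    return r->data[r->pos++];
}

/* Expects a fixarray of at most max_count elements; returns 0 on any error. */
static uint32_t toy_expect_array_max(toy_reader_t *r, uint32_t max_count) {
    uint8_t tag = toy_read_byte(r);
    if (r->error != TOY_OK)
        return 0;
    if ((tag & 0xf0) != 0x90) {
        toy_flag_error(r, TOY_ERROR_TYPE);
        return 0;
    }
    uint32_t count = tag & 0x0f;
    if (count > max_count) {
        toy_flag_error(r, TOY_ERROR_TOO_BIG);
        return 0;
    }
    return count;
}

/* Expects a positive fixint; returns 0 on any error. */
static int32_t toy_expect_int(toy_reader_t *r) {
    uint8_t tag = toy_read_byte(r);
    if (r->error != TOY_OK)
        return 0;
    if (tag > 0x7f) {
        toy_flag_error(r, TOY_ERROR_TYPE);
        return 0;
    }
    return tag;
}
```

Because every call returns a safe value once an error is stored, a caller can run a whole sequence of `toy_expect_*()` calls and inspect `error` once at the end, which is the same shape as the single check in the example above.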

Maps

Maps can be more complicated to read because you usually want to safely handle keys being re-ordered. MessagePack itself does not specify whether maps can be re-ordered, so if you are sticking only to MessagePack implementations that preserve ordering, it may not be strictly necessary to handle this. (MPack always preserves map key ordering.) However many MessagePack implementations will ignore the order of map keys in the original data, especially in scripting languages where the data will be parsed into or encoded from an unordered map or dict. If you plan to interoperate with them, you will need to allow keys to be re-ordered.

Suppose we expect to receive a map containing two key/value pairs: a key called "compact" with a boolean value, and a key called "schema" with an int value. The example on the MessagePack homepage fits this schema, which looks like this in JSON:

{
"compact": true,
"schema": 0
}

If we also expect the key called "compact" to always come first, then parsing this is straightforward:

mpack_expect_map_match(&reader, 2);
mpack_expect_cstr_match(&reader, "compact");
bool compact = mpack_expect_bool(&reader);
mpack_expect_cstr_match(&reader, "schema");
int schema = mpack_expect_int(&reader);
mpack_done_map(&reader);

If we expect the "schema" key to be optional, but always after "compact", then parsing this is longer but still straightforward:

size_t count = mpack_expect_map_max(&reader, 2);

mpack_expect_cstr_match(&reader, "compact");
bool compact = mpack_expect_bool(&reader);

bool has_schema = false;
int schema = -1;
if (count == 2) {
    // The optional second pair is present
    mpack_expect_cstr_match(&reader, "schema");
    schema = mpack_expect_int(&reader);
    has_schema = true;
}

mpack_done_map(&reader);

If, however, we want to allow keys to be re-ordered, then parsing this can become a lot more verbose. You need to switch on the key, but you also need to track whether each key has been used, both to prevent duplicate keys and to ensure that required keys were found. Using mpack_expect_cstr() directly for keys, this would look like this:

bool has_compact = false;
bool compact = false;
bool has_schema = false;
int schema = -1;

for (size_t i = mpack_expect_map_max(&reader, 100); i > 0 && mpack_reader_error(&reader) == mpack_ok; --i) {
    char key[20];
    mpack_expect_cstr(&reader, key, sizeof(key));

    if (strcmp(key, "compact") == 0) {
        if (has_compact)
            mpack_reader_flag_error(&reader, mpack_error_data); // duplicate key
        has_compact = true;
        compact = mpack_expect_bool(&reader);

    } else if (strcmp(key, "schema") == 0) {
        if (has_schema)
            mpack_reader_flag_error(&reader, mpack_error_data); // duplicate key
        has_schema = true;
        schema = mpack_expect_int(&reader);

    } else {
        mpack_discard(&reader);
    }
}
mpack_done_map(&reader);

// compact is not optional
if (!has_compact)
    mpack_reader_flag_error(&reader, mpack_error_data);

This is obviously way too verbose. To simplify this code, MPack includes an Expect function called mpack_expect_key_cstr() to switch on string keys. This function should be passed an array of key strings and an array of bool flags storing whether each key was found. It will find the key in the given string array, check for duplicate keys, and return the index of the found key (or the key count if the key is unrecognized or if an error occurs). You would use it with an enum and a switch, like this:

enum key_names       {KEY_COMPACT, KEY_SCHEMA, KEY_COUNT};
const char* keys[] = {"compact"  , "schema"  };

bool found[KEY_COUNT] = {0};
bool compact = false;
int schema = -1;

size_t i = mpack_expect_map_max(&reader, 100); // critical check!
for (; i > 0 && mpack_reader_error(&reader) == mpack_ok; --i) { // critical check!
    switch (mpack_expect_key_cstr(&reader, keys, found, KEY_COUNT)) {
        case KEY_COMPACT: compact = mpack_expect_bool(&reader); break;
        case KEY_SCHEMA:  schema  = mpack_expect_int(&reader);  break;
        default: mpack_discard(&reader); break;
    }
}
mpack_done_map(&reader);

// compact is not optional
if (!found[KEY_COMPACT])
    mpack_reader_flag_error(&reader, mpack_error_data);

In the above examples, the call to mpack_discard(&reader); skips over the value for unrecognized keys, allowing the format to be extensible and providing forwards-compatibility. If you want to forbid unrecognized keys, you can flag an error (e.g. mpack_reader_flag_error(&reader, mpack_error_data);) instead of discarding the value.
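To make the return-value contract of the key lookup concrete, here is a rough sketch of the matching step on its own, working on a key that has already been read into a C string. The function `match_key()` and its `duplicate` out-parameter are hypothetical and only mirror the described behaviour (index of the found key, or the key count otherwise); MPack's own function reads the key from the reader and flags the error itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the lookup step; not MPack's implementation.
 * Returns the index of key in keys[], or count if the key is unrecognized.
 * A repeated key sets *duplicate and also returns count.
 */
static size_t match_key(const char *key, const char *const keys[], bool found[],
                        size_t count, bool *duplicate) {
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(key, keys[i]) != 0)
            continue;
        if (found[i]) {
            *duplicate = true; /* a real reader would flag an error here */
            return count;
        }
        found[i] = true;
        return i;
    }
    return count; /* unrecognized key: the caller can discard its value */
}
```

Returning the key count for both "unrecognized" and "error" is what lets the `default:` branch of the switch handle every case that is not a known key.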

WARNING: Note the importance, in the code above, of using a reasonable limit on mpack_expect_map_max() and of checking for errors in each iteration of the loop. If we were to leave these out, an attacker could craft a message declaring a map of a billion elements, forcing this code into a very long loop. We specify a size of 100 here as an arbitrary limit that leaves enough space for the schema to grow in the future. If you forbid unrecognized keys, you could specify the key count as the limit.
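To see how cheap such an attack is, note that the MessagePack map32 format stores the pair count in a five-byte header: the tag 0xdf followed by a big-endian 32-bit count. The sketch below decodes that header by hand and applies a limit. Both function names are hypothetical and are not MPack calls.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decodes a MessagePack map32 header (0xdf, then a big-endian 32-bit
 * pair count). Returns 0 if the buffer is too short or is not a map32.
 * Hypothetical helper for illustration; not an MPack function.
 */
static uint32_t map32_declared_count(const uint8_t *data, size_t size) {
    if (size < 5 || data[0] != 0xdf)
        return 0;
    return ((uint32_t)data[1] << 24) | ((uint32_t)data[2] << 16) |
           ((uint32_t)data[3] << 8)  |  (uint32_t)data[4];
}

/* The kind of bound a limited expect call applies before any looping. */
static bool count_within_limit(uint32_t declared, uint32_t limit) {
    return declared <= limit;
}
```

The five bytes `df 3b 9a ca 00` declare 1,000,000,000 pairs. A loop that trusts this count without a limit or a per-iteration error check would keep iterating long after the actual data has run out.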

Unlike JSON, MessagePack supports any type as a map key, so the enum integer values can themselves be used as keys. This reduces message size at some expense of debuggability (losing some of the value of a schemaless format). There is a simpler function, mpack_expect_key_uint(), which can be used to switch on small non-negative enum values directly.
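The size saving can be estimated with a little arithmetic on the MessagePack encoding. In this sketch, which uses hypothetical helper names and covers only the short "fix" encodings, a string key of up to 31 bytes costs one tag byte plus its bytes, while an integer key from 0 to 127 costs a single byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Encoded size of a string key using the fixstr format (up to 31 bytes):
 * one tag byte plus the string bytes. Returns 0 for longer keys, which
 * need a different format not covered by this sketch.
 */
static size_t fixstr_key_size(const char *key) {
    size_t len = strlen(key);
    return len <= 31 ? 1 + len : 0;
}

/*
 * Encoded size of an integer key using the positive fixint format
 * (0 to 127): a single byte. Returns 0 for values outside that range.
 */
static size_t fixint_key_size(uint8_t key) {
    return key <= 0x7f ? 1 : 0;
}
```

For the example map, the string keys "compact" and "schema" take 8 and 7 bytes, so the whole message is 1 (map header) + 8 + 1 + 7 + 1 = 18 bytes. With the integer keys 0 and 1 it shrinks to 1 + 1 + 1 + 1 + 1 = 5 bytes.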

On the surface, the mpack_expect_key_cstr() version doesn't appear much shorter than the hand-written loop before it, but it becomes much nicer when you have many possible keys in a map. Of course, if at all possible you should consider using the Node API, which is much less error-prone and will handle all of this for you.