Recently I ran into an interesting problem. My task was to deserialize a JSON file which, at first, I assumed was invalid. The file looked like this:
{
  "1": {
    "name": "Erik",
    "age": 30
  },
  "2": {
    "name": "Jane",
    "age": 25
  },
  "3": {
    "name": "Alex",
    "age": 35
  }
}
Since it was an external file over which I had no control, I got a bit frustrated at first and started looking for an opportunity to say "This file is not following the specification, fix it!". I mean, just look at this! Have you ever seen a JSON file with numbers as keys? How am I supposed to access this data? My first thought was that it was invalid JSON, because I believed keys had to be valid variable names. And I don't know any programming language that allows a variable name to start with a digit, let alone consist of a single digit.
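So I opened my browser and started reading up on the rules for member names. This is what I found (the passage comes from the JSON:API specification, which is stricter than JSON itself: RFC 8259 only requires an object key to be a string):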
Member Names
Implementation and profile defined member names used in a JSON:API document MUST be treated as case sensitive by clients and servers, and they MUST meet all of the following conditions:
Member names MUST contain at least one character.
Member names MUST contain only the allowed characters listed below.
Member names MUST start and end with a “globally allowed character”, as defined below.
To enable an easy mapping of member names to URLs, it is RECOMMENDED that member names use only non-reserved, URL safe characters specified in RFC 3986.
Allowed Characters
The following “globally allowed characters” MAY be used anywhere in a member name:
U+0061 to U+007A, “a-z”
U+0041 to U+005A, “A-Z”
U+0030 to U+0039, “0-9”
U+0080 and above (non-ASCII Unicode characters; not recommended, not URL safe)
Additionally, the following characters are allowed in member names, except as the first or last character:
U+002D HYPHEN-MINUS, “-”
U+005F LOW LINE, “_”
U+0020 SPACE, “ ” (not recommended, not URL safe)
This was followed by a long list of illegal characters. But let's slowly break it down, shall we?
Member names MUST contain at least one character.
Okay, so a one-character key is allowed. Fine, I've already seen variables named x or y in code, so I can live with that.
Member names MUST contain only the allowed characters listed below. Member names MUST start and end with a “globally allowed character”, as defined below.
Easy, still no problem. They are a-z, A-Z, right?
U+0061 to U+007A, “a-z”
U+0041 to U+005A, “A-Z”
U+0030 to U+0039, “0-9”
Okay, now I'm confused. But some characters are not allowed in the first or last position, so surely a name cannot start with a digit, right?
Hell no! The first character just cannot be a hyphen, an underscore, or a space. But it can be a digit. I was sitting there shocked. How am I supposed to deserialize this into any object?
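The rules quoted above fit into a single regular expression. Here is a minimal C# sketch, assuming only the allowed-character rules shown here (the list of illegal characters is not reproduced). `MemberNames` and `IsValid` are names made up for this illustration, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

Console.WriteLine(MemberNames.IsValid("1"));          // True
Console.WriteLine(MemberNames.IsValid("first-name")); // True
Console.WriteLine(MemberNames.IsValid("_name"));      // False
Console.WriteLine(MemberNames.IsValid(""));           // False

// Hypothetical helper: checks a name against the quoted member-name rules.
static class MemberNames
{
    // "Globally allowed" characters: a-z, A-Z, 0-9, and U+0080 and above.
    private const string Global = @"a-zA-Z0-9\u0080-\uFFFF";

    // The first and last character must be globally allowed; hyphen,
    // underscore and space may only appear in between. One character is enough.
    private static readonly Regex Pattern = new Regex(
        "^[" + Global + "]([" + Global + @"\-_ ]*[" + Global + @"])?\z");

    public static bool IsValid(string name) => Pattern.IsMatch(name);
}
```

A key such as "1" passes because digits are in the globally allowed set, which is exactly the case that caused the confusion.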
The realisation
I had never seen a JSON file like this before. I started googling, and the solution turned out to be simple: it's a dictionary. A dictionary whose keys happen to be numbers written as strings. It's a valid JSON file, and I can deserialize it into a dictionary.
So basically such JSON can be deserialized into a Dictionary<string, MyObject>, where MyObject is a class with name and age properties in this case. My main mistake in the whole thing was an assumption: usually a list of objects is an array, not a dictionary, and within that array I'm used to objects whose keys look like names, not numbers.
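Deserializing into a dictionary, as described above, might look like the following. This is a minimal sketch assuming System.Text.Json and C# 11 (for the raw string literal). `Person` is a hypothetical stand-in for MyObject, and the JSON is the sample from the top of the post.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

const string json = """
    {
      "1": { "name": "Erik", "age": 30 },
      "2": { "name": "Jane", "age": 25 },
      "3": { "name": "Alex", "age": 35 }
    }
    """;

// Match the lowercase JSON property names to the PascalCase C# properties.
var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

// The top-level object becomes a dictionary; its keys stay strings.
var people = JsonSerializer.Deserialize<Dictionary<string, Person>>(json, options)
             ?? new Dictionary<string, Person>();

Console.WriteLine(people["1"].Name); // Erik

foreach (var (id, person) in people)
{
    Console.WriteLine($"{id}: {person.Name}, {person.Age}");
}

// Hypothetical stand-in for MyObject.
public class Person
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}
```

If numeric keys are more convenient, each key can be converted with int.Parse, and System.Text.Json (since .NET 5) also accepts Dictionary<int, Person> as the target type.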
I feel like this might be obvious to front-end developers, but I'm not one of them. I'm a back-end developer. Even though I now know the answer, I'm still shocked that "1" and "2" are valid keys in JSON.