133 Commits

Author SHA1 Message Date
Rosen Penev
1d4885ad09 fix compilation with clang-cl
Something in the configure stage goes wrong where it believe strncasecmp
is present but the header defining it is not. Work around this.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2026-02-05 13:46:00 -08:00
Shane F. Carr
7974657c56 Fix code and update tests 2025-07-30 17:40:56 -07:00
Eric Hawicz
7cee5237dc Issue #867 - also disallow control characters in keys 2025-04-03 21:16:29 -04:00
Cameron Armstrong (Nightfox)
254b5abef8 Do not use duplocale if building for libnix because it isnt supported yet 2024-12-24 10:09:50 +08:00
Eric Hawicz
961c31f8ed Merge pull request #879 from janotomko/null
Handle NULL gracefully in json_tokener_free
2024-12-07 13:33:05 -05:00
Eric Hawicz
ff8ed0f094 Issue #881: don't allow json_tokener_new_ex() with a depth < 1 2024-11-17 22:11:24 -05:00
Eric Hawicz
565f181f65 Fix issue #875: cast to unsigned char so bytes above 0x7f aren't interpreted as negative, which was causing the strict-mode control characters check to incorrectly trigger. 2024-11-08 22:20:40 -05:00
Ján Tomko
828c12b226 Handle NULL gracefully in json_tokener_free
Similarly to glibc's free, make json_tokener_free(NULL)
a no-op, to simplify cleanup paths.

Signed-off-by: Ján Tomko <jtomko@redhat.com>
2024-11-06 15:23:00 +01:00
Eric Hawicz
6bfab90c87 Issue #867: disallow control characters in strict mode. 2024-09-15 10:37:45 -04:00
Bruno Haible
833233faa8 Handle yet another out-of-memory condition.
duplocale() can return NULL, with errno set to ENOMEM.
In this case, bail out and set the current error code to
json_tokener_error_memory.
2024-04-22 01:54:10 +02:00
Eric Hawicz
31a22fb2da Issue #857: fix a few places where json_tokener should have been returning json_tokener_error_memory but wasn't. 2024-04-21 10:37:16 -04:00
Eric Hawicz
e93ae70417 Fix issue #854: Set error=json_tokener_error_memory in json_tokener_parser_verbose() when allocating the tokener fails. 2024-03-29 18:09:12 -04:00
Eric Haszlakiewicz
6dd8618170 Also fix messages returned from json_tokener_error_desc() (broke due to the ordering change of enum json_tokener_error) 2023-08-12 18:52:32 +00:00
Eric Hawicz
71d845e819 Issue #668: add the option to specify "cmake -DUSELOCALE_NEEDS_FREELOCALE=1" to work around a bug in older versions of FreeBSD (<12.4). 2023-07-30 13:38:15 -04:00
Eric Hawicz
e9d3ab209a Merge pull request #759 from c3h2-ctf/truncation
json_tokener_parse_ex: handle out of memory errors
2023-07-04 11:45:57 -04:00
Eric Haszlakiewicz
d6f46ae104 Explicitly check for integer overflow/underflow when parsing integers with JSON_TOKENER_STRICT. 2022-10-30 19:39:30 +00:00
Tobias Stoeckmann
9e6acc9a4e json_tokener_parse_ex: handle out of memory errors
Do not silently truncate values or skip entries if out of memory errors
occur.

Proof of Concept:

- Create poc.c, a program which creates an eight megabyte large json
  object with key "A" and a lot of "B"s as value, one of them is
  UTF-formatted:

```c
 #include <err.h>
 #include <stdio.h>
 #include <string.h>

 #include "json.h"

 #define STR_LEN (8 * 1024 * 1024)
 #define STR_PREFIX "{ \"A\": \""
 #define STR_SUFFIX "\\u0042\" }"

int main(void) {
  char *str;
  struct json_tokener *tok;
  struct json_object *obj;

  if ((tok = json_tokener_new()) == NULL)
    errx(1, "json_tokener_new");

  if ((str = malloc(STR_LEN)) == NULL)
    err(1, "malloc");
  memset(str, 'B', STR_LEN);
  memcpy(str, STR_PREFIX, sizeof(STR_PREFIX) - 1);
  memcpy(str + STR_LEN - sizeof(STR_SUFFIX), STR_SUFFIX, sizeof(STR_SUFFIX));

  obj = json_tokener_parse(str);
  free(str);

  printf("%p\n", obj);
  if (obj != NULL) {
    printf("%.*s\n", 50, json_object_to_json_string(obj));
    json_object_put(obj);
  }

  json_tokener_free(tok);
  return 0;
}
```
- Compile and run poc, assuming you have enough free heap space:
```
gcc $(pkg-config --cflags --libs) -o poc poc.c
./poc
0x559421e15de0
{ "A": "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
```
- Reduce available heap and run again, which leads to truncation:
```
ulimit -d 10000
./poc
0x555a5b453de0
{ "A": "B" }
```
- Compile json-c with this change and run with reduced heap again:
```
ulimit -d 10000
./poc
(nil)
```

The output is limited to 70 characters, i.e. json-c parses the 8 MB
string correctly but the poc does not print all of them to the screen.

The truncation occurs because the parser tries to add all chars up
to the UTF-8 formatted 'B' at once. Since memory is limited to 10 MB
there is not enough for this operation. The parser does not fail but
continues normally.

Another possibility is to create a json file close to 2 GB and run a
program on a system with limited amount of RAM, i.e. around 3 GB. But
ulimit restrictions are much easier for proof of concepts.

Treat memory errors correctly and abort operations.
2022-03-20 17:42:47 +01:00
Tobias Stoeckmann
cbc603b587 Adjusted URLs
Most of these sites support HTTPS (some forward to HTTPS when accessing
the HTTP versions). Use HTTPS directly if supported.

Some URLs led to 404 error pages. Adjusted the links to point to
new locations.

I did not adjust the Microsoft HTML Help Workshop link because it seems
that this software is not available anymore. Instead of removing the
link entirely I kept it there in case it helps someone to find the
software on archived websites.
2022-03-19 10:34:55 +01:00
Even Rouault
3bb54f97e7 Fix typos in code comments and ChangeLog 2022-02-25 00:14:47 +01:00
Juuso Alasuutari
9361d8d3a8 Fix use-after-free in json_tokener_new_ex()
The failure path taken in the event of printbuf_new() returning NULL
calls free() on tok->stack after already having freed tok. Swap the
order of the two calls to fix an obvious memory access violation.

Fixes: bcb6d7d347 ("Handle allocation failure in json_tokener_new_ex")
Signed-off-by: Juuso Alasuutari <juuso.alasuutari@gmail.com>
2021-09-04 20:14:30 +03:00
Eric Haszlakiewicz
f61f1a7a91 Add workaround for Visual Studio not knowing about "inline". 2021-07-25 20:31:59 +00:00
Eric Haszlakiewicz
6a0df2609e Merge some old work to include (some of) PR #464 into the current master branch. 2021-07-25 19:07:06 +00:00
Tobias Stoeckmann
bcb6d7d347 Handle allocation failure in json_tokener_new_ex
The allocation of printbuf_new might fail. Return NULL to indicate tis
error to the caller. Otherwise later usage of the returned tokener would
lead to null pointer dereference.
2020-08-22 13:18:10 +02:00
Eric Haszlakiewicz
8c7849e6e3 Eliminate use of ctype.h and replace isdigit() and tolower() with non-locale-sensitive approaches. 2020-08-02 04:06:44 +00:00
Eric Haszlakiewicz
f3d8006d34 Neither vertical tab nor formfeed are considered whitespace per the JSON spec, remove them from is_ws_char(). 2020-08-02 03:59:56 +00:00
Eric Haszlakiewicz
8b43ff0c22 Merge the is_ws_char() and is_hex_char() changes to json_tokener from branch 'ramiropolla/for_upstream' (PR #464) 2020-08-02 02:55:45 +00:00
Pascal Cuoq
1962ba7de3 Fixes #645 2020-07-21 17:54:26 +02:00
Eric Haszlakiewicz
a4e3700972 Fix code formatting 2020-06-29 02:31:18 +00:00
Eric Haszlakiewicz
f23486a321 In the json_tokener_state_number case, explicitly adjust what "number" characters are allowed based on the exact micro-state that we're in, and check for invalid following characters in a different way, to allow a valid json_type_number object to be returned at the top level.
This causes previously failing strings like "123-456" to return a valid json_object with the appropriate value.  If you care about the trailing content, call json_tokener_parse_ex() and check the parse end point with json_tokener_get_parse_end().
2020-06-29 02:14:26 +00:00
Eric Haszlakiewicz
6eac6986c9 Fix incremental parsing of invalid numbers with exponents, such as "0e+-" and "12.3E12E12", while still allowing "0e+" in non-strict mode.
Deprecate the json_parse_double() function from json_util.h
2020-06-27 15:35:04 +00:00
Eric Haszlakiewicz
84e6032883 Issue #635: Fix "expression has no effect" warning in json_tokener.c by casting to void. 2020-06-23 02:51:46 +00:00
Eric Haszlakiewicz
a68566bf6a Issue #616: Change the parsing of surrogate pairs in unicode escapes so it uses a couple of additional states instead of assuming the low surrogate is already present, to ensure that we correctly handle various cases of incremental parsing. 2020-06-21 18:29:57 +00:00
Eric Haszlakiewicz
36118b681e Rearrange the json_tokener_state_escape_unicode case in json_tokener to simplify the code slightly and make it a bit easier to understand.
While here, drop the utf8_replacement_char that is unnecesarily added if we run out of input in the middle of a unicode escape.  No other functional changes (yet).
2020-06-21 03:10:55 +00:00
Eric Hawicz
da76ee26e7 Merge pull request #633 from dota17/issue616
fix issue 616: support the surrogate pair in split file.
2020-06-20 22:33:17 -04:00
Eric Haszlakiewicz
e26a1195f4 Add json_object_array_shrink() (and array_list_shrink()) and use it in json_tokener to minimize the amount of memory used. This results in a 39%-50% reduction in memory use (peak RSS, peak heap usage) on the jc-bench benchmark and 9% shorter runtime.
Also add the json_object_new_array_ext, array_list_new2, and array_list_shrink functions.
2020-06-20 18:03:04 +00:00
dota17
c1b872d817 fix issue 616: support the surrogate pair in split file. 2020-06-08 17:19:32 +08:00
David McCann
add7b13a9a Improved support for IBM operating systems
Fix compiler errors and warnings when building on IBM operating systems such as AIX and IBM i.
2020-05-14 15:39:35 +01:00
Eric Haszlakiewicz
f6f76f9430 Add a JSON_TOKENER_ALLOW_TRAILING_CHARS flag for json_tokener_set_flags() to allow multiple objects to be parsed from input even when JSON_TOKENER_STRICT is set. 2020-04-21 03:53:44 +00:00
Eric Haszlakiewicz
ecb9354bb1 Re-do clang-format. 2020-04-18 02:14:13 +00:00
Eric Haszlakiewicz
5cc11289b4 Make json_tokener_validate_utf8() internal to json_tokener.c, and improve the docs a bit. 2020-04-18 02:02:06 +00:00
dota17
8b162c4b89 clang-format the files 2020-04-03 11:39:30 +08:00
dota17
c117d8a8a8 add the disabling formatting coments and adjust the partial code manuly 2020-04-03 11:28:04 +08:00
dota17
3c3b5920f7 add uint64 data to json-c 2020-02-25 14:51:35 +08:00
dota17
787a8b3f1c update code 2020-01-20 10:41:24 +08:00
dota17
eca74dcccf test utf8 2020-01-10 18:33:14 +08:00
Eric Haszlakiewicz
374ffe87c6 Issue #463: fix newlocale() call to use LC_NUMERIC_MASK instead of LC_NUMERIC, and remove incorrect comment.
The second call to newlocale() with LC_TIME accidentally made things
 work because LC_TIME == LC_NUMERIC_MASK on some platforms.
2019-09-08 22:27:30 -04:00
Eric Haszlakiewicz
05b41b159e Add a json_tokener_get_parse_end() function to replace direct access of tok->char_offset. 2019-09-08 21:35:37 -04:00
Ramiro Polla
c9a0ac5886 json_tokener: optimize parsing of integer values
speedup for 32-bit: ~8%
speedup for 64-bit: ~9%
2018-12-21 00:30:21 +01:00
Ramiro Polla
d98fc501fb json_tokener: optimize check for number characters
speedup for 32-bit: ~5%
speedup for 64-bit: ~3%
2018-12-21 00:24:17 +01:00
Ramiro Polla
45c601bfa4 json_tokener: optimize check for hex characters
speedup for 32-bit: ~1%
speedup for 64-bit: ~1%
2018-12-21 00:24:17 +01:00