Capability to integer casts on CheriBSD
Julien ÉLIE
julien at trigofacile.com
Tue Oct 31 22:01:01 UTC 2023
Hi Richard,
>> As for pointer-to-integer conversion, maybe casting to (uintptr_t)
>> could be of help? I've googled a bit and found out that it is in the
>> C99 standard
>
> All the sizes are bounded by the size of whatever the containing memory
> mapping is, which has to fit in a size_t since that's what the argument
> to mmap() was when the mapping was created.
>
> So I don't think uintptr_t will make much difference.
I've just tried:
- char *end = (char *) (((size_t) p + length + pagesize) & mask);
+ char *end = (char *) (((uintptr_t) p + length + pagesize) & mask);
and Clang no longer emits a warning.
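On conventional 64-bit platforms, size_t and uintptr_t have the same
size; in CHERI pure-capability code, uintptr_t is wide enough to hold a
whole capability, whereas size_t only holds an address-sized integer.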
Looking deeper, I've found this interesting document about porting C
software to Morello (an implementation of the CHERI architecture):
https://soft-dev.org/events/cheritech22/slides/Richardson.pdf
"""
CHERI C/C++ is very similar to “normal” C/C++ with a few differences such as:
○ On Morello, pointers require 16-byte alignment.
○ (u)intptr_t is not the same type as (unsigned) long.
○ Pointers created from a (non-uintptr_t) integer are not
  dereferenceable.
○ Pointers are tightly bounded and cannot be used to access adjacent
  objects.
In CHERI C/C++ unsigned long cannot store the capability metadata
○ Casting from pointer to integer strips the capability metadata.
○ Usually flagged by the compiler by emitting a warning when creating
  a pointer from an integer.
Casting via uintptr_t generally resolves this problem.
Truncating capability metadata can result in crashes if converted back
to a pointer.
"""
Changing the (size_t) cast to (uintptr_t) appears to be all that is
needed: the code builds and works fine. A uintptr_t preserves the
pointer's capability metadata, so the pointer-to-integer conversion is
fine, and so is converting the computed integer value back to a pointer
afterwards.
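For illustration, here is a minimal sketch of that cast wrapped in a
function. The name page_end and the type of mask are assumptions (the
snippet above does not show how mask is declared); the arithmetic is the
same as in the changed line. Keeping mask as a plain size_t means the
only capability-carrying operand is the one derived from p.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: rounds p + length + pagesize down to a page
 * boundary, going through uintptr_t so that a CHERI compiler can keep
 * the capability metadata of p in the result.
 */
char *page_end(char *p, size_t length, size_t pagesize)
{
    /* The masking trick only works for a power-of-two page size. */
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

    /* Assumed definition of mask: clears the offset-within-page bits. */
    size_t mask = ~(pagesize - 1);

    return (char *) (((uintptr_t) p + length + pagesize) & mask);
}
```

On a conventional platform this behaves exactly like the (size_t)
version; the difference only shows up when pointers are capabilities.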
Interestingly, the document also covers the other problem raised
earlier in this discussion (accessing p[-1] before the start of the
string): "CHERI sometimes detects out-of-bounds accesses that are not
noticed otherwise [...] [especially] reading beyond bounded buffers
derived from string literals."
So true!
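As a sketch of how such an access can be guarded (prev_char_is is a
hypothetical helper for illustration, not a function from the code
under discussion), the byte before p is only read once p is known to
lie past the start of the string:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: reports whether the character just before p is c.
 * start must point to the beginning of the same string as p, so that
 * p[-1] is never read when p == start.
 */
bool prev_char_is(const char *start, const char *p, char c)
{
    assert(start != NULL && p != NULL && p >= start);

    /* Short-circuit: p[-1] is evaluated only when p > start. */
    return p > start && p[-1] == c;
}
```

With p == start the function returns false without touching memory
before the buffer, which is the access a bounded capability would trap.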
I'll keep reading to understand that unusual architecture a bit better.
>>> // Total length of pages
>>> size_t total_length = start_offset + length + end_offset;
>>
>> I'm unsure total_length always has the right value. If end_offset is
>> 0, total_length should be pagesize I think.
>
> Are you sure?
>
> As a concrete example, suppose:
> pagesize = 4096
> p is at the start of a page
> length = 8192
> Then:
> start_offset = 0
> start = p
> end_offset = (0+8192)&4095 = 0
> total_length = 0+8192+0 = 8192
> which is surely what we want.
Indeed, agreed.
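The example above can be checked with a small sketch. The definitions of
start_offset and end_offset used here are assumptions (distance back to
the previous page boundary, and distance forward to the next one, zero
when already aligned); only the total_length formula and the numbers are
taken from the quoted text. The function name is hypothetical too.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: total length of the pages covering
 * [addr, addr + length), using only address-sized arithmetic.
 */
size_t total_page_length(size_t addr, size_t length, size_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

    /* Assumed: bytes between the previous page boundary and addr. */
    size_t start_offset = addr & (pagesize - 1);

    /* Assumed: bytes between addr + length and the next page boundary,
     * or 0 if addr + length is already on a boundary. */
    size_t end_offset =
        (pagesize - ((addr + length) & (pagesize - 1))) & (pagesize - 1);

    return start_offset + length + end_offset;
}
```

With pagesize = 4096, a page-aligned addr and length = 8192, both
offsets are 0 and the result is 8192, matching the example.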
--
Julien ÉLIE
« Non omnia possumus omnes. » (Virgile)