Capability to integer casts on CheriBSD

Julien ÉLIE julien at trigofacile.com
Tue Oct 31 22:01:01 UTC 2023


Hi Richard,

>> As for pointer-to-integer conversion, maybe casting to (uintptr_t) 
>> could be of help?  I've googled a bit and found out that it is in the 
>> C99 standard
>
> All the sizes are bounded by the size of whatever the containing memory 
> mapping is, which has to fit in a size_t since that's what the argument 
> to mmap() was when the mapping was created.
> 
> So I don't think uintptr_t will make much difference.

I've just tried:
- char *end = (char *) (((size_t) p + length + pagesize) & mask);
+ char *end = (char *) (((uintptr_t) p + length + pagesize) & mask);

and Clang no longer emits a warning.
size_t and uintptr_t are both of the same size.
Looking deeper, I've found this interesting document about porting C 
software to Morello (an implementation of the CHERI architecture):
     https://soft-dev.org/events/cheritech22/slides/Richardson.pdf

"""
CHERI C/C++ is very similar to “normal” C/C++ with a few differences such as:
   ○ On Morello, pointers require 16-byte alignment.
   ○ (u)intptr_t is not the same type as (unsigned) long.
   ○ Pointers created from a (non-uintptr_t) integer are not dereferenceable.
   ○ Pointers are tightly bounded and cannot be used to access adjacent objects.

In CHERI C/C++ unsigned long cannot store the capability metadata
   ○ Casting from pointer to integer strips the capability metadata.
   ○ Usually flagged by the compiler by emitting a warning when creating a pointer from an integer.

Casting via uintptr_t generally resolves this problem.

Truncating capability metadata can result in crashes if converted back 
to a pointer.
"""
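
To see how these types compare on a given target, a small check along 
these lines can be compiled there.  It is only a sketch: the function 
names are mine, and __CHERI_PURE_CAPABILITY__ is the macro I understand 
CHERI Clang to define in pure-capability mode.  The output depends on 
the ABI, so none is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 when built as CHERI pure-capability code, 0 otherwise. */
static int is_purecap(void)
{
#ifdef __CHERI_PURE_CAPABILITY__
    return 1;
#else
    return 0;
#endif
}

/* Prints the sizes of the types mentioned in the slides. */
static void print_pointer_type_sizes(void)
{
    printf("purecap:               %d\n", is_purecap());
    printf("sizeof(void *):        %zu\n", sizeof(void *));
    printf("sizeof(uintptr_t):     %zu\n", sizeof(uintptr_t));
    printf("sizeof(size_t):        %zu\n", sizeof(size_t));
    printf("sizeof(unsigned long): %zu\n", sizeof(unsigned long));
}
```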

Looks like just changing the (size_t) cast to (uintptr_t) does the job: 
the code builds and works fine.  The capability metadata of the pointer 
is preserved in the uintptr_t value, so the pointer-to-integer 
conversion is fine, and so is converting the computed integer value 
back to a pointer afterwards.
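
As a minimal sketch of that round trip, here is the expression wrapped 
in a helper.  The name page_end is hypothetical, and I am assuming mask 
is ~(pagesize - 1) with pagesize a power of two, since the definition of 
mask is not shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the actual INN function: returns the first
 * page boundary strictly after p + length, using the same expression
 * as the modified line.
 */
static char *page_end(char *p, size_t length, size_t pagesize)
{
    /* The mask trick only works when pagesize is a power of two. */
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

    /* ASSUMPTION: mask clears the low bits that index into a page. */
    size_t mask = ~(pagesize - 1);

    /* Go through uintptr_t so that the conversion back to a pointer
     * starts from a value derived from the original pointer. */
    return (char *) (((uintptr_t) p + length + pagesize) & mask);
}
```

The result is only computed here, never dereferenced.  On CHERI it 
still carries the bounds of the original pointer, so it is usable only 
if it falls within them.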


Interestingly, the document also covers the other problem raised 
earlier in this discussion (accessing p[-1] before the start of the 
string):  "CHERI sometimes detects out-of-bounds accesses that are not 
noticed otherwise [...] [especially] reading beyond bounded buffers 
derived from string literals."
So true!
I'll go on reading to understand that unusual architecture a bit better.
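
For the p[-1] case, the usual guard is to compare against the start of 
the buffer before looking back.  A minimal sketch, with a hypothetical 
helper name and signature (this is not the code from the earlier 
message):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: tells whether the character just before p is c,
 * without ever reading before the start of the buffer.
 */
static bool prev_char_is(const char *start, const char *p, char c)
{
    /* p must point into (or one past the end of) the buffer at start. */
    assert(start != NULL && p != NULL && p >= start);

    /* Only look at p[-1] when p is past the first byte. */
    return p > start && p[-1] == c;
}
```

At the very start of the string the function returns false instead of 
reading the byte before it, which is the access CHERI traps on.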



>>>    // Total length of pages
>>>    size_t total_length = start_offset + length + end_offset;
>>
>> I'm unsure total_length always has the right value.  If end_offset is 
>> 0, total_length should be pagesize I think.
> 
> Are you sure?
> 
> As a concrete example, suppose:
>    pagesize = 4096
>    p is at the start of a page
>    length = 8192
> Then:
>    start_offset = 0
>    start = p
>    end_offset = (0+8192)&4095 = 0
>    total_length = 0+8192+0 = 8192
> which is surely what we want.

Indeed, agreed.
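
For the record, a small sketch of that arithmetic.  The struct and 
function names are hypothetical, and the definition of end_offset is my 
assumption (the padding from p + length up to the next page boundary, 0 
when already aligned), since only its value in the example is given 
above.

```c
#include <assert.h>
#include <stddef.h>

/* Field names follow the quoted snippet. */
struct page_span {
    size_t start_offset;  /* bytes between the start of the page and p */
    size_t end_offset;    /* ASSUMED: padding up to the next page boundary */
    size_t total_length;  /* start_offset + length + end_offset */
};

/* addr is the numeric address of p; only its low bits matter here. */
static struct page_span compute_span(size_t addr, size_t length,
                                     size_t pagesize)
{
    struct page_span s;

    /* pagesize must be a power of two for the masking to be valid. */
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

    s.start_offset = addr & (pagesize - 1);
    /* Unsigned subtraction wraps, so this is the distance to the next
     * page boundary, or 0 when start_offset + length is aligned. */
    s.end_offset = (0 - (s.start_offset + length)) & (pagesize - 1);
    s.total_length = s.start_offset + length + s.end_offset;
    return s;
}
```

With p at the start of a page, length = 8192 and pagesize = 4096, this 
gives start_offset = 0, end_offset = 0 and total_length = 8192, as in 
the example.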

-- 
Julien ÉLIE

« Non omnia possumus omnes. » (Virgile)

