|
\input texinfo @c -*- Texinfo -*-
|
||
|
@setfilename ctf-spec.info
|
||
|
@settitle The CTF File Format
|
||
|
@ifnottex
|
||
|
@xrefautomaticsectiontitle on
|
||
|
@end ifnottex
|
||
|
@synindex fn cp
|
||
|
@synindex tp cp
|
||
|
@synindex vr cp
|
||
|
|
||
|
@copying
|
||
|
Copyright @copyright{} 2021-2022 Free Software Foundation, Inc.
|
||
|
|
||
|
Permission is granted to copy, distribute and/or modify this document
|
||
|
under the terms of the GNU General Public License, Version 3 or any
|
||
|
later version published by the Free Software Foundation. A copy of the
|
||
|
license is included in the section entitled ``GNU General Public
|
||
|
License''.
|
||
|
|
||
|
@end copying
|
||
|
|
||
|
@dircategory Software development
|
||
|
@direntry
|
||
|
* CTF: (ctf-spec). The CTF file format.
|
||
|
@end direntry
|
||
|
|
||
|
@titlepage
|
||
|
@title The CTF File Format
|
||
|
@subtitle Version 3
|
||
|
@author Nick Alcock
|
||
|
|
||
|
@page
|
||
|
@vskip 0pt plus 1filll
|
||
|
@insertcopying
|
||
|
@end titlepage
|
||
|
@contents
|
||
|
|
||
|
@ifnottex
|
||
|
@node Top
|
||
|
@top The CTF file format
|
||
|
|
||
|
This manual describes version 3 of the CTF file format, which is
|
||
|
intended to model the C type system in a fashion that C programs can
|
||
|
consume at runtime.
|
||
|
@end ifnottex
|
||
|
|
||
|
@node Overview
|
||
|
@unnumbered Overview
|
||
|
@cindex Overview
|
||
|
|
||
|
The CTF file format compactly describes C types and the association
|
||
|
between function and data symbols and types: if embedded in ELF objects,
|
||
|
it can exploit the ELF string table to reduce duplication further.
|
||
|
There is no real concept of namespacing: only top-level types are
|
||
|
described, not types scoped to within single functions.
|
||
|
|
||
|
CTF dictionaries can be @dfn{children} of other dictionaries, in a
|
||
|
one-level hierarchy: child dictionaries can refer to types in the
|
||
|
parent, but the opposite is not sensible (since if you refer to a child
|
||
|
type in the parent, the actual type you cited would vary depending on
|
||
|
what child was attached). This parent/child definition is recorded in
|
||
|
the child, but only as a recommendation: users of the API have to attach
|
||
|
parents to children explicitly, and can choose to attach a child to any
|
||
|
parent they like, or to none, though doing so might lead to unpleasant
|
||
|
consequences like dangling references to types. @xref{Type indexes and
|
||
|
type IDs}. Type lookups in child dicts that are not associated with a
|
||
|
parent at all will fail with @code{ECTF_NOPARENT} if a parent type was
|
||
|
needed.
|
||
|
|
||
|
The associated API to generate, merge together, and query this file
|
||
|
format will be described in the accompanying @code{libctf} manual once
|
||
|
it is written. There is no API to modify dictionaries once they've been
|
||
|
written out: CTF is a write-once file format. (However, it is always
|
||
|
possible to dynamically create a new child dictionary on the fly and
|
||
|
attach it to a pre-existing, read-only parent.)
|
||
|
|
||
|
There are two major pieces to CTF: the @dfn{archive} and the
|
||
|
@dfn{dictionary}. Some relatives and ancestors of CTF call dictionaries
|
||
|
@dfn{containers}: the archive format is unique to this variant of CTF.
|
||
|
(Much of the source code still uses the old term.)
|
||
|
|
||
|
The archive file format is a very simple mmappable archive used to group
|
||
|
multiple dictionaries together: it is expected to slowly go
|
||
|
away and be replaced by other mechanisms, but right now it is an
|
||
|
important part of the file format, used to group dictionaries containing
|
||
|
types with conflicting definitions in different TUs with the overarching
|
||
|
dictionary used to store all other types. (Even when archives go away,
|
||
|
the @code{libctf} API used to access them will remain, and access the
|
||
|
other mechanisms that replace them instead.)
|
||
|
|
||
|
The CTF dictionary consists of a @dfn{preamble}, which does not vary
|
||
|
between versions of the CTF file format, and a @dfn{header} and some
|
||
|
number of @dfn{sections}, which can vary between versions.
|
||
|
|
||
|
The rest of this specification describes the format of these sections,
|
||
|
first for the latest version of CTF, then for all earlier versions
|
||
|
supported by @code{libctf}: the earlier versions are defined in terms of
|
||
|
their differences from the next later one. We describe each part of the
|
||
|
format first by reproducing the C structure which defines that part,
|
||
|
then describing it at greater length in terms of file offsets.
|
||
|
|
||
|
The description of the file format ends with a description of relevant
|
||
|
limits that apply to it. These limits can vary between file format
|
||
|
versions.
|
||
|
|
||
|
This document is quite young, so for now the C code in @file{ctf.h}
|
||
|
should be presumed correct when this document conflicts with it.
|
||
|
|
||
|
@node CTF archive
|
||
|
@chapter CTF archives
|
||
|
@cindex archive, CTF archive
|
||
|
|
||
|
The CTF archive format maps names to CTF dictionaries. The names may
|
||
|
contain any character other than \0, but for now archives containing
|
||
|
slashes in the names may not extract correctly. It is possible to
|
||
|
insert multiple members with the same name, but these are quite hard to
|
||
|
access reliably (you have to iterate through all the members rather than
|
||
|
opening by name) so this is not recommended.
|
||
|
|
||
|
CTF archives are not themselves compressed: the constituent components,
|
||
|
CTF dictionaries, can be compressed (@pxref{CTF header}).
|
||
|
|
||
|
CTF archives usually contain a collection of related dictionaries, one
|
||
|
parent and many children of that parent. CTF archives can have a member
|
||
|
with a @dfn{default name}, @code{.ctf} (which can be represented as
|
||
|
@code{NULL} in the API). If present, this member is usually the parent
|
||
|
of all the children, but it is possible for CTF producers to emit
|
||
|
parents with different names if they wish (usually for backward-
|
||
|
compatibility purposes).
|
||
|
|
||
|
@code{.ctf} sections in ELF objects consist of a single CTF dictionary
|
||
|
rather than an archive of dictionaries if and only if the section
|
||
|
contains no types with identical names but conflicting definitions: if
|
||
|
two conflicting definitions exist, the deduplicator will place the type
|
||
|
most commonly referred to by other types in the parent and will place
|
||
|
the other type in a child named after the translation unit it is found
|
||
|
in, and will emit a CTF archive containing both dictionaries instead of
|
||
|
a raw dictionary. All types that refer to such conflicting types are
|
||
|
also placed in the per-translation-unit child.
|
||
|
|
||
|
The definition of an archive in @file{ctf.h} is as follows:
|
||
|
|
||
|
@verbatim
struct ctf_archive
{
  uint64_t ctfa_magic;
  uint64_t ctfa_model;
  uint64_t ctfa_nfiles;
  uint64_t ctfa_names;
  uint64_t ctfa_ctfs;
};

typedef struct ctf_archive_modent
{
  uint64_t name_offset;
  uint64_t ctf_offset;
} ctf_archive_modent_t;
@end verbatim
|
||
|
|
||
|
(Note one irregularity here: the @code{ctf_archive_t} is not a typedef
|
||
|
to @code{struct ctf_archive}, but a different typedef, private to
|
||
|
@code{libctf}, so that things that are not really archives can be made
|
||
|
to appear as if they were.)
|
||
|
|
||
|
All the above items are always in little-endian byte order, regardless
|
||
|
of the machine endianness.
|
||
|
|
||
|
The archive header has the following fields:
|
||
|
|
||
|
@tindex struct ctf_archive
|
||
|
@multitable {Offset} {@code{uint64_t ctfa_nfiles}} {The data model for this archive: an arbitrary integer}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{uint64_t ctfa_magic}
|
||
|
@vindex ctfa_magic
|
||
|
@vindex struct ctf_archive, ctfa_magic
|
||
|
@tab The magic number for archives, @code{CTFA_MAGIC}: 0x8b47f2a4d7623eeb.
|
||
|
@tindex CTFA_MAGIC
|
||
|
|
||
|
@item 0x08
|
||
|
@tab @code{uint64_t ctfa_model}
|
||
|
@vindex ctfa_model
|
||
|
@vindex struct ctf_archive, ctfa_model
|
||
|
@tab The data model for this archive: an arbitrary integer that serves no
|
||
|
purpose but to be handed back by the libctf API. @xref{Data models}.
|
||
|
|
||
|
@item 0x10
|
||
|
@tab @code{uint64_t ctfa_nfiles}
|
||
|
@vindex ctfa_nfiles
|
||
|
@vindex struct ctf_archive, ctfa_nfiles
|
||
|
@tab The number of CTF dictionaries in this archive.
|
||
|
|
||
|
@item 0x18
|
||
|
@tab @code{uint64_t ctfa_names}
|
||
|
@vindex ctfa_names
|
||
|
@vindex struct ctf_archive, ctfa_names
|
||
|
@tab Offset of the name table, in bytes from the start of the archive.
|
||
|
The name table is an array of @code{ctf_archive_modent_t[ctfa_nfiles]}.
|
||
|
|
||
|
@item 0x20
|
||
|
@tab @code{uint64_t ctfa_ctfs}
|
||
|
@vindex ctfa_ctfs
|
||
|
@vindex struct ctf_archive, ctfa_ctfs
|
||
|
@tab Offset of the CTF table. Each element starts with a @code{uint64_t} size,
|
||
|
followed by a CTF dictionary.
|
||
|
|
||
|
@end multitable
|
||
|
|
||
|
The array pointed to by @code{ctfa_names} is an array of entries of
|
||
|
@code{ctf_archive_modent}:
|
||
|
|
||
|
@tindex struct ctf_archive_modent
|
||
|
@tindex ctf_archive_modent_t
|
||
|
@multitable {Offset} {@code{uint64_t name_offset}} {Offset of this name, in bytes from the start}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{uint64_t name_offset}
|
||
|
@vindex name_offset
|
||
|
@vindex struct ctf_archive_modent, name_offset
|
||
|
@vindex ctf_archive_modent_t, name_offset
|
||
|
@tab Offset of this name, in bytes from the start of the archive.
|
||
|
|
||
|
@item 0x08
|
||
|
@tab @code{uint64_t ctf_offset}
|
||
|
@vindex ctf_offset
|
||
|
@vindex struct ctf_archive_modent, ctf_offset
|
||
|
@vindex ctf_archive_modent_t, ctf_offset
|
||
|
@tab Offset of this CTF dictionary, in bytes from the start of the archive.
|
||
|
|
||
|
@end multitable
|
||
|
|
||
|
The @code{ctfa_names} array is sorted into ASCIIbetical order by name
|
||
|
(i.e. by the result of dereferencing the @code{name_offset}).
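
Because the entries are sorted, a consumer can find a member by binary
search rather than by a linear scan. The following is only a minimal
sketch of that idea, not a description of how @code{libctf} does it. It
assumes the entries have already been read into memory and each
@code{name_offset} has been resolved to a C string; the
@code{resolved_member} structure and the @code{find_member} function are
hypothetical names invented for this illustration.

@verbatim
```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of one ctf_archive_modent_t after its
   name_offset has been turned into a pointer to the name itself.  */
struct resolved_member
{
  const char *name;
  uint64_t ctf_offset;
};

/* bsearch comparator: KEY is the wanted name, ELEM an array entry.
   strcmp compares bytes as unsigned char, which is the ASCIIbetical
   order the array is sorted in.  */
static int
compare_name (const void *key, const void *elem)
{
  const struct resolved_member *m = elem;
  return strcmp (key, m->name);
}

/* Return the entry called NAME, or NULL if there is none.  If several
   members share a name, which of them is returned is unspecified.  */
static const struct resolved_member *
find_member (const struct resolved_member *tab, size_t n, const char *name)
{
  assert (tab != NULL && name != NULL);
  return bsearch (name, tab, n, sizeof *tab, compare_name);
}
```
@end verbatim

A lookup of this kind cannot tell apart members that share a name, which
is one reason such archives need a full iteration instead.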
|
||
|
|
||
|
The archive file also contains a name table and a table of CTF
|
||
|
dictionaries: these are pointed to by the structures above. The name
|
||
|
table is a simple strtab which is not required to be sorted; the
|
||
|
dictionary array is described above in the entry for @code{ctfa_ctfs}.
|
||
|
|
||
|
The relative order of these various parts is not defined, except that
|
||
|
the header naturally always comes first.
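
As an illustration of the header layout and its fixed little-endian byte
order, here is a minimal sketch of decoding and checking an archive
header from a byte buffer. The helper names @code{read_le64} and
@code{parse_archive_header} are assumptions made for this example and are
not part of @code{libctf}; the sketch validates only the buffer length
and the magic number, nothing else.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTFA_MAGIC 0x8b47f2a4d7623eebULL

struct ctf_archive
{
  uint64_t ctfa_magic;
  uint64_t ctfa_model;
  uint64_t ctfa_nfiles;
  uint64_t ctfa_names;
  uint64_t ctfa_ctfs;
};

/* Decode a little-endian 64-bit value, whatever the host byte order.  */
static uint64_t
read_le64 (const unsigned char *p)
{
  uint64_t v = 0;
  for (int i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Hypothetical helper: fill *HDR from the first 0x28 bytes of BUF.
   Returns 0 on success, -1 if BUF is too short or the magic is wrong.  */
static int
parse_archive_header (const unsigned char *buf, size_t len,
                      struct ctf_archive *hdr)
{
  assert (buf != NULL && hdr != NULL);
  if (len < 0x28)
    return -1;
  hdr->ctfa_magic = read_le64 (buf + 0x00);
  hdr->ctfa_model = read_le64 (buf + 0x08);
  hdr->ctfa_nfiles = read_le64 (buf + 0x10);
  hdr->ctfa_names = read_le64 (buf + 0x18);
  hdr->ctfa_ctfs = read_le64 (buf + 0x20);
  return hdr->ctfa_magic == CTFA_MAGIC ? 0 : -1;
}
```
@end verbatim

Decoding byte by byte, rather than casting the buffer to a
@code{struct ctf_archive}, keeps the sketch correct on big-endian hosts
and avoids alignment assumptions about the buffer.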
|
||
|
|
||
|
@node CTF dictionaries
|
||
|
@chapter CTF dictionaries
|
||
|
@cindex dictionary, CTF dictionary
|
||
|
|
||
|
CTF dictionaries consist of a header, starting with a preamble, and a
|
||
|
number of sections.
|
||
|
|
||
|
@node CTF Preamble
|
||
|
@section CTF Preamble
|
||
|
|
||
|
The preamble is the only part of the CTF dictionary whose format cannot
|
||
|
vary between versions. It is never compressed. It is correspondingly
|
||
|
simple:
|
||
|
|
||
|
@verbatim
typedef struct ctf_preamble
{
  unsigned short ctp_magic;
  unsigned char ctp_version;
  unsigned char ctp_flags;
} ctf_preamble_t;
@end verbatim
|
||
|
|
||
|
@code{#define}s are provided under the names @code{cth_magic},
|
||
|
@code{cth_version} and @code{cth_flags} to make the fields of the
|
||
|
@code{ctf_preamble_t} appear to be part of the @code{ctf_header_t}, so
|
||
|
consuming programs rarely need to consider the existence of the preamble
|
||
|
as a separate structure.
|
||
|
|
||
|
@tindex struct ctf_preamble
|
||
|
@tindex ctf_preamble_t
|
||
|
@multitable {Offset} {@code{unsigned char ctp_version}} {The magic number for CTF dictionaries}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{unsigned short ctp_magic}
|
||
|
@vindex ctp_magic
|
||
|
@vindex cth_magic
|
||
|
@vindex ctf_preamble_t, ctp_magic
|
||
|
@vindex struct ctf_preamble, ctp_magic
|
||
|
@vindex ctf_header_t, cth_magic
|
||
|
@vindex struct ctf_header, cth_magic
|
||
|
@tab The magic number for CTF dictionaries, @code{CTF_MAGIC}: 0xdff2.
|
||
|
@tindex CTF_MAGIC
|
||
|
|
||
|
@item 0x02
|
||
|
@tab @code{unsigned char ctp_version}
|
||
|
@vindex ctp_version
|
||
|
@vindex cth_version
|
||
|
@vindex ctf_preamble_t, ctp_version
|
||
|
@vindex struct ctf_preamble, ctp_version
|
||
|
@vindex ctf_header_t, cth_version
|
||
|
@vindex struct ctf_header, cth_version
|
||
|
@tab The version number of this CTF dictionary.
|
||
|
|
||
|
@item 0x03
|
||
|
@tab @code{unsigned char ctp_flags}
|
||
|
@vindex ctp_flags
|
||
|
@vindex cth_flags
|
||
|
@vindex ctf_preamble_t, ctp_flags
|
||
|
@vindex struct ctf_preamble, ctp_flags
|
||
|
@vindex ctf_header_t, cth_flags
|
||
|
@vindex struct ctf_header, cth_flags
|
||
|
@tab Flags for this CTF file. @xref{CTF file-wide flags}.
|
||
|
@end multitable
|
||
|
|
||
|
@cindex alignment
|
||
|
Every element of a dictionary must be naturally aligned unless otherwise
|
||
|
specified. (This restriction will be lifted in later versions.)
|
||
|
|
||
|
@cindex endianness
|
||
|
CTF dictionaries are stored in the native endianness of the system that
|
||
|
generates them: the consumer (e.g., @code{libctf}) can detect whether to
|
||
|
endian-flip a CTF dictionary by inspecting the @code{ctp_magic}. (If it
|
||
|
appears as 0xf2df, endian-flipping is needed.)
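
The magic-number check described above could be sketched as follows.
This is an illustration only: the @code{ctf_endian} result codes and the
@code{classify_magic} and @code{bswap16} helpers are hypothetical names,
and the real consumer may structure the check differently.

@verbatim
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CTF_MAGIC 0xdff2

/* Hypothetical result codes for this sketch.  */
enum ctf_endian { CTF_ENDIAN_NATIVE, CTF_ENDIAN_SWAPPED, CTF_ENDIAN_BAD };

/* Swap the two bytes of a 16-bit value.  */
static uint16_t
bswap16 (uint16_t v)
{
  return (uint16_t) ((v << 8) | (v >> 8));
}

/* Inspect the first two bytes of a dictionary (its ctp_magic) and report
   whether it was written in this host's byte order, in the opposite
   order, or is not a CTF dictionary at all.  */
static enum ctf_endian
classify_magic (const unsigned char *dict)
{
  uint16_t magic;

  assert (dict != NULL);
  memcpy (&magic, dict, sizeof magic);	/* Native-endian read.  */
  if (magic == CTF_MAGIC)
    return CTF_ENDIAN_NATIVE;
  if (magic == bswap16 (CTF_MAGIC))	/* Reads back as 0xf2df.  */
    return CTF_ENDIAN_SWAPPED;
  return CTF_ENDIAN_BAD;
}
```
@end verbatim

A consumer that gets @code{CTF_ENDIAN_SWAPPED} would then need to
byte-swap every multi-byte field it reads from the dictionary.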
|
||
|
|
||
|
The version of the CTF dictionary can be determined by inspecting
|
||
|
@code{ctp_version}. The following versions are currently valid, and
|
||
|
@code{libctf} can read all of them:
|
||
|
|
||
|
@tindex CTF_VERSION_3
|
||
|
@cindex CTF versions, versions
|
||
|
@multitable {@code{CTF_VERSION_1_UPGRADED_3}} {Number} {First version, rare. Very similar to Solaris CTF.}
|
||
|
@headitem Version @tab Number @tab Description
|
||
|
@item @code{CTF_VERSION_1}
|
||
|
@tab 1 @tab First version, rare. Very similar to Solaris CTF.
|
||
|
|
||
|
@item @code{CTF_VERSION_1_UPGRADED_3}
|
||
|
@tab 2 @tab First version, upgraded to v3 or higher and written out again.
|
||
|
Name may change. Very rare.
|
||
|
|
||
|
@item @code{CTF_VERSION_2}
|
||
|
@tab 3 @tab Second version, with many range limits lifted.
|
||
|
|
||
|
@item @code{CTF_VERSION_3}
|
||
|
@tab 4 @tab Third and current version, documented here.
|
||
|
@end multitable
|
||
|
|
||
|
This section documents @code{CTF_VERSION_3}.
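
A consumer might turn the version table above into a small check like
the one below. The numeric values are those in the table; the helper
functions @code{version_name} and @code{is_current_version}, and the
@code{example_usage} function that exercises them, are hypothetical and
exist only for this sketch.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Version numbers from the table above.  */
enum
{
  CTF_VERSION_1 = 1,
  CTF_VERSION_1_UPGRADED_3 = 2,
  CTF_VERSION_2 = 3,
  CTF_VERSION_3 = 4
};

/* Hypothetical helper: a printable name for CTP_VERSION, or NULL if the
   number is not one of the versions listed in this document.  */
static const char *
version_name (unsigned char ctp_version)
{
  switch (ctp_version)
    {
    case CTF_VERSION_1: return "CTF_VERSION_1";
    case CTF_VERSION_1_UPGRADED_3: return "CTF_VERSION_1_UPGRADED_3";
    case CTF_VERSION_2: return "CTF_VERSION_2";
    case CTF_VERSION_3: return "CTF_VERSION_3";
    default: return NULL;
    }
}

/* Hypothetical helper: nonzero if the layouts documented here (v3)
   apply to a dictionary with this ctp_version.  */
static int
is_current_version (unsigned char ctp_version)
{
  return ctp_version == CTF_VERSION_3;
}

static void
example_usage (void)
{
  assert (is_current_version (4));
  assert (!is_current_version (3));
  assert (version_name (2) != NULL);
  assert (version_name (9) == NULL);
}
```
@end verbatim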
|
||
|
|
||
|
@vindex ctp_flags
|
||
|
@node CTF file-wide flags
|
||
|
@subsection CTF file-wide flags
|
||
|
|
||
|
The preamble contains bitflags in its @code{ctp_flags} field that
|
||
|
describe various file-wide properties. Some of the flags are valid only
|
||
|
for particular file-format versions, which means the flags can be used
|
||
|
to fix file-format bugs. Consumers that see unknown flags should
|
||
|
accordingly assume that the dictionary is not comprehensible, and
|
||
|
refuse to open it.
|
||
|
|
||
|
The following flags are currently defined. Many are bug workarounds,
|
||
|
valid only in CTFv3, and will not be valid in any future versions: the
|
||
|
same values may be reused for other flags in v4+.
|
||
|
|
||
|
@multitable {@code{CTF_F_NEWFUNCINFO}} {Versions} {Value} {The external strtab is in @code{.dynstr} and the}
|
||
|
@headitem Flag @tab Versions @tab Value @tab Meaning
|
||
|
@tindex CTF_F_COMPRESS
|
||
|
@item @code{CTF_F_COMPRESS} @tab All @tab 0x1 @tab Compressed with zlib
|
||
|
@tindex CTF_F_NEWFUNCINFO
|
||
|
@item @code{CTF_F_NEWFUNCINFO} @tab 3 only @tab 0x2
|
||
|
@tab ``New-format'' func info section.
|
||
|
@tindex CTF_F_IDXSORTED
|
||
|
@item @code{CTF_F_IDXSORTED} @tab 3+ @tab 0x4 @tab The index section is
|
||
|
in sorted order
|
||
|
@tindex CTF_F_DYNSTR
|
||
|
@item @code{CTF_F_DYNSTR} @tab 3 only @tab 0x8 @tab The external strtab is
|
||
|
in @code{.dynstr} and the symtab used is @code{.dynsym}.
|
||
|
@xref{The string section}.
|
||
|
@end multitable
|
||
|
|
||
|
@code{CTF_F_NEWFUNCINFO} and @code{CTF_F_IDXSORTED} relate to the
|
||
|
function info and data object sections. @xref{The symtypetab sections}.
|
||
|
|
||
|
Further flags (and further compression methods) will be added in future.
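
The rule that consumers should refuse dictionaries carrying unknown flags
can be sketched as a simple mask test. The flag values below are those
from the table; @code{KNOWN_V3_FLAGS}, @code{flags_are_comprehensible},
@code{is_compressed} and @code{example_usage} are hypothetical names
used only in this example, which assumes a consumer that understands
exactly the four CTFv3 flags listed.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

#define CTF_F_COMPRESS    0x1
#define CTF_F_NEWFUNCINFO 0x2
#define CTF_F_IDXSORTED   0x4
#define CTF_F_DYNSTR      0x8

/* Hypothetical mask of every flag this sketch understands for CTFv3.  */
#define KNOWN_V3_FLAGS \
  (CTF_F_COMPRESS | CTF_F_NEWFUNCINFO | CTF_F_IDXSORTED | CTF_F_DYNSTR)

/* Nonzero if a v3 dictionary with these flags may be opened, zero if it
   carries at least one flag this consumer does not know.  */
static int
flags_are_comprehensible (uint8_t ctp_flags)
{
  return (ctp_flags & ~KNOWN_V3_FLAGS) == 0;
}

/* Nonzero if the body of the dictionary is zlib-compressed.  */
static int
is_compressed (uint8_t ctp_flags)
{
  return (ctp_flags & CTF_F_COMPRESS) != 0;
}

static void
example_usage (void)
{
  assert (flags_are_comprehensible (CTF_F_COMPRESS | CTF_F_IDXSORTED));
  assert (!flags_are_comprehensible (0x10));
  assert (is_compressed (CTF_F_COMPRESS | CTF_F_IDXSORTED));
}
```
@end verbatim

Since flag values may be reused with other meanings in later format
versions, a real consumer would select the mask according to
@code{ctp_version} before testing it.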
|
||
|
|
||
|
@node CTF header
|
||
|
@section CTF header
|
||
|
@cindex CTF header
|
||
|
@cindex Sections, header
|
||
|
|
||
|
The CTF header is the first part of a CTF dictionary, including the
|
||
|
preamble. All parts of it other than the preamble (@pxref{CTF Preamble})
|
||
|
can vary between CTF file versions and are never compressed. It
|
||
|
contains things that apply to the dictionary as a whole, and a table of
|
||
|
the sections into which the rest of the dictionary is divided. The
|
||
|
sections tile the file: each section runs from the offset given until
|
||
|
the start of the next section. Only the last section cannot follow this
|
||
|
rule, so the header has a length for it instead.
|
||
|
|
||
|
All section offsets, here and in the rest of the CTF file, are relative to the
|
||
|
@emph{end} of the header. (This is annoyingly different to how offsets in CTF
|
||
|
archives are handled.)
|
||
|
|
||
|
This is the first structure to include offsets into the string table, which are
|
||
|
not straight references because CTF dictionaries can include references into the
|
||
|
ELF string table to save space, as well as into the string table internal to the
|
||
|
CTF dictionary. @xref{The string section}, for more on these. Offset 0 is
|
||
|
always the null string.
|
||
|
|
||
|
@verbatim
typedef struct ctf_header
{
  ctf_preamble_t cth_preamble;
  uint32_t cth_parlabel;
  uint32_t cth_parname;
  uint32_t cth_cuname;
  uint32_t cth_lbloff;
  uint32_t cth_objtoff;
  uint32_t cth_funcoff;
  uint32_t cth_objtidxoff;
  uint32_t cth_funcidxoff;
  uint32_t cth_varoff;
  uint32_t cth_typeoff;
  uint32_t cth_stroff;
  uint32_t cth_strlen;
} ctf_header_t;
@end verbatim
|
||
|
|
||
|
In detail:
|
||
|
|
||
|
@tindex struct ctf_header
|
||
|
@tindex ctf_header_t
|
||
|
@multitable {Offset} {@code{ctf_preamble_t cth_preamble}} {The parent label, if deduplication happened against}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{ctf_preamble_t cth_preamble}
|
||
|
@vindex cth_preamble
|
||
|
@vindex struct ctf_header, cth_preamble
|
||
|
@vindex ctf_header_t, cth_preamble
|
||
|
@tab The preamble (conceptually embedded in the header). @xref{CTF Preamble}.
|
||
|
|
||
|
@item 0x04
|
||
|
@tab @code{uint32_t cth_parlabel}
|
||
|
@vindex cth_parlabel
|
||
|
@vindex struct ctf_header, cth_parlabel
|
||
|
@vindex ctf_header_t, cth_parlabel
|
||
|
@tab The parent label, if deduplication happened against a specific label: a
|
||
|
strtab offset. @xref{The label section}. Currently unused and always 0, but may
|
||
|
be used in future when semantics are attached to the label section.
|
||
|
|
||
|
@item 0x08
|
||
|
@tab @code{uint32_t cth_parname}
|
||
|
@vindex cth_parname
|
||
|
@vindex struct ctf_header, cth_parname
|
||
|
@vindex ctf_header_t, cth_parname
|
||
|
@tab The name of the parent dictionary deduplicated against: a strtab offset.
|
||
|
Interpretation is up to the consumer (usually a CTF archive member name). 0
|
||
|
(the null string) if this is not a child dictionary.
|
||
|
|
||
|
@item 0x0c
|
||
|
@tab @code{uint32_t cth_cuname}
|
||
|
@vindex cth_cuname
|
||
|
@vindex struct ctf_header, cth_cuname
|
||
|
@vindex ctf_header_t, cth_cuname
|
||
|
@tab The name of the compilation unit, for consumers like GDB that want to
|
||
|
know which CU a single-CU dictionary describes: a strtab offset. 0 if this
|
||
|
dictionary describes types from many CUs.
|
||
|
|
||
|
@item 0x10
|
||
|
@tab @code{uint32_t cth_lbloff}
|
||
|
@vindex cth_lbloff
|
||
|
@vindex struct ctf_header, cth_lbloff
|
||
|
@vindex ctf_header_t, cth_lbloff
|
||
|
@tab The offset of the label section, which tiles the type space into
|
||
|
named regions. @xref{The label section}.
|
||
|
|
||
|
@item 0x14
|
||
|
@tab @code{uint32_t cth_objtoff}
|
||
|
@vindex cth_objtoff
|
||
|
@vindex struct ctf_header, cth_objtoff
|
||
|
@vindex ctf_header_t, cth_objtoff
|
||
|
@tab The offset of the data object symtypetab section, which maps ELF data symbols to
|
||
|
types. @xref{The symtypetab sections}.
|
||
|
|
||
|
@item 0x18
|
||
|
@tab @code{uint32_t cth_funcoff}
|
||
|
@vindex cth_funcoff
|
||
|
@vindex struct ctf_header, cth_funcoff
|
||
|
@vindex ctf_header_t, cth_funcoff
|
||
|
@tab The offset of the function info symtypetab section, which maps ELF function
|
||
|
symbols to a return type and arg types. @xref{The symtypetab sections}.
|
||
|
|
||
|
@item 0x1c
|
||
|
@tab @code{uint32_t cth_objtidxoff}
|
||
|
@vindex cth_objtidxoff
|
||
|
@vindex struct ctf_header, cth_objtidxoff
|
||
|
@vindex ctf_header_t, cth_objtidxoff
|
||
|
@tab The offset of the object index section, which maps ELF object symbols to
|
||
|
entries in the data object section. @xref{The symtypetab sections}.
|
||
|
|
||
|
@item 0x20
|
||
|
@tab @code{uint32_t cth_funcidxoff}
|
||
|
@vindex cth_funcidxoff
|
||
|
@vindex struct ctf_header, cth_funcidxoff
|
||
|
@vindex ctf_header_t, cth_funcidxoff
|
||
|
@tab The offset of the function info index section, which maps ELF function
|
||
|
symbols to entries in the function info section. @xref{The symtypetab sections}.
|
||
|
|
||
|
@item 0x24
|
||
|
@tab @code{uint32_t cth_varoff}
|
||
|
@vindex cth_varoff
|
||
|
@vindex struct ctf_header, cth_varoff
|
||
|
@vindex ctf_header_t, cth_varoff
|
||
|
@tab The offset of the variable section, which maps string names to types.
|
||
|
@xref{The variable section}.
|
||
|
|
||
|
@item 0x28
|
||
|
@tab @code{uint32_t cth_typeoff}
|
||
|
@vindex cth_typeoff
|
||
|
@vindex struct ctf_header, cth_typeoff
|
||
|
@vindex ctf_header_t, cth_typeoff
|
||
|
@tab The offset of the type section, the core of CTF, which describes types
|
||
|
using variable-length array elements. @xref{The type section}.
|
||
|
|
||
|
@item 0x2c
|
||
|
@tab @code{uint32_t cth_stroff}
|
||
|
@vindex cth_stroff
|
||
|
@vindex struct ctf_header, cth_stroff
|
||
|
@vindex ctf_header_t, cth_stroff
|
||
|
@tab The offset of the string section. @xref{The string section}.
|
||
|
|
||
|
@item 0x30
|
||
|
@tab @code{uint32_t cth_strlen}
|
||
|
@vindex cth_strlen
|
||
|
@vindex struct ctf_header, cth_strlen
|
||
|
@vindex ctf_header_t, cth_strlen
|
||
|
@tab The length of the string section (not an offset!). The CTF file ends
|
||
|
at this point.
|
||
|
|
||
|
@end multitable
|
||
|
|
||
|
Everything from this point on (until the end of the file at @code{cth_stroff} +
|
||
|
@code{cth_strlen}) is compressed with zlib if @code{CTF_F_COMPRESS} is set in
|
||
|
the preamble's @code{ctp_flags}.
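
Since the sections tile the file, the length of each section can be
derived from the header alone. The following is a minimal sketch of that
calculation. It makes two assumptions that this document does not state
outright: that the sections appear in the file in the same order as
their offset fields appear in the header, and that the offsets describe
the uncompressed data. The @code{section_extent} structure, the
@code{SEC_} constants, and the two helper functions are hypothetical
names.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ctf_preamble
{
  unsigned short ctp_magic;
  unsigned char ctp_version;
  unsigned char ctp_flags;
} ctf_preamble_t;

typedef struct ctf_header
{
  ctf_preamble_t cth_preamble;
  uint32_t cth_parlabel, cth_parname, cth_cuname;
  uint32_t cth_lbloff, cth_objtoff, cth_funcoff;
  uint32_t cth_objtidxoff, cth_funcidxoff;
  uint32_t cth_varoff, cth_typeoff, cth_stroff, cth_strlen;
} ctf_header_t;

/* Every field after the 4-byte preamble is 4 bytes wide, so cth_cuname
   directly follows cth_parname and the header is 0x34 bytes long.  */
_Static_assert (offsetof (ctf_header_t, cth_cuname) == 0x0c, "cth_cuname");
_Static_assert (offsetof (ctf_header_t, cth_strlen) == 0x30, "cth_strlen");
_Static_assert (sizeof (ctf_header_t) == 0x34, "header size");

/* Hypothetical: one section's extent, relative to the end of the header.  */
struct section_extent { uint32_t start; uint32_t len; };

enum { SEC_LABEL, SEC_OBJT, SEC_FUNC, SEC_OBJTIDX, SEC_FUNCIDX,
       SEC_VAR, SEC_TYPE, SEC_STR, SEC_COUNT };

/* Fill OUT with the extent of each section.  Returns 0 on success, -1 if
   the offsets are not in ascending order or the string section wraps.  */
static int
section_extents (const ctf_header_t *h, struct section_extent out[SEC_COUNT])
{
  assert (h != NULL && out != NULL);
  const uint32_t off[SEC_COUNT] = {
    h->cth_lbloff, h->cth_objtoff, h->cth_funcoff, h->cth_objtidxoff,
    h->cth_funcidxoff, h->cth_varoff, h->cth_typeoff, h->cth_stroff
  };

  for (int i = 0; i < SEC_COUNT; i++)
    {
      /* Each section ends where the next begins; only the last one
	 (the string section) has an explicit length.  */
      uint32_t end = (i + 1 < SEC_COUNT) ? off[i + 1]
					 : off[i] + h->cth_strlen;
      if (end < off[i])
	return -1;
      out[i].start = off[i];
      out[i].len = end - off[i];
    }
  return 0;
}

/* Section offsets are relative to the end of the header.  */
static size_t
section_file_offset (uint32_t start)
{
  return sizeof (ctf_header_t) + start;
}
```
@end verbatim

A section whose computed length is zero is simply absent from the
dictionary.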
|
||
|
|
||
|
@node The type section
|
||
|
@section The type section
|
||
|
@cindex Type section
|
||
|
@cindex Sections, type
|
||
|
|
||
|
This section is the most important section in CTF, describing all the top-level
|
||
|
types in the program. It consists of an array of type structures, each of which
|
||
|
describes a type of some @dfn{kind}: each kind of type has some amount of
|
||
|
variable-length data associated with it (some kinds have none). The amount of
|
||
|
variable-length data associated with a given type can be determined by
|
||
|
inspecting the type, so the reading code can walk through the types in sequence
|
||
|
at opening time.
|
||
|
|
||
|
Each type structure is one of a set of overlapping structures in a discriminated
|
||
|
union of sorts: the variable-length data for each type immediately follows the
|
||
|
type's type structure. Here's the largest of the overlapping structures, which
|
||
|
is only needed for huge types and so is very rarely seen:
|
||
|
|
||
|
@verbatim
typedef struct ctf_type
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  __extension__
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
  uint32_t ctt_lsizehi;
  uint32_t ctt_lsizelo;
} ctf_type_t;
@end verbatim
|
||
|
|
||
|
Here's the much more common smaller form:
|
||
|
|
||
|
@verbatim
typedef struct ctf_stype
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  __extension__
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
} ctf_stype_t;
@end verbatim
|
||
|
|
||
|
If @code{ctt_size} is the #define @code{CTF_LSIZE_SENT}, 0xffffffff, this type
|
||
|
is described by a @code{ctf_type_t}: otherwise, a @code{ctf_stype_t}.
|
||
|
@tindex CTF_LSIZE_SENT
|
||
|
|
||
|
Here's what the fields mean:
|
||
|
|
||
|
@tindex struct ctf_type
|
||
|
@tindex struct ctf_stype
|
||
|
@tindex ctf_type_t
|
||
|
@tindex ctf_stype_t
|
||
|
@multitable {0x1c (@code{ctf_type_t}} {@code{uint32_t ctt_lsizehi}} {The size of this type, if this type is of a kind for}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{uint32_t ctt_name}
|
||
|
@vindex ctt_name
|
||
|
@tab Strtab offset of the type name, if any (0 if none).
|
||
|
|
||
|
@item 0x04
|
||
|
@tab @code{uint32_t ctt_info}
|
||
|
@vindex ctt_info
|
||
|
@vindex struct ctf_type, ctt_info
|
||
|
@vindex ctf_type_t, ctt_info
|
||
|
@vindex struct ctf_stype, ctt_info
|
||
|
@vindex ctf_stype_t, ctt_info
|
||
|
@tab The @dfn{info word}, containing information on the kind of this type, its
|
||
|
variable-length data and whether it is visible to name lookup. @xref{The
|
||
|
info word}.
|
||
|
|
||
|
@item 0x08
|
||
|
@tab @code{uint32_t ctt_size}
|
||
|
@vindex ctt_size
|
||
|
@vindex struct ctf_type, ctt_size
|
||
|
@vindex ctf_type_t, ctt_size
|
||
|
@vindex struct ctf_stype, ctt_size
|
||
|
@vindex ctf_stype_t, ctt_size
|
||
|
@tab The size of this type, if this type is of a kind for which a size needs
|
||
|
to be recorded (constant-size types don't need one). If this is
|
||
|
@code{CTF_LSIZE_SENT}, this type is a huge type described by @code{ctf_type_t}.
|
||
|
|
||
|
@item 0x08
|
||
|
@tab @code{uint32_t ctt_type}
|
||
|
@vindex ctt_type
|
||
|
@vindex struct ctf_stype, ctt_type
|
||
|
@vindex ctf_stype_t, ctt_type
|
||
|
@tab The type this type refers to, if this type is of a kind which refers to
|
||
|
other types (like a pointer). All such types are fixed-size, and no types that
|
||
|
are variable-size refer to other types, so @code{ctt_size} and @code{ctt_type}
|
||
|
overlap. All type kinds that use @code{ctt_type} are described by
|
||
|
@code{ctf_stype_t}, not @code{ctf_type_t}. @xref{Type indexes and type IDs}.
|
||
|
|
||
|
@item 0x0c (@code{ctf_type_t} only)
|
||
|
@tab @code{uint32_t ctt_lsizehi}
|
||
|
@vindex ctt_lsizehi
|
||
|
@vindex struct ctf_type, ctt_lsizehi
|
||
|
@vindex ctf_type_t, ctt_lsizehi
|
||
|
@tab The high 32 bits of the size of a very large type. The @code{CTF_TYPE_LSIZE} macro
|
||
|
can be used to get a 64-bit size out of this field and the next one.
|
||
|
@code{CTF_SIZE_TO_LSIZE_HI} splits the @code{ctt_lsizehi} out of it again.
|
||
|
@findex CTF_TYPE_LSIZE
|
||
|
@findex CTF_SIZE_TO_LSIZE_HI
|
||
|
|
||
|
@item 0x10 (@code{ctf_type_t} only)
|
||
|
@tab @code{uint32_t ctt_lsizelo}
|
||
|
@vindex ctt_lsizelo
|
||
|
@vindex struct ctf_type, ctt_lsizelo
|
||
|
@vindex ctf_type_t, ctt_lsizelo
|
||
|
@tab The low 32 bits of the size of a very large type.
|
||
|
@code{CTF_SIZE_TO_LSIZE_LO} splits the @code{ctt_lsizelo} out of a 64-bit size.
|
||
|
@findex CTF_SIZE_TO_LSIZE_LO
|
||
|
@end multitable
|
||
|
|
||
|
Two aspects of this need further explanation: the info word, and what exactly a
|
||
|
type ID is and how you determine it. (Information on the various type-kind-
|
||
|
dependent things, like whether @code{ctt_size} or @code{ctt_type} is used,
|
||
|
is described in the section devoted to each kind.)
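
The rule for choosing between @code{ctf_type_t} and @code{ctf_stype_t},
and for reassembling the 64-bit size of a huge type, can be sketched as
follows. This is an illustration rather than the @code{libctf}
implementation: @code{type_header} is a hypothetical helper, the record
is assumed to be in native byte order, and the result is meaningful only
for kinds that use @code{ctt_size} rather than @code{ctt_type}.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CTF_LSIZE_SENT 0xffffffffu

typedef struct ctf_stype
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  union			/* Anonymous union, standard since C11.  */
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
} ctf_stype_t;

typedef struct ctf_type
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
  uint32_t ctt_lsizehi;
  uint32_t ctt_lsizelo;
} ctf_type_t;

/* Hypothetical helper: examine the type record at P, store the type's
   size in *SIZE, and return how many bytes the fixed part of the record
   occupies (that is, where its variable-length data begins).  */
static size_t
type_header (const unsigned char *p, uint64_t *size)
{
  ctf_stype_t s;

  assert (p != NULL && size != NULL);
  memcpy (&s, p, sizeof s);		/* Always safe: the common prefix.  */
  if (s.ctt_size != CTF_LSIZE_SENT)
    {
      *size = s.ctt_size;
      return sizeof (ctf_stype_t);
    }

  /* The sentinel says the two extra words are present.  */
  ctf_type_t t;
  memcpy (&t, p, sizeof t);
  *size = ((uint64_t) t.ctt_lsizehi << 32) | t.ctt_lsizelo;
  return sizeof (ctf_type_t);
}
```
@end verbatim

Reading the short form first means the sketch never touches the two
extra words unless the sentinel shows they exist.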
|
||
|
|
||
|
@node The info word
|
||
|
@subsection The info word, ctt_info
|
||
|
|
||
|
The info word is a bitfield split into three parts. From MSB to LSB:
|
||
|
|
||
|
@multitable {Bit offset} {@code{isroot}} {Length of variable-length data for this type (some kinds only).}
|
||
|
@headitem Bit offset @tab Name @tab Description
|
||
|
@item 26--31
|
||
|
@tab @code{kind}
|
||
|
@tab Type kind: @pxref{Type kinds}.
|
||
|
|
||
|
@item 25
|
||
|
@tab @code{isroot}
|
||
|
@tab 1 if this type is visible to name lookup
|
||
|
|
||
|
@item 0--24
|
||
|
@tab @code{vlen}
|
||
|
@tab Length of variable-length data for this type (some kinds only).
|
||
|
The variable-length data directly follows the @code{ctf_type_t} or
|
||
|
@code{ctf_stype_t}. This is a kind-dependent array length value,
|
||
|
not a length in bytes. Some kinds have no variable-length data, or
|
||
|
fixed-size variable-length data, and do not use this value.
|
||
|
@end multitable
|
||
|
|
||
|
The most mysterious of these is undoubtedly @code{isroot}. This indicates
|
||
|
whether types with names (nonzero @code{ctt_name}) are visible to name lookup:
|
||
|
if zero, this type is considered a @dfn{non-root type} and you can't look it up
|
||
|
by name at all. Multiple types with the same name in the same C namespace
|
||
|
(struct, union, enum, other) can exist in a single dictionary, but only one of
|
||
|
them may have a nonzero value for @code{isroot}. @code{libctf} validates this
|
||
|
at open time and refuses to open dictionaries that violate this constraint.
|
||
|
|
||
|
Historically, this feature was introduced for the encoding of bitfields
|
||
|
(@pxref{Integer types}): for instance, int bitfields will all be named
|
||
|
@code{int} with different widths or offsets, but only the full-width one at
|
||
|
offset zero is wanted when you look up the type named @code{int}. With the
|
||
|
introduction of slices (@pxref{Slices}) as a more general bitfield encoding
|
||
|
mechanism, this is less important, but we still use non-root types to handle
|
||
|
conflicts if the linker API is used to fuse multiple translation units into one
|
||
|
dictionary and those translation units contain types with the same name and
|
||
|
conflicting definitions. (We do not discuss this further here, because the
|
||
|
linker never does this: only specialized type mergers do, like that used for the
|
||
|
Linux kernel. The libctf documentation will describe this in more detail.)
|
||
|
@c XXX update when libctf docs are written.
|
||
|
|
||
|
The @code{CTF_TYPE_INFO} macro can be used to compose an info word from
|
||
|
a @code{kind}, @code{isroot}, and @code{vlen}; @code{CTF_V2_INFO_KIND},
|
||
|
@code{CTF_V2_INFO_ISROOT} and @code{CTF_V2_INFO_VLEN} pick it apart again.
|
||
|
@findex CTF_TYPE_INFO
|
||
|
@findex CTF_V2_INFO_KIND
|
||
|
@findex CTF_V2_INFO_ISROOT
|
||
|
@findex CTF_V2_INFO_VLEN
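
To make the bit layout concrete, here is a minimal sketch of composing
and decomposing an info word. The shifts and mask are taken directly
from the table above, and all the names (@code{make_info},
@code{info_kind}, @code{info_isroot}, @code{info_vlen}) are hypothetical
stand-ins rather than the real macros: if the macros in @file{ctf.h}
disagree with this sketch, @file{ctf.h} is authoritative.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Bit positions from the table above (hypothetical names).  */
#define INFO_KIND_SHIFT   26			/* Bits 26-31.  */
#define INFO_KIND_MASK    UINT32_C (0x3f)
#define INFO_ISROOT_SHIFT 25			/* Bit 25.  */
#define INFO_VLEN_MASK    ((UINT32_C (1) << 25) - 1)	/* Bits 0-24.  */

/* Compose an info word from its three parts.  */
static uint32_t
make_info (uint32_t kind, int isroot, uint32_t vlen)
{
  return ((kind & INFO_KIND_MASK) << INFO_KIND_SHIFT)
	 | ((isroot ? UINT32_C (1) : UINT32_C (0)) << INFO_ISROOT_SHIFT)
	 | (vlen & INFO_VLEN_MASK);
}

static uint32_t
info_kind (uint32_t info)
{
  return info >> INFO_KIND_SHIFT;
}

static int
info_isroot (uint32_t info)
{
  return (int) ((info >> INFO_ISROOT_SHIFT) & 1);
}

static uint32_t
info_vlen (uint32_t info)
{
  return info & INFO_VLEN_MASK;
}

static void
example_usage (void)
{
  /* An arbitrary kind value of 6, visible to lookup, with vlen 3.  */
  uint32_t info = make_info (6, 1, 3);

  assert (info == UINT32_C (0x1a000003));
  assert (info_kind (info) == 6);
  assert (info_isroot (info) == 1);
  assert (info_vlen (info) == 3);
}
```
@end verbatim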
|
||
|
|
||
|
@node Type indexes and type IDs
|
||
|
@subsection Type indexes and type IDs
|
||
|
@cindex Type indexes
|
||
|
@cindex Type IDs
|
||
|
@cindex Type, IDs of
|
||
|
@cindex Type, indexes of
|
||
|
@cindex ctf_id_t
|
||
|
|
||
|
@cindex Parent range
|
||
|
@cindex Child range
|
||
|
@cindex Type IDs, ranges
|
||
|
Types are referred to within the CTF file via @dfn{type IDs}. A type ID is a
|
||
|
number from 0 to @math{2^32-1}, from a space divided in half. Types @math{2^31-1}
|
||
|
and below are in the @dfn{parent range}: these IDs are used for dictionaries
|
||
|
that have not had any other dictionary @code{ctf_import}ed into them as a parent.
|
||
|
Both completely standalone dictionaries and parent dictionaries with children
|
||
|
hanging off them have types in this range. Types @math{2^31} and above are in
|
||
|
the @dfn{child range}: only types in child dictionaries are in this range.
|
||
|
|
||
|
These IDs appear in @code{ctf_type_t.ctt_type} (@pxref{The type section}), but
|
||
|
the types themselves have no visible ID: quite intentionally, because adding an
|
||
|
ID uses space, and every ID is different so they don't compress well. The IDs
|
||
|
are implicit: at open time, the consumer walks through the entire type section
and counts the types in it.  The type section is an array of
|
||
|
variable-length elements, so each entry could be considered as having an index,
|
||
|
starting from 1. We count these indexes and associate each with its
|
||
|
corresponding @code{ctf_type_t} or @code{ctf_stype_t}.
|
||
|
|
||
|
Lookups of types with IDs in the parent space look in the parent dictionary if
|
||
|
this dictionary has one associated with it; lookups of types with IDs in the
|
||
|
child space error out if the dictionary does not have a parent, and otherwise
|
||
|
convert the ID into an index by shaving off the top bit and look up the index
|
||
|
in the child.
|
||
|
|
||
|
These properties mean that the same dictionary can be used as a parent of child
|
||
|
dictionaries and can also be used directly with no children at all, but a
|
||
|
dictionary created as a child dictionary must always be associated with a parent
|
||
|
--- usually, the same parent --- because its references to its own types have
|
||
|
the high bit turned on and this is only flipped off again if this is a child
|
||
|
dictionary. (This is not a problem, because if you @emph{don't} associate the
|
||
|
child with a parent, any references within it to its parent types will fail, and
|
||
|
there are almost certain to be many such references, or why is it a child at
|
||
|
all?)
|
||
|
|
||
|
This does mean that consumers should keep a close eye on the distinction between
|
||
|
type IDs and type indexes: if you mix them up, everything will appear to work as
|
||
|
long as you're only using parent dictionaries or standalone dictionaries, but as
|
||
|
soon as you start using children, everything will fail horribly.
|
||
|
|
||
|
Type index zero, and type ID zero, are used to indicate that this type cannot be
|
||
|
represented in CTF as currently constituted: they are emitted by the compiler,
|
||
|
but all type chains that terminate in the unknown type are erased at link time
|
||
|
(structure fields that use them just vanish, etc). So you will probably never
|
||
|
see a use of type zero outside the symtypetab sections, where they serve as
|
||
|
sentinels of sorts, to indicate symbols with no associated type.
|
||
|
|
||
|
The macros @code{CTF_V2_TYPE_TO_INDEX} and @code{CTF_V2_INDEX_TO_TYPE} may help
|
||
|
in translation between types and indexes: @code{CTF_V2_TYPE_ISPARENT} and
|
||
|
@code{CTF_V2_TYPE_ISCHILD} can be used to tell whether a given ID is in the
|
||
|
parent or child range.
|
||
|
@findex CTF_V2_TYPE_TO_INDEX
|
||
|
@findex CTF_V2_INDEX_TO_TYPE
|
||
|
@findex CTF_V2_TYPE_ISPARENT
|
||
|
@findex CTF_V2_TYPE_ISCHILD
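
The translation between IDs and indexes can be sketched as follows.  The
@code{ex_} functions are hypothetical helpers written for this example, not
the @file{ctf.h} macros, and they assume that the parent range ends at
@math{2^31-1} so that the child flag is simply the high bit.

```c
#include <stdint.h>

/* Illustrative stand-in: the largest type ID in the parent range.  */
#define EX_MAX_PTYPE 0x7fffffffu

/* Nonzero if ID lies in the child range (high bit set).  */
static int
ex_type_ischild (uint32_t id)
{
  return id > EX_MAX_PTYPE;
}

/* Strip the high bit to get the index into the type section.  */
static uint32_t
ex_type_to_index (uint32_t id)
{
  return id & EX_MAX_PTYPE;
}

/* Turn an index back into an ID, setting the high bit for children.  */
static uint32_t
ex_index_to_type (uint32_t idx, int child)
{
  return child ? (idx | (EX_MAX_PTYPE + 1)) : idx;
}
```

For parent and standalone dictionaries the ID and the index are numerically
equal, which is why confusing the two goes unnoticed until a child dictionary
is involved.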
|
||
|
|
||
|
It is quite possible and indeed common for type IDs to point forward in the
|
||
|
dictionary, as well as backward.
|
||
|
|
||
|
@node Type kinds
|
||
|
@subsection Type kinds
|
||
|
@cindex Type kinds
|
||
|
@cindex Type, kinds of
|
||
|
|
||
|
Every type in CTF is of some @dfn{kind}. Each kind is some variety of C type:
|
||
|
all structures are a single kind, as are all unions, all pointers, all arrays,
|
||
|
all integers regardless of their bitfield width, etc. The kind of a type is
|
||
|
given in the @code{kind} field of the @code{ctt_info} word (@pxref{The info
|
||
|
word}).
|
||
|
|
||
|
The space of type kinds is only a quarter full so far, so there is plenty of
|
||
|
room for expansion. It is likely that in future versions of the file format,
|
||
|
types with smaller kinds will be more efficiently encoded than types with larger
|
||
|
kinds, so their numerical value will actually start to matter in future. (So
|
||
|
these kinds will probably change their numerical values in a later release of this
|
||
|
format, to move more frequently-used kinds like structures and cv-quals towards
|
||
|
the top of the space, and move rarely-used kinds like integers downwards. Yes,
|
||
|
integers are rare: how many kinds of @code{int} are there in a program? They're
|
||
|
just very frequently @emph{referenced}.)
|
||
|
|
||
|
Here's the set of kinds so far. Each kind has a @code{#define} associated with
|
||
|
it, also given here.
|
||
|
|
||
|
@multitable {Kind} {@code{CTF_K_VOLATILE}} {Indicates a type that cannot be represented in CTF, or that} {@xref{Pointers typedefs and cvr-quals}}
|
||
|
@headitem Kind @tab Macro @tab Purpose
|
||
|
@item 0
|
||
|
@tab @code{CTF_K_UNKNOWN}
|
||
|
@tab Indicates a type that cannot be represented in CTF, or that is being skipped.
|
||
|
It is very similar to type ID 0, except that you can have @emph{multiple}, distinct types
|
||
|
of kind @code{CTF_K_UNKNOWN}.
|
||
|
@tindex CTF_K_UNKNOWN
|
||
|
|
||
|
@item 1
|
||
|
@tab @code{CTF_K_INTEGER}
|
||
|
@tab An integer type. @xref{Integer types}.
|
||
|
|
||
|
@item 2
|
||
|
@tab @code{CTF_K_FLOAT}
|
||
|
@tab A floating-point type. @xref{Floating-point types}.
|
||
|
|
||
|
@item 3
|
||
|
@tab @code{CTF_K_POINTER}
|
||
|
@tab A pointer. @xref{Pointers typedefs and cvr-quals}.
|
||
|
|
||
|
@item 4
|
||
|
@tab @code{CTF_K_ARRAY}
|
||
|
@tab An array. @xref{Arrays}.
|
||
|
|
||
|
@item 5
|
||
|
@tab @code{CTF_K_FUNCTION}
|
||
|
@tab A function pointer. @xref{Function pointers}.
|
||
|
|
||
|
@item 6
|
||
|
@tab @code{CTF_K_STRUCT}
|
||
|
@tab A structure. @xref{Structs and unions}.
|
||
|
|
||
|
@item 7
|
||
|
@tab @code{CTF_K_UNION}
|
||
|
@tab A union. @xref{Structs and unions}.
|
||
|
|
||
|
@item 8
|
||
|
@tab @code{CTF_K_ENUM}
|
||
|
@tab An enumerated type. @xref{Enums}.
|
||
|
|
||
|
@item 9
|
||
|
@tab @code{CTF_K_FORWARD}
|
||
|
@tab A forward declaration.  @xref{Forward declarations}.
|
||
|
|
||
|
@item 10
|
||
|
@tab @code{CTF_K_TYPEDEF}
|
||
|
@tab A typedef. @xref{Pointers typedefs and cvr-quals}.
|
||
|
|
||
|
@item 11
|
||
|
@tab @code{CTF_K_VOLATILE}
|
||
|
@tab A volatile-qualified type. @xref{Pointers typedefs and cvr-quals}.
|
||
|
|
||
|
@item 12
|
||
|
@tab @code{CTF_K_CONST}
|
||
|
@tab A const-qualified type. @xref{Pointers typedefs and cvr-quals}.
|
||
|
|
||
|
@item 13
|
||
|
@tab @code{CTF_K_RESTRICT}
|
||
|
@tab A restrict-qualified type. @xref{Pointers typedefs and cvr-quals}.
|
||
|
|
||
|
@item 14
|
||
|
@tab @code{CTF_K_SLICE}
|
||
|
@tab A slice, a change of the bit-width or offset of some other type. @xref{Slices}.
|
||
|
@end multitable
|
||
|
|
||
|
Now we cover all type kinds in turn. Some are more complicated than others.
|
||
|
|
||
|
@node Integer types
|
||
|
@subsection Integer types
|
||
|
@cindex Integer types
|
||
|
@cindex Types, integer
|
||
|
@tindex int
|
||
|
@tindex long
|
||
|
@tindex long long
|
||
|
@tindex short
|
||
|
@tindex char
|
||
|
@tindex bool
|
||
|
@tindex unsigned int
|
||
|
@tindex unsigned long
|
||
|
@tindex unsigned long long
|
||
|
@tindex unsigned short
|
||
|
@tindex unsigned char
|
||
|
@tindex signed int
|
||
|
@tindex signed long
|
||
|
@tindex signed long long
|
||
|
@tindex signed short
|
||
|
@tindex signed char
|
||
|
@cindex CTF_K_INTEGER
|
||
|
|
||
|
Integral types are all represented as types of kind @code{CTF_K_INTEGER}. These
|
||
|
types fill out @code{ctt_size} in the @code{ctf_stype_t} with the size in bytes
|
||
|
of the integral type in question. They are always represented by
|
||
|
@code{ctf_stype_t}, never @code{ctf_type_t}. Their variable-length data is one
|
||
|
@code{uint32_t} in length: @code{vlen} in the info word should be disregarded
|
||
|
and is always zero.
|
||
|
|
||
|
The variable-length data for integers has multiple items packed into it much
|
||
|
like the info word does.
|
||
|
|
||
|
@multitable {Bit offset} {Encoding} {The integer encoding and desired display representation.}
|
||
|
@headitem Bit offset @tab Name @tab Description
|
||
|
@item 24--31
|
||
|
@tab Encoding
|
||
|
@tab The desired display representation of this integer. You can extract this
|
||
|
field with the @code{CTF_INT_ENCODING} macro. See below.
|
||
|
@findex CTF_INT_ENCODING
|
||
|
|
||
|
@item 16--23
|
||
|
@tab Offset
|
||
|
@tab The offset of this integral type in bits from the start of its enclosing
|
||
|
structure field, adjusted for endianness: @pxref{Structs and unions}. You can
|
||
|
extract this field with the @code{CTF_INT_OFFSET} macro.
|
||
|
@findex CTF_INT_OFFSET
|
||
|
|
||
|
@item 0--15
|
||
|
@tab Bit-width
|
||
|
@tab The width of this integral type in bits. You can extract this field with
|
||
|
the @code{CTF_INT_BITS} macro.
|
||
|
@findex CTF_INT_BITS
|
||
|
@end multitable
|
||
|
|
||
|
If you choose, bitfields can be represented using the fields above as a sort of
|
||
|
integral type with the @code{isroot} bit flipped off and the offset and bits
|
||
|
values set in the vlen word: you can populate it with the @code{CTF_INT_DATA}
|
||
|
macro. (But it may be more convenient to represent them using slices of a
|
||
|
full-width integer: @pxref{Slices}.)
|
||
|
@findex CTF_INT_DATA
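
A minimal sketch of packing and unpacking the integer vlen word, following
the bit layout in the table above.  The @code{EX_} macros are stand-ins
written for this example rather than the @file{ctf.h} definitions.

```c
#include <stdint.h>

/* Hypothetical EX_ stand-ins that follow the documented layout:
   encoding in bits 24-31, offset in bits 16-23, bit-width in bits 0-15.  */
#define EX_INT_ENCODING(data) (((uint32_t) (data) & 0xff000000u) >> 24)
#define EX_INT_OFFSET(data)   (((uint32_t) (data) & 0x00ff0000u) >> 16)
#define EX_INT_BITS(data)     ((uint32_t) (data) & 0x0000ffffu)
#define EX_INT_DATA(encoding, offset, bits) \
  (((uint32_t) (encoding) << 24) | ((uint32_t) (offset) << 16) \
   | (uint32_t) (bits))

/* Vlen word for a signed 3-bit bitfield starting at bit 5.
   Encoding 0x01 is CTF_INT_SIGNED.  */
static uint32_t
ex_bitfield_word (void)
{
  return EX_INT_DATA (0x01, 5, 3);
}
```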
|
||
|
|
||
|
Integers that are bitfields usually have a @code{ctt_size} rounded up to the
|
||
|
nearest power of two in bytes, for natural alignment (e.g. a 17-bit integer
|
||
|
would have a @code{ctt_size} of 4). However, not all types are naturally
|
||
|
aligned on all architectures: packed structures may in theory use integral
|
||
|
bitfields with different @code{ctt_size}, though this is rarely observed.
|
||
|
|
||
|
The @dfn{encoding} for integers is a bit-field composed of the values below,
|
||
|
which consumers can use to decide how to display values of this type:
|
||
|
|
||
|
@multitable {Offset} {@code{CTF_INT_VARARGS}} {If set, this is a char type. It is platform-dependent whether unadorned}
|
||
|
@headitem Value @tab Name @tab Description
|
||
|
@item 0x01
|
||
|
@tab @code{CTF_INT_SIGNED}
|
||
|
@tab If set, this is a signed int: if clear, unsigned.
|
||
|
@tindex CTF_INT_SIGNED
|
||
|
|
||
|
@item 0x02
|
||
|
@tab @code{CTF_INT_CHAR}
|
||
|
@tab If set, this is a char type. It is platform-dependent whether unadorned
|
||
|
@code{char} is signed or not: the @code{CTF_CHAR} macro produces an integral
|
||
|
type suitable for the definition of @code{char} on this platform.
|
||
|
@tindex CTF_INT_CHAR
|
||
|
@findex CTF_CHAR
|
||
|
|
||
|
@item 0x04
|
||
|
@tab @code{CTF_INT_BOOL}
|
||
|
@tab If set, this is a boolean type. (It is theoretically possible to turn this
|
||
|
and @code{CTF_INT_CHAR} on at the same time, but it is not clear what this would
|
||
|
mean.)
|
||
|
@tindex CTF_INT_BOOL
|
||
|
|
||
|
@item 0x08
|
||
|
@tab @code{CTF_INT_VARARGS}
|
||
|
@tab If set, this is a varargs-promoted value in a K&R function definition.
|
||
|
This is not currently produced or consumed by anything that we know of: it is set
|
||
|
aside for future use.
|
||
|
@end multitable
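
One way a consumer might use these flags to pick a display style is sketched
below.  The flag values come from the table above; the @code{EX_} names, the
@code{ex_pick_display} helper, and the decision to test the boolean flag
first are assumptions made for this example.

```c
#include <stdint.h>

/* Flag values from the encoding table, under illustrative names.  */
#define EX_INT_SIGNED  0x01u
#define EX_INT_CHAR    0x02u
#define EX_INT_BOOL    0x04u
#define EX_INT_VARARGS 0x08u

enum ex_int_display
{
  EX_SHOW_BOOL,
  EX_SHOW_CHAR,
  EX_SHOW_SIGNED,
  EX_SHOW_UNSIGNED
};

/* Choose how to print a value given its encoding byte.  The flags are
   independent bits, so each is tested with a mask, not by equality.  */
static enum ex_int_display
ex_pick_display (uint32_t encoding)
{
  if (encoding & EX_INT_BOOL)
    return EX_SHOW_BOOL;
  if (encoding & EX_INT_CHAR)
    return EX_SHOW_CHAR;
  return (encoding & EX_INT_SIGNED) ? EX_SHOW_SIGNED : EX_SHOW_UNSIGNED;
}
```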
|
||
|
|
||
|
The GCC ``@code{Complex int}'' and fixed-point extensions are not yet supported:
|
||
|
references to such types will be emitted as type 0.
|
||
|
|
||
|
@node Floating-point types
|
||
|
@subsection Floating-point types
|
||
|
@cindex Floating-point types
|
||
|
@cindex Types, floating-point
|
||
|
@tindex float
|
||
|
@tindex double
|
||
|
@tindex signed float
|
||
|
@tindex signed double
|
||
|
@tindex unsigned float
|
||
|
@tindex unsigned double
|
||
|
@tindex Complex, float
|
||
|
@tindex Complex, double
|
||
|
@tindex Complex, signed float
|
||
|
@tindex Complex, signed double
|
||
|
@tindex Complex, unsigned float
|
||
|
@tindex Complex, unsigned double
|
||
|
@cindex CTF_K_FLOAT
|
||
|
|
||
|
Floating-point types are all represented as types of kind @code{CTF_K_FLOAT}.
|
||
|
Like integers, these types fill out @code{ctt_size} in the @code{ctf_stype_t}
|
||
|
with the size in bytes of the floating-point type in question. They are always
|
||
|
represented by @code{ctf_stype_t}, never @code{ctf_type_t}.
|
||
|
|
||
|
This part of CTF shows many rough edges in the more obscure corners of
|
||
|
floating-point handling, and is likely to change in format v4.
|
||
|
|
||
|
The variable-length data for floats has multiple items packed into it just like
|
||
|
integers do:
|
||
|
|
||
|
@multitable {Bit offset} {Encoding} {The floating-point encoding and desired display representation.}
|
||
|
@headitem Bit offset @tab Name @tab Description
|
||
|
@item 24--31
|
||
|
@tab Encoding
|
||
|
@tab The desired display representation of this float. You can extract this
|
||
|
field with the @code{CTF_FP_ENCODING} macro. See below.
|
||
|
@findex CTF_FP_ENCODING
|
||
|
|
||
|
@item 16--23
|
||
|
@tab Offset
|
||
|
@tab The offset of this floating-point type in bits from the start of its enclosing
|
||
|
structure field, adjusted for endianness: @pxref{Structs and unions}. You can
|
||
|
extract this field with the @code{CTF_FP_OFFSET} macro.
|
||
|
@findex CTF_FP_OFFSET
|
||
|
|
||
|
@item 0--15
|
||
|
@tab Bit-width
|
||
|
@tab The width of this floating-point type in bits. You can extract this field with
|
||
|
the @code{CTF_FP_BITS} macro.
|
||
|
@findex CTF_FP_BITS
|
||
|
@end multitable
|
||
|
|
||
|
The purpose of the floating-point offset and bit-width is somewhat opaque, since
|
||
|
there are no such things as floating-point bitfields in C: the bit-width should
|
||
|
be filled out with the full width of the type in bits, and the offset should
|
||
|
always be zero. It is likely that these fields will go away in the future. As
|
||
|
with integers, you can use @code{CTF_FP_DATA} to assemble one of these vlen
|
||
|
items from its component parts.
|
||
|
@findex CTF_FP_DATA
|
||
|
|
||
|
The @dfn{encoding} for floats is not a bitfield but a simple value indicating
|
||
|
the display representation. Many of these are unused, relate to
|
||
|
Solaris-specific compiler extensions, and will be recycled in future: some are
|
||
|
unused and will become used in future.
|
||
|
|
||
|
@multitable {Offset} {@code{CTF_FP_LDIMAGRY}} {This is a @code{float} interval type, a Solaris-specific extension.}
|
||
|
@headitem Value @tab Name @tab Description
|
||
|
@item 1
|
||
|
@tab @code{CTF_FP_SINGLE}
|
||
|
@tab This is a single-precision IEEE 754 @code{float}.
|
||
|
@tindex CTF_FP_SINGLE
|
||
|
@item 2
|
||
|
@tab @code{CTF_FP_DOUBLE}
|
||
|
@tab This is a double-precision IEEE 754 @code{double}.
|
||
|
@tindex CTF_FP_DOUBLE
|
||
|
@item 3
|
||
|
@tab @code{CTF_FP_CPLX}
|
||
|
@tab This is a @code{Complex float}.
|
||
|
@tindex CTF_FP_CPLX
|
||
|
@item 4
|
||
|
@tab @code{CTF_FP_DCPLX}
|
||
|
@tab This is a @code{Complex double}.
|
||
|
@tindex CTF_FP_DCPLX
|
||
|
@item 5
|
||
|
@tab @code{CTF_FP_LDCPLX}
|
||
|
@tab This is a @code{Complex long double}.
|
||
|
@tindex CTF_FP_LDCPLX
|
||
|
@item 6
|
||
|
@tab @code{CTF_FP_LDOUBLE}
|
||
|
@tab This is a @code{long double}.
|
||
|
@tindex CTF_FP_LDOUBLE
|
||
|
@item 7
|
||
|
@tab @code{CTF_FP_INTRVL}
|
||
|
@tab This is a @code{float} interval type, a Solaris-specific extension.
|
||
|
Unused: will be recycled.
|
||
|
@tindex CTF_FP_INTRVL
|
||
|
@cindex Unused bits
|
||
|
@item 8
|
||
|
@tab @code{CTF_FP_DINTRVL}
|
||
|
@tab This is a @code{double} interval type, a Solaris-specific extension.
|
||
|
Unused: will be recycled.
|
||
|
@tindex CTF_FP_DINTRVL
|
||
|
@cindex Unused bits
|
||
|
@item 9
|
||
|
@tab @code{CTF_FP_LDINTRVL}
|
||
|
@tab This is a @code{long double} interval type, a Solaris-specific extension.
|
||
|
Unused: will be recycled.
|
||
|
@tindex CTF_FP_LDINTRVL
|
||
|
@cindex Unused bits
|
||
|
@item 10
|
||
|
@tab @code{CTF_FP_IMAGRY}
|
||
|
@tab This is the imaginary part of a @code{Complex float}.  Not currently
|
||
|
generated. May change.
|
||
|
@tindex CTF_FP_IMAGRY
|
||
|
@cindex Unused bits
|
||
|
@item 11
|
||
|
@tab @code{CTF_FP_DIMAGRY}
|
||
|
@tab This is the imaginary part of a @code{Complex double}.  Not currently
|
||
|
generated. May change.
|
||
|
@tindex CTF_FP_DIMAGRY
|
||
|
@cindex Unused bits
|
||
|
@item 12
|
||
|
@tab @code{CTF_FP_LDIMAGRY}
|
||
|
@tab This is the imaginary part of a @code{Complex long double}.  Not currently
|
||
|
generated. May change.
|
||
|
@tindex CTF_FP_LDIMAGRY
|
||
|
@cindex Unused bits
|
||
|
@end multitable
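
Because the floating-point encoding is a plain value rather than a set of
flags, a consumer compares it for equality instead of masking bits.  The
sketch below groups the values from the table above into broad classes; the
@code{EX_} names and the grouping itself are assumptions made for this
example.

```c
#include <stdint.h>

enum ex_fp_class
{
  EX_FP_UNKNOWN,
  EX_FP_REAL,
  EX_FP_COMPLEX,
  EX_FP_INTERVAL,
  EX_FP_IMAGINARY
};

/* Hypothetical stand-in: the encoding is the top eight bits of the
   floating-point vlen word.  */
#define EX_FP_ENCODING(data) (((uint32_t) (data) & 0xff000000u) >> 24)

/* Classify a floating-point vlen word by its encoding value.  */
static enum ex_fp_class
ex_fp_classify (uint32_t data)
{
  switch (EX_FP_ENCODING (data))
    {
    case 1: case 2: case 6:	/* float, double, long double */
      return EX_FP_REAL;
    case 3: case 4: case 5:	/* the Complex variants */
      return EX_FP_COMPLEX;
    case 7: case 8: case 9:	/* Solaris interval types */
      return EX_FP_INTERVAL;
    case 10: case 11: case 12:	/* imaginary parts */
      return EX_FP_IMAGINARY;
    default:
      return EX_FP_UNKNOWN;
    }
}
```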
|
||
|
|
||
|
The use of the complex floating-point encodings is obscure: it is possible that
|
||
|
@code{CTF_FP_CPLX} is meant to be used for only the real part of complex types,
|
||
|
and @code{CTF_FP_IMAGRY} et al for the imaginary part -- but for now, we are
|
||
|
emitting @code{CTF_FP_CPLX} to cover the entire type, with no way to get at its
|
||
|
constituent parts. There appear to be no uses of these encodings anywhere, so
|
||
|
they are quite likely to change incompatibly in future.
|
||
|
|
||
|
@node Slices
|
||
|
@subsection Slices
|
||
|
@cindex Slices
|
||
|
@cindex Types, slices of integral
|
||
|
@tindex CTF_K_SLICE
|
||
|
|
||
|
Slices, with kind @code{CTF_K_SLICE}, are an unusual CTF construct: they do not
|
||
|
directly correspond to any C type, but are a way to model other types in a more
|
||
|
convenient fashion for CTF generators.
|
||
|
|
||
|
Slices are like pointers or other reference types in that they are always
|
||
|
represented by @code{ctf_stype_t}: but unlike pointers and other reference
|
||
|
types, they populate the @code{ctt_size} field just like integral types do, and
|
||
|
come with an attached encoding and transform the encoding of the underlying
|
||
|
type. The underlying type is described in the variable-length data, similarly
|
||
|
to structure and union fields: see below. Requests for the type size should
|
||
|
also chase down to the referenced type.
|
||
|
|
||
|
Slices are always nameless: @code{ctt_name} is always zero for them.
|
||
|
|
||
|
(The @code{libctf} API behaviour is unusual as well, and justifies the existence
|
||
|
of slices: @code{ctf_type_kind} never returns @code{CTF_K_SLICE} but always the
|
||
|
underlying type kind, so that consumers never need to know about slices: they
|
||
|
can tell if an apparent integer is actually a slice if they need to by calling
|
||
|
@code{ctf_type_reference}, which will uniquely return the underlying integral
|
||
|
type rather than erroring out with @code{ECTF_NOTREF} if this is actually a
|
||
|
slice. So slices act just like an integer with an encoding, but more closely
|
||
|
mirror DWARF and other debugging information formats by allowing CTF file
|
||
|
creators to represent a bitfield as a slice of an underlying integral type.)
|
||
|
@findex Slices, effect on ctf_type_kind
|
||
|
@findex Slices, effect on ctf_type_reference
|
||
|
@findex libctf, effect of slices
|
||
|
|
||
|
The vlen in the info word for a slice should be ignored and is always zero. The
|
||
|
variable-length data for a slice is a single @code{ctf_slice_t}:
|
||
|
|
||
|
@verbatim
|
||
|
typedef struct ctf_slice
|
||
|
{
|
||
|
uint32_t cts_type;
|
||
|
unsigned short cts_offset;
|
||
|
unsigned short cts_bits;
|
||
|
} ctf_slice_t;
|
||
|
@end verbatim
|
||
|
|
||
|
@tindex struct ctf_slice
|
||
|
@tindex ctf_slice_t
|
||
|
@multitable {Offset} {@code{unsigned short cts_offset}} {The type this slice is a slice of. Must be an}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x0
|
||
|
@tab @code{uint32_t cts_type}
|
||
|
@vindex cts_type
|
||
|
@vindex struct ctf_slice, cts_type
|
||
|
@vindex ctf_slice_t, cts_type
|
||
|
@tab The type this slice is a slice of. Must be an integral type (or a
|
||
|
floating-point type, but this nonsensical option will go away in v4.)
|
||
|
|
||
|
@item 0x4
|
||
|
@tab @code{unsigned short cts_offset}
|
||
|
@vindex cts_offset
|
||
|
@vindex struct ctf_slice, cts_offset
|
||
|
@vindex ctf_slice_t, cts_offset
|
||
|
@tab The offset of this integral type in bits from the start of its enclosing
|
||
|
structure field, adjusted for endianness: @pxref{Structs and unions}. Identical
|
||
|
semantics to the @code{CTF_INT_OFFSET} field: @pxref{Integer types}. This field
|
||
|
is much too long, because the maximum possible offset of an integral type would
|
||
|
easily fit in a char: this field is bigger just for the sake of alignment. This
|
||
|
will change in v4.
|
||
|
|
||
|
@item 0x6
|
||
|
@tab @code{unsigned short cts_bits}
|
||
|
@vindex cts_bits
|
||
|
@vindex struct ctf_slice, cts_bits
|
||
|
@vindex ctf_slice_t, cts_bits
|
||
|
@tab The bit-width of this integral type. Identical semantics to the
|
||
|
@code{CTF_INT_BITS} field: @pxref{Integer types}. As above, this field is
|
||
|
really too large and will shrink in v4.
|
||
|
@end multitable
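
As a small illustration, a generator describing a bitfield as a slice might
fill in the structure like this.  The @code{ex_slice_t} type mirrors the
layout shown above under a different name, and @code{ex_make_slice} is a
hypothetical helper: neither is part of any real API.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the slice layout, renamed to avoid clashing with ctf.h.  */
typedef struct ex_slice
{
  uint32_t cts_type;		/* type ID of the underlying type */
  unsigned short cts_offset;	/* offset in bits */
  unsigned short cts_bits;	/* width in bits */
} ex_slice_t;

/* Describe a BITS-wide bitfield at bit OFFSET as a slice of type ID BASE.  */
static ex_slice_t
ex_make_slice (uint32_t base, unsigned short offset, unsigned short bits)
{
  ex_slice_t s;

  s.cts_type = base;
  s.cts_offset = offset;
  s.cts_bits = bits;
  return s;
}
```

On the usual ABIs, where @code{unsigned short} is two bytes, this structure
occupies eight bytes with no padding, matching the offsets in the table.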
|
||
|
|
||
|
@node Pointers typedefs and cvr-quals
|
||
|
@subsection Pointers, typedefs, and cvr-quals
|
||
|
@cindex Pointers
|
||
|
@cindex Typedefs
|
||
|
@cindex cvr-quals
|
||
|
@tindex typedef
|
||
|
@tindex const
|
||
|
@tindex volatile
|
||
|
@tindex restrict
|
||
|
@tindex CTF_K_POINTER
|
||
|
@tindex CTF_K_TYPEDEF
|
||
|
@tindex CTF_K_CONST
|
||
|
@tindex CTF_K_VOLATILE
|
||
|
@tindex CTF_K_RESTRICT
|
||
|
|
||
|
Pointers, @code{typedef}s, and @code{const}, @code{volatile} and @code{restrict}
|
||
|
qualifiers are represented identically except for their type kind (though they
|
||
|
may be treated differently by consuming libraries like @code{libctf}, since
|
||
|
pointers affect assignment-compatibility in ways cvr-quals do not, and they may
|
||
|
have different alignment requirements, etc).
|
||
|
|
||
|
All of these are represented by @code{ctf_stype_t}, have no variable data at
|
||
|
all, and populate @code{ctt_type} with the type ID of the type they point
|
||
|
to. These types can stack: a @code{CTF_K_RESTRICT} can point to a
|
||
|
@code{CTF_K_CONST} which can point to a @code{CTF_K_POINTER} etc.
|
||
|
|
||
|
All of these except typedefs are unnamed, with a @code{ctt_name} of 0: a
typedef's @code{ctt_name} is the strtab offset of its name.
|
||
|
|
||
|
The size of @code{CTF_K_POINTER} is derived from the data model (@pxref{Data
|
||
|
models}), i.e. in practice, from the target machine ABI, and is not explicitly
|
||
|
represented. The size of other kinds in this set should be determined by
|
||
|
chasing @code{ctt_type} references as necessary until a non-typedef/const/volatile/restrict is
|
||
|
found, and using that.
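
The chase can be sketched over a toy in-memory table.  Everything here is
hypothetical scaffolding for the example: @code{ex_type_t} is not the on-disk
@code{ctf_stype_t}, the table is indexed directly rather than by type ID, and
the pointer size is passed in by the caller instead of coming from a data
model.

```c
#include <stdint.h>

enum
{
  EX_K_INTEGER = 1, EX_K_POINTER = 3, EX_K_TYPEDEF = 10,
  EX_K_VOLATILE = 11, EX_K_CONST = 12, EX_K_RESTRICT = 13
};

/* Toy type record: KIND, the index it refers to, and its size in bytes.  */
typedef struct ex_type
{
  int kind;
  uint32_t ref;
  uint32_t size;
} ex_type_t;

/* Follow typedef and cvr-qual links from IDX until some other kind is
   reached.  Returns 0 on a bad index or after NTYPES hops (a cycle).  */
static uint32_t
ex_resolve (const ex_type_t *types, uint32_t ntypes, uint32_t idx)
{
  uint32_t hops;

  for (hops = 0; hops < ntypes && idx != 0 && idx < ntypes; hops++)
    switch (types[idx].kind)
      {
      case EX_K_TYPEDEF: case EX_K_VOLATILE:
      case EX_K_CONST: case EX_K_RESTRICT:
	idx = types[idx].ref;
	break;
      default:
	return idx;
      }
  return 0;
}

/* Size of type IDX: pointers take PTR_SIZE, everything else its own size.  */
static uint32_t
ex_type_size (const ex_type_t *types, uint32_t ntypes, uint32_t idx,
	      uint32_t ptr_size)
{
  idx = ex_resolve (types, ntypes, idx);
  if (idx == 0)
    return 0;
  return types[idx].kind == EX_K_POINTER ? ptr_size : types[idx].size;
}

/* Demo: entry 3 is a typedef of a const int, entry 4 a pointer to it.  */
static uint32_t
ex_demo_size (uint32_t idx)
{
  static const ex_type_t t[] =
    { { 0, 0, 0 }, { EX_K_INTEGER, 0, 4 }, { EX_K_CONST, 1, 0 },
      { EX_K_TYPEDEF, 2, 0 }, { EX_K_POINTER, 3, 0 } };

  return ex_type_size (t, 5, idx, 8);
}
```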
|
||
|
|
||
|
@node Arrays
|
||
|
@subsection Arrays
|
||
|
@cindex Arrays
|
||
|
|
||
|
Arrays are encoded as types of kind @code{CTF_K_ARRAY} in a @code{ctf_stype_t}.
|
||
|
Both @code{ctt_size} and @code{ctt_type} are zero for arrays.  The variable-length data is a
|
||
|
@code{ctf_array_t}: @code{vlen} in the info word should be disregarded and is
|
||
|
always zero.
|
||
|
|
||
|
@verbatim
|
||
|
typedef struct ctf_array
|
||
|
{
|
||
|
uint32_t cta_contents;
|
||
|
uint32_t cta_index;
|
||
|
uint32_t cta_nelems;
|
||
|
} ctf_array_t;
|
||
|
@end verbatim
|
||
|
|
||
|
@tindex struct ctf_array
|
||
|
@tindex ctf_array_t
|
||
|
@multitable {Offset} {@code{unsigned short cta_contents}} {The type of the array index: a type ID of an}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x0
|
||
|
@tab @code{uint32_t cta_contents}
|
||
|
@vindex cta_contents
|
||
|
@vindex struct ctf_array, cta_contents
|
||
|
@vindex ctf_array_t, cta_contents
|
||
|
@tab The type of the array elements: a type ID.
|
||
|
|
||
|
@item 0x4
|
||
|
@tab @code{uint32_t cta_index}
|
||
|
@vindex cta_index
|
||
|
@vindex struct ctf_array, cta_index
|
||
|
@vindex ctf_array_t, cta_index
|
||
|
@tab The type of the array index: a type ID of an integral type.
|
||
|
If this is a variable-length array, the index type ID will be 0
|
||
|
(but the actual index type of this array is probably @code{int}).
|
||
|
Probably redundant and may be dropped in v4.
|
||
|
|
||
|
@item 0x8
|
||
|
@tab @code{uint32_t cta_nelems}
|
||
|
@vindex cta_nelems
|
||
|
@vindex struct ctf_array, cta_nelems
|
||
|
@vindex ctf_array_t, cta_nelems
|
||
|
@tab The number of array elements. 0 for VLAs, and also for
|
||
|
the historical variety of VLA which has explicit zero dimensions (which will
|
||
|
have a nonzero @code{cta_index}.)
|
||
|
@end multitable
|
||
|
|
||
|
The size of an array can be computed by simple multiplication of the size of the
|
||
|
@code{cta_contents} type by the @code{cta_nelems}.
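
A minimal sketch of that multiplication follows.  The @code{ex_array_t} type
mirrors the layout above under another name, the element size is assumed to
have been looked up already, and the product is taken in 64 bits as a
precaution against wraparound for very large arrays.

```c
#include <stdint.h>

/* Local mirror of the array vlen layout.  */
typedef struct ex_array
{
  uint32_t cta_contents;	/* type ID of the elements */
  uint32_t cta_index;		/* type ID of the index type */
  uint32_t cta_nelems;		/* number of elements */
} ex_array_t;

/* Size in bytes of ARR, given the size of one element.  VLAs have
   cta_nelems of zero, so their computed size is zero as well.  */
static uint64_t
ex_array_size (const ex_array_t *arr, uint64_t elem_size)
{
  return elem_size * arr->cta_nelems;
}

/* Demo: an array of NELEMS four-byte ints.  The type IDs 1 and 2 are
   made up for the example.  */
static uint64_t
ex_demo_int_array_size (uint32_t nelems)
{
  ex_array_t a = { 1, 2, 0 };

  a.cta_nelems = nelems;
  return ex_array_size (&a, 4);
}
```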
|
||
|
|
||
|
@node Function pointers
|
||
|
@subsection Function pointers
|
||
|
@cindex Function pointers
|
||
|
@cindex Pointers, to functions
|
||
|
|
||
|
Function pointers are explicitly represented in the CTF type section by a type
|
||
|
of kind @code{CTF_K_FUNCTION}, always encoded with a @code{ctf_stype_t}. The
|
||
|
@code{ctt_type} is the function return type ID. The @code{vlen} in the info
|
||
|
word is the number of arguments, each of which is a type ID, a @code{uint32_t}:
|
||
|
if the last argument is 0, this is a varargs function and the number of
|
||
|
arguments is one less than indicated by the vlen.
|
||
|
|
||
|
If the number of arguments is odd, a single @code{uint32_t} of padding is
|
||
|
inserted to maintain alignment.
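
The argument-count and padding rules can be sketched as below.  Both helpers
are hypothetical and exist only for this example; @code{args} is assumed to
point at the @code{vlen} type IDs that follow the @code{ctf_stype_t}.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of uint32_t slots the argument list occupies on disk: the vlen
   rounded up to an even number.  */
static size_t
ex_func_vlen_words (uint32_t vlen)
{
  return (size_t) vlen + (vlen & 1);
}

/* Number of real arguments.  A trailing type ID of 0 marks a varargs
   function and is not counted; *VARARGS is set accordingly.  */
static uint32_t
ex_func_argc (const uint32_t *args, uint32_t vlen, int *varargs)
{
  *varargs = (vlen > 0 && args[vlen - 1] == 0);
  return *varargs ? vlen - 1 : vlen;
}

/* Demo: a printf-like function with one named argument (made-up type
   ID 7) followed by the varargs marker.  */
static uint32_t
ex_demo_printf_argc (void)
{
  static const uint32_t args[] = { 7, 0 };
  int varargs;

  return ex_func_argc (args, 2, &varargs);
}
```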
|
||
|
|
||
|
@node Enums
|
||
|
@subsection Enums
|
||
|
@cindex Enums
|
||
|
@tindex enum
|
||
|
@tindex CTF_K_ENUM
|
||
|
|
||
|
Enumerated types are represented as types of kind @code{CTF_K_ENUM} in a
|
||
|
@code{ctf_stype_t}. The @code{ctt_size} is always the size of an int from the
|
||
|
data model (enum bitfields are implemented via slices).  The @code{vlen} is a
count of enumerators, each of which is represented by a @code{ctf_enum_t} in
the variable-length data:
|
||
|
|
||
|
@verbatim
|
||
|
typedef struct ctf_enum
|
||
|
{
|
||
|
uint32_t cte_name;
|
||
|
int32_t cte_value;
|
||
|
} ctf_enum_t;
|
||
|
@end verbatim
|
||
|
|
||
|
@tindex struct ctf_enum
|
||
|
@tindex ctf_enum_t
|
||
|
@multitable {Offset} {@code{int32_t cte_value}} {Strtab offset of the enumeration name.}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x0
|
||
|
@tab @code{uint32_t cte_name}
|
||
|
@vindex cte_name
|
||
|
@vindex struct ctf_enum, cte_name
|
||
|
@vindex ctf_enum_t, cte_name
|
||
|
@tab Strtab offset of the enumeration name. Must not be 0.
|
||
|
|
||
|
@item 0x4
|
||
|
@tab @code{int32_t cte_value}
|
||
|
@vindex cte_value
|
||
|
@vindex struct ctf_enum, cte_value
|
||
|
@vindex ctf_enum_t, cte_value
|
||
|
@tab The enumeration value.
|
||
|
|
||
|
@end multitable
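
Looking up an enumerator's name from its value is then a linear scan, as in
the sketch below.  It is a simplification: @code{ex_enum_t} mirrors the layout
above under another name, and every @code{cte_name} is assumed to be an
offset into one in-memory string table passed in by the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the enumerator layout.  */
typedef struct ex_enum
{
  uint32_t cte_name;		/* strtab offset of the name */
  int32_t cte_value;		/* the enumeration value */
} ex_enum_t;

/* Return the name of the first enumerator with value VALUE, or NULL.  */
static const char *
ex_enum_name (const ex_enum_t *ens, uint32_t vlen, const char *strtab,
	      int32_t value)
{
  uint32_t i;

  for (i = 0; i < vlen; i++)
    if (ens[i].cte_value == value)
      return strtab + ens[i].cte_name;
  return NULL;
}

/* Demo: enum { RED, GREEN }.  Offset 0 of the toy strtab is the empty
   string, RED starts at 1 and GREEN at 5.  Returns nonzero if value 1
   maps to "GREEN".  */
static int
ex_demo_enum (void)
{
  static const char strtab[] = "\0RED\0GREEN";
  static const ex_enum_t ens[] = { { 1, 0 }, { 5, 1 } };
  const char *name = ex_enum_name (ens, 2, strtab, 1);

  return name != NULL && strcmp (name, "GREEN") == 0;
}
```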
|
||
|
|
||
|
Enumeration values that do not fit in 32 bits are not yet supported and are omitted
|
||
|
from the enumeration. (v4 will lift this restriction by encoding the value
|
||
|
differently.)
|
||
|
|
||
|
Forward declarations of enums are not implemented with this kind: @pxref{Forward
|
||
|
declarations}.
|
||
|
|
||
|
Enumerated type names, as usual in C, go into their own namespace, and do not
|
||
|
conflict with non-enums, structs, or unions with the same name.
|
||
|
|
||
|
@node Structs and unions
|
||
|
@subsection Structs and unions
|
||
|
@cindex Structures
|
||
|
@cindex Unions
|
||
|
@tindex struct
|
||
|
@tindex union
|
||
|
@tindex CTF_K_STRUCT
|
||
|
@tindex CTF_K_UNION
|
||
|
|
||
|
Structures and unions are represented as types of kind @code{CTF_K_STRUCT} and
|
||
|
@code{CTF_K_UNION}: their representation is otherwise identical, and it is
|
||
|
perfectly allowed for ``structs'' to contain overlapping fields etc, so we will
|
||
|
treat them together for the rest of this section.
|
||
|
|
||
|
They fill out @code{ctt_size}, and use @code{ctf_type_t} in preference to
|
||
|
@code{ctf_stype_t} if the structure size is greater than @code{CTF_MAX_SIZE}
|
||
|
(0xfffffffe).
|
||
|
@tindex CTF_MAX_SIZE
|
||
|
|
||
|
The vlen for structures and unions is a count of structure fields, but the type
|
||
|
used to represent a structure field (and thus the size of the variable-length
|
||
|
array element representing the type) depends on the size of the structure: truly
|
||
|
huge structures, greater than @code{CTF_LSTRUCT_THRESH} bytes in length, use a
|
||
|
different type. (@code{CTF_LSTRUCT_THRESH} is 536870912, so such structures are
|
||
|
vanishingly rare: in v4, this representation will change somewhat for greater
|
||
|
compactness. It's inherited from v1, where the limits were much lower.)
|
||
|
@tindex CTF_LSTRUCT_THRESH
|
||
|
|
||
|
Most structures can get away with using @code{ctf_member_t}:
|
||
|
|
||
|
@verbatim
|
||
|
typedef struct ctf_member_v2
|
||
|
{
|
||
|
uint32_t ctm_name;
|
||
|
uint32_t ctm_offset;
|
||
|
uint32_t ctm_type;
|
||
|
} ctf_member_t;
|
||
|
@end verbatim
|
||
|
|
||
|
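Huge structures (those above @code{CTF_LSTRUCT_THRESH}, which includes every
structure represented by @code{ctf_type_t} rather than @code{ctf_stype_t}) have
to use @code{ctf_lmember_t}.  A structure of 536870912 bytes is @math{2^32}
bits long, so member offsets in bits no longer fit in 32 bits:
@code{ctf_lmember_t} splits the offset as @code{ctf_type_t} splits the size: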
|
||
|
|
||
|
@verbatim
|
||
|
typedef struct ctf_lmember_v2
|
||
|
{
|
||
|
uint32_t ctlm_name;
|
||
|
uint32_t ctlm_offsethi;
|
||
|
uint32_t ctlm_type;
|
||
|
uint32_t ctlm_offsetlo;
|
||
|
} ctf_lmember_t;
|
||
|
@end verbatim
|
||
|
|
||
|
Here's what the fields of @code{ctf_member_t} mean:
|
||
|
|
||
|
@tindex struct ctf_member_v2
|
||
|
@tindex ctf_member_t
|
||
|
@multitable {Offset} {@code{uint32_t ctm_offset}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{uint32_t ctm_name}
|
||
|
@vindex ctm_name
|
||
|
@vindex struct ctf_member_v2, ctm_name
|
||
|
@vindex ctf_member_t, ctm_name
|
||
|
@tab Strtab offset of the field name.
|
||
|
|
||
|
@item 0x04
|
||
|
@tab @code{uint32_t ctm_offset}
|
||
|
@vindex ctm_offset
|
||
|
@vindex struct ctf_member_v2, ctm_offset
|
||
|
@vindex ctf_member_t, ctm_offset
|
||
|
@tab The offset of this field @emph{in bits}. (Usually, for bitfields, this is
|
||
|
machine-word-aligned and the individual field has an offset in bits, but
|
||
|
the format allows for the offset to be encoded in bits here.)
|
||
|
|
||
|
@item 0x08
|
||
|
@tab @code{uint32_t ctm_type}
|
||
|
@vindex ctm_type
|
||
|
@vindex struct ctf_member_v2, ctm_type
|
||
|
@vindex ctf_member_t, ctm_type
|
||
|
@tab The type ID of the type of the field.
|
||
|
@end multitable
|
||
|
|
||
|
Here's what the fields of the very similar @code{ctf_lmember_t} mean:
|
||
|
|
||
|
@tindex struct ctf_lmember_v2
|
||
|
@tindex ctf_lmember_t
|
||
|
@multitable {Offset} {@code{uint32_t ctlm_offsethi}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{uint32_t ctlm_name}
|
||
|
@vindex ctlm_name
|
||
|
@vindex struct ctf_lmember_v2, ctlm_name
|
||
|
@vindex ctf_lmember_t, ctlm_name
|
||
|
@tab Strtab offset of the field name.
|
||
|
|
||
|
@item 0x04
|
||
|
@tab @code{uint32_t ctlm_offsethi}
|
||
|
@vindex ctlm_offsethi
|
||
|
@vindex struct ctf_lmember_v2, ctlm_offsethi
|
||
|
@vindex ctf_lmember_t, ctlm_offsethi
|
||
|
@tab The high 32 bits of the offset of this field in bits.
|
||
|
|
||
|
@item 0x08
|
||
|
@tab @code{uint32_t ctlm_type}
|
||
|
@vindex ctlm_type
|
||
|
@vindex struct ctf_lmember_v2, ctlm_type
|
||
|
@vindex ctf_lmember_t, ctlm_type
|
||
|
@tab The type ID of the type of the field.
|
||
|
|
||
|
@item 0x0c
|
||
|
@tab @code{uint32_t ctlm_offsetlo}
|
||
|
@vindex ctlm_offsetlo
|
||
|
@vindex struct ctf_lmember_v2, ctlm_offsetlo
|
||
|
@vindex ctf_lmember_t, ctlm_offsetlo
|
||
|
@tab The low 32 bits of the offset of this field in bits.
|
||
|
@end multitable
|
||
|
|
||
|
Macros @code{CTF_LMEM_OFFSET}, @code{CTF_OFFSET_TO_LMEMHI} and
|
||
|
@code{CTF_OFFSET_TO_LMEMLO} serve to extract and install the values of the
|
||
|
@code{ctlm_offset} fields, much as with the split size fields in
|
||
|
@code{ctf_type_t}.
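
A sketch of splitting and rejoining a 64-bit offset is given below.  The
@code{EX_} macros and the @code{ex_lmember_t} type are stand-ins written for
this example: they follow the field descriptions in the table above, but the
real definitions are those in @file{ctf.h}.

```c
#include <stdint.h>

/* Local mirror of the large-member layout.  */
typedef struct ex_lmember
{
  uint32_t ctlm_name;
  uint32_t ctlm_offsethi;	/* high 32 bits of the offset in bits */
  uint32_t ctlm_type;
  uint32_t ctlm_offsetlo;	/* low 32 bits of the offset in bits */
} ex_lmember_t;

/* Illustrative stand-ins for the offset macros.  */
#define EX_LMEM_OFFSET(m) \
  (((uint64_t) (m)->ctlm_offsethi << 32) | (m)->ctlm_offsetlo)
#define EX_OFFSET_TO_LMEMHI(off) ((uint32_t) ((uint64_t) (off) >> 32))
#define EX_OFFSET_TO_LMEMLO(off) ((uint32_t) (off))

/* Store BIT_OFFSET into the two halves of M.  */
static void
ex_lmember_set_offset (ex_lmember_t *m, uint64_t bit_offset)
{
  m->ctlm_offsethi = EX_OFFSET_TO_LMEMHI (bit_offset);
  m->ctlm_offsetlo = EX_OFFSET_TO_LMEMLO (bit_offset);
}

/* Demo: store an offset and read it back.  */
static uint64_t
ex_demo_roundtrip (uint64_t bit_offset)
{
  ex_lmember_t m = { 0, 0, 0, 0 };

  ex_lmember_set_offset (&m, bit_offset);
  return EX_LMEM_OFFSET (&m);
}
```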
|
||
|
|
||
|
Unnamed structure and union fields are simply implemented by collapsing the
|
||
|
unnamed field's members into the containing structure or union: this does mean
|
||
|
that a structure containing an unnamed union can end up being a ``structure''
|
||
|
with multiple members at the same offset. (A future format revision may
|
||
|
collapse @code{CTF_K_STRUCT} and @code{CTF_K_UNION} into the same kind and
|
||
|
decide among them based on whether their members do in fact overlap.)
|
||
|
|
||
|
Structure and union type names, as usual in C, go into their own namespace,
|
||
|
just as enum type names do.
|
||
|
|
||
|
Forward declarations of structures and unions are not implemented with this
kind (@pxref{Forward declarations}).
|
||
|
|
||
|
@node Forward declarations
|
||
|
@subsection Forward declarations
|
||
|
@cindex Forwards
|
||
|
@tindex enum
|
||
|
@tindex struct
|
||
|
@tindex union
|
||
|
@tindex CTF_K_FORWARD
|
||
|
|
||
|
When the compiler encounters a forward declaration of a struct, union, or enum,
it emits a type of kind @code{CTF_K_FORWARD}. If it later encounters a
non-forward declaration of the same thing, it marks the forward as
non-root-visible: before link time, therefore, a non-root-visible forward
indicates that a non-forward is coming.
|
||
|
|
||
|
After link time, forwards are fused with their corresponding non-forwards by the
deduplicator where possible. A forward is kept if there is no non-forward
definition (perhaps because none is visible from any TU at all) or if
@emph{multiple} conflicting structures with the same name might match it. All
other forwards are converted to structures, unions, or enums as appropriate,
even across TUs if only one structure could correspond to the forward (after
all, all types across all TUs land in the same dictionary unless they conflict,
so promoting forwards to their concrete type seems most helpful).
|
||
|
|
||
|
A forward has a rather strange representation: it is encoded with a
@code{ctf_stype_t}, but the @code{ctt_type} is populated not with a type (if it's
a forward, we don't have an underlying type yet: if we did, we'd have promoted
it and this wouldn't be a forward any more) but with the @emph{kind} of the
forward. This means that we can distinguish forwards to structs, enums and
unions reliably and ensure they land in the appropriate namespace even before
the actual struct, union or enum is found.
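
A consumer might use the stored kind roughly as follows. This is only a sketch:
the @code{EX_K_} names and their numeric values are assumptions made for
illustration, and real code should use the @code{CTF_K_} constants from
@file{ctf.h}.

@verbatim
#include <stdint.h>
#include <stddef.h>

/* Illustrative kind values only: hypothetical names, assumed numbers.  */
enum { EX_K_STRUCT = 6, EX_K_UNION = 7, EX_K_ENUM = 8 };

/* Given the ctt_type field of a forward, return the C tag namespace the
   forward belongs to, or NULL if the stored kind is not one that can be
   forward-declared.  */
static const char *
forward_namespace (uint32_t ctt_type)
{
  switch (ctt_type)
    {
    case EX_K_STRUCT:
      return "struct";
    case EX_K_UNION:
      return "union";
    case EX_K_ENUM:
      return "enum";
    default:
      return NULL;
    }
}
@end verbatim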
|
||
|
|
||
|
@node The symtypetab sections
|
||
|
@section The symtypetab sections
|
||
|
@cindex Symtypetab section
|
||
|
@cindex Sections, symtypetab
|
||
|
@cindex Function info section
|
||
|
@cindex Sections, function info
|
||
|
@cindex Data object section
|
||
|
@cindex Sections, data object
|
||
|
@cindex Function info index section
|
||
|
@cindex Sections, function info index
|
||
|
@cindex Data object index section
|
||
|
@cindex Sections, data object index
|
||
|
@tindex CTF_F_IDXSORTED
|
||
|
@tindex CTF_F_DYNSTR
|
||
|
@cindex Bug workarounds, CTF_F_DYNSTR
|
||
|
|
||
|
These are two very simple sections with identical formats, used by consumers to
|
||
|
map from ELF function and data symbols directly to their types. So they are
|
||
|
usually populated only in CTF sections that are embedded in ELF objects.
|
||
|
|
||
|
Their format is very simple: an array of type IDs. Which symbol each type ID
|
||
|
corresponds to depends on whether the optional @emph{index section} associated
|
||
|
with this symtypetab section has any content.
|
||
|
|
||
|
If the index section is nonempty, it is an array of @code{uint32_t} string table
offsets, each giving the name of the symbol whose type is at the same offset in
the corresponding non-index section: users can look up symbols in such a table
by name. The index section and its corresponding symtypetab section are usually
sorted ASCIIbetically by symbol name (indicated by the @code{CTF_F_IDXSORTED}
flag in the header): if they are sorted, the index can be binary-searched for a
symbol name rather than having to use a slower linear search.
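
A minimal sketch of such a lookup in a sorted index follows. The function name
and its parameters are hypothetical, and it assumes for simplicity that every
offset in the index refers to a single string table held in memory.

@verbatim
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Look up NAME in a sorted index section.  IDX holds N strtab offsets,
   TYPES holds the N type IDs at matching positions, and STRTAB is the
   string table the offsets point into.  Returns the type ID, or 0 if
   NAME is not present (0 also means "no type").  */
static uint32_t
symtypetab_lookup (const uint32_t *idx, const uint32_t *types, size_t n,
		   const char *strtab, const char *name)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      int cmp = strcmp (name, strtab + idx[mid]);

      if (cmp == 0)
	return types[mid];
      if (cmp < 0)
	hi = mid;		/* NAME sorts before the middle entry.  */
      else
	lo = mid + 1;		/* NAME sorts after the middle entry.  */
    }
  return 0;
}
@end verbatim

If the index is not sorted, the same comparison has to be applied to each entry
in turn instead.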
|
||
|
|
||
|
If an index section is empty, the entries in the corresponding data object or
function info section are associated 1:1 with ELF symbols of type
@code{STT_OBJECT} (for data objects) or @code{STT_FUNC} (for function info) with
a nonzero value: the linker shuffles the symtypetab sections to correspond with
the order of the symbols in the ELF file. Symbols with no name, undefined
symbols and symbols named ``@code{_START_}'' and ``@code{_END_}'' are skipped
and never appear in either section. Symbols that have no corresponding type are
represented by type ID 0. The section may have fewer entries than the symbol
table, in which case no later entries have associated types. This format is
more compact than the indexed form if most entries have types (since there is no
need to record any symbol names), but if the producer and consumer disagree even
slightly about which symbols are omitted, the types of all further symbols will
be wrong!
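
The following sketch shows one reading of these rules for the data object
section. The @code{ex_sym} structure is a hypothetical, simplified view of an
ELF symbol invented for this example; it is not a real ELF or @code{libctf}
type, and the skipping rules are taken only from the paragraph above.

@verbatim
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of an ELF symbol.  */
struct ex_sym
{
  const char *name;	/* Symbol name ("" if unnamed).  */
  int is_object;	/* Nonzero if the symbol is of type STT_OBJECT.  */
  int is_defined;	/* Zero for undefined symbols.  */
  uint64_t value;	/* Symbol value.  */
};

/* Nonzero if SYM occupies a slot in an unindexed data object section.  */
static int
sym_has_slot (const struct ex_sym *sym)
{
  return sym->is_object && sym->is_defined && sym->value != 0
    && sym->name[0] != '\0'
    && strcmp (sym->name, "_START_") != 0
    && strcmp (sym->name, "_END_") != 0;
}

/* Return the type ID of symbol number SYMIDX, or 0 if it has none.  */
static uint32_t
data_object_type (const struct ex_sym *syms, size_t symidx,
		  const uint32_t *types, size_t ntypes)
{
  size_t slot = 0;

  if (!sym_has_slot (&syms[symidx]))
    return 0;

  /* Count the qualifying symbols that precede this one.  */
  for (size_t i = 0; i < symidx; i++)
    if (sym_has_slot (&syms[i]))
      slot++;

  /* A short section means later symbols have no associated type.  */
  return slot < ntypes ? types[slot] : 0;
}
@end verbatim

Any disagreement about @code{sym_has_slot} between producer and consumer shifts
every later slot, which is exactly the fragility described above.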
|
||
|
|
||
|
The compiler always emits indexed symtypetabs, because there is no symbol
table yet. The linker always has to read them all in and always works
through them from start to end, so there is no benefit in having the compiler
sort them either. The linker (actually, @code{libctf}'s linking machinery) will
automatically sort unsorted indexed sections, and convert indexed sections that
contain a lot of pads into the more compact, unindexed form.
|
||
|
|
||
|
If child dicts are in use, only symbols that use types actually mentioned in the
|
||
|
child appear in the child's symtypetab: symbols that use only types in the
|
||
|
parent appear in the parent's symtypetab instead. So the child's symtypetab will
|
||
|
almost always be very sparse, and thus will usually use the indexed form even in
|
||
|
fully linked objects. (It is, of course, impossible for symbols to exist that
|
||
|
use types from multiple child dicts at once, since it's impossible to declare a
|
||
|
function in C that uses types that are only visible in two different, disjoint
|
||
|
translation units.)
|
||
|
|
||
|
@node The variable section
|
||
|
@section The variable section
|
||
|
@cindex Variable section
|
||
|
@cindex Sections, variable
|
||
|
|
||
|
The variable section is a simple array mapping names (strtab entries) to type
|
||
|
IDs, intended to provide a replacement for the data object section in dynamic
|
||
|
situations in which there is no static ELF strtab but the consumer instead hands
|
||
|
back names. The section is sorted into ASCIIbetical order by name for rapid
|
||
|
lookup, like the CTF archive name table.
|
||
|
|
||
|
The section is an array of these structures:
|
||
|
|
||
|
@verbatim
|
||
|
typedef struct ctf_varent
|
||
|
{
|
||
|
uint32_t ctv_name;
|
||
|
uint32_t ctv_type;
|
||
|
} ctf_varent_t;
|
||
|
@end verbatim
|
||
|
|
||
|
@tindex struct ctf_varent
|
||
|
@tindex ctf_varent_t
|
||
|
@multitable {Offset} {@code{uint32_t ctv_name}} {Strtab offset of the name}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{uint32_t ctv_name}
|
||
|
@vindex ctv_name
|
||
|
@vindex struct ctf_varent, ctv_name
|
||
|
@vindex ctf_varent_t, ctv_name
|
||
|
@tab Strtab offset of the name
|
||
|
|
||
|
@item 0x04
|
||
|
@tab @code{uint32_t ctv_type}
|
||
|
@vindex ctv_type
|
||
|
@vindex struct ctf_varent, ctv_type
|
||
|
@vindex ctf_varent_t, ctv_type
|
||
|
@tab Type ID of the variable's type
|
||
|
@end multitable
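
Because the section is sorted by name, a lookup can use the standard
@code{bsearch} function. The sketch below assumes that every @code{ctv_name}
refers to one string table held in memory; the @code{var_key}, @code{var_cmp}
and @code{var_lookup} names are hypothetical and are not part of @code{libctf}.

@verbatim
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct ctf_varent
{
  uint32_t ctv_name;
  uint32_t ctv_type;
} ctf_varent_t;

/* Search key: the wanted name, plus the strtab that entries refer to.  */
struct var_key
{
  const char *name;
  const char *strtab;
};

/* bsearch comparator: the key comes first, the array element second.  */
static int
var_cmp (const void *key_, const void *ent_)
{
  const struct var_key *key = key_;
  const ctf_varent_t *ent = ent_;

  return strcmp (key->name, key->strtab + ent->ctv_name);
}

/* Return the type ID of the variable called NAME, or 0 if not found.  */
static uint32_t
var_lookup (const ctf_varent_t *vars, size_t nvars,
	    const char *strtab, const char *name)
{
  struct var_key key = { name, strtab };
  const ctf_varent_t *ent
    = bsearch (&key, vars, nvars, sizeof (ctf_varent_t), var_cmp);

  return ent != NULL ? ent->ctv_type : 0;
}
@end verbatim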
|
||
|
|
||
|
There is no analogue of the function info section yet: v4 will probably drop
|
||
|
this section in favour of a way to put both indexed (thus, named) and nonindexed
|
||
|
symbols into the symtypetab sections at the same time.
|
||
|
|
||
|
@node The label section
|
||
|
@section The label section
|
||
|
@cindex Label section
|
||
|
@cindex Sections, label
|
||
|
|
||
|
The label section is a currently-unused facility allowing the tiling of the type
|
||
|
space with names taken from the strtab. The section is an array of these
|
||
|
structures:
|
||
|
|
||
|
@verbatim
|
||
|
typedef struct ctf_lblent
|
||
|
{
|
||
|
uint32_t ctl_label;
|
||
|
uint32_t ctl_type;
|
||
|
} ctf_lblent_t;
|
||
|
@end verbatim
|
||
|
|
||
|
@tindex struct ctf_lblent
|
||
|
@tindex ctf_lblent_t
|
||
|
@multitable {Offset} {@code{uint32_t ctl_label}} {Strtab offset of the label}
|
||
|
@headitem Offset @tab Name @tab Description
|
||
|
@item 0x00
|
||
|
@tab @code{uint32_t ctl_label}
|
||
|
@vindex ctl_label
|
||
|
@vindex struct ctf_lblent, ctl_label
|
||
|
@vindex ctf_lblent_t, ctl_label
|
||
|
@tab Strtab offset of the label
|
||
|
|
||
|
@item 0x04
|
||
|
@tab @code{uint32_t ctl_type}
|
||
|
@vindex ctl_type
|
||
|
@vindex struct ctf_lblent, ctl_type
|
||
|
@vindex ctf_lblent_t, ctl_type
|
||
|
@tab Type ID of the last type covered by this label
|
||
|
@end multitable
|
||
|
|
||
|
Semantics will be attached to labels soon, probably in v4 (the plan is to use
|
||
|
them to allow multiple disjoint namespaces in a single CTF file, removing many
|
||
|
uses of CTF archives, in particular in the @code{.ctf} section in ELF objects).
|
||
|
|
||
|
@node The string section
|
||
|
@section The string section
|
||
|
@cindex String section
|
||
|
@cindex Sections, string
|
||
|
|
||
|
This section is a simple ELF-format strtab, starting with a zero byte (thus
|
||
|
ensuring that the string with offset 0 is the null string, as assumed elsewhere
|
||
|
in this spec). The strtab is usually ASCIIbetically sorted to somewhat improve
|
||
|
compression efficiency.
|
||
|
|
||
|
Where the strtab is unusual is in the @emph{references} to it. CTF has two
string tables, the internal strtab and an external strtab associated
with the CTF dictionary at open time: usually, this is the ELF dynamic
strtab (@code{.dynstr}) of a CTF dictionary embedded in an ELF file. We
distinguish between these strtabs by the most significant bit, bit 31,
of the 32-bit strtab references: if it is 0, the offset is in the
internal strtab; if it is 1, the offset is in the external strtab.
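
Decoding such a reference might look like the sketch below. The function name
and its parameters are hypothetical, chosen for illustration only, and the
bounds check is an extra precaution rather than something this format requires.

@verbatim
#include <stdint.h>
#include <stddef.h>

/* Resolve the 32-bit strtab reference REF against the internal or the
   external strtab.  Returns NULL if the selected strtab is missing or
   the offset lies outside it.  */
static const char *
strtab_resolve (uint32_t ref,
		const char *internal, size_t internal_len,
		const char *external, size_t external_len)
{
  uint32_t offset = ref & 0x7fffffffu;	/* Low 31 bits: the offset.  */
  int is_external = (ref >> 31) != 0;	/* Bit 31: which strtab.  */
  const char *tab = is_external ? external : internal;
  size_t len = is_external ? external_len : internal_len;

  if (tab == NULL || offset >= len)
    return NULL;
  return tab + offset;
}
@end verbatim

A reference of 0 always resolves to the null string at the start of the
internal strtab.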
|
||
|
|
||
|
@tindex CTF_F_DYNSTR
|
||
|
@cindex Bug workarounds, CTF_F_DYNSTR
|
||
|
There is a bug workaround in this area: in format v3 (the first version
|
||
|
to have working support for external strtabs), the external strtab is
|
||
|
@code{.strtab} unless the @code{CTF_F_DYNSTR} flag is set on the
|
||
|
dictionary (@pxref{CTF file-wide flags}). Format v4 will introduce a
|
||
|
header field that explicitly names the external strtab, making this flag
|
||
|
unnecessary.
|
||
|
|
||
|
@node Data models
|
||
|
@section Data models
|
||
|
@cindex Data models
|
||
|
|
||
|
The data model is a simple integer which indicates the ABI in use on this
platform. Right now, it is very simple, distinguishing only between 32- and
64-bit types: a model of 1 indicates ILP32, 2 indicates LP64. The mapping from
ABI integer to type sizes is hardwired into @code{libctf}: currently, we use
this to hardwire the size of pointers, function pointers, and enumerated types.
|
||
|
|
||
|
This is a very kludgy corner of CTF and will probably be replaced with explicit
|
||
|
header fields to record this sort of thing in future.
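
The hardwired mapping can be pictured as a small lookup. This sketch covers
pointer size only, with the sizes following from the definitions of ILP32 and
LP64; the function name is hypothetical and is not part of @code{libctf}.

@verbatim
#include <stddef.h>

/* Return the size of a pointer in bytes for data model MODEL, or 0 if
   the model is not recognized.  */
static size_t
model_pointer_size (int model)
{
  switch (model)
    {
    case 1:			/* ILP32: 32-bit pointers.  */
      return 4;
    case 2:			/* LP64: 64-bit pointers.  */
      return 8;
    default:
      return 0;
    }
}
@end verbatim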
|
||
|
|
||
|
@node Limits of CTF
|
||
|
@section Limits of CTF
|
||
|
@cindex Limits
|
||
|
|
||
|
The following limits are imposed by various aspects of CTF version 3:
|
||
|
|
||
|
@table @code
|
||
|
@item CTF_MAX_TYPE
|
||
|
Maximum type identifier (maximum number of types accessible with parent and
child dictionaries in use): 0xfffffffe
|
||
|
@item CTF_MAX_PTYPE
|
||
|
Maximum type identifier in a parent dictionary, and maximum number of types in
any one dictionary: 0x7fffffff
|
||
|
@item CTF_MAX_NAME
|
||
|
Maximum offset into a string table: 0x7fffffff
|
||
|
@item CTF_MAX_VLEN
|
||
|
Maximum number of members in a struct, union, or enum, and maximum number of
function arguments: 0xffffff
|
||
|
@item CTF_MAX_SIZE
|
||
|
Maximum size of a @code{ctf_stype_t} in bytes before we fall back to
|
||
|
@code{ctf_type_t}: 0xfffffffe bytes
|
||
|
@end table
|
||
|
|
||
|
Other maxima without associated macros:
|
||
|
@itemize
|
||
|
@item
|
||
|
Maximum value of an enumerated type: 2^32
|
||
|
@item
|
||
|
Maximum size of an array element: 2^32
|
||
|
@end itemize
|
||
|
|
||
|
These maxima are generally considered to be too low, because C programs can and
|
||
|
do exceed them: they will be lifted in format v4.
|
||
|
|
||
|
@node Index
|
||
|
@unnumbered Index
|
||
|
|
||
|
@printindex cp
|
||
|
|
||
|
@bye
|