1 \chapter[Compression (Informative)]{DWARF Compression and Duplicate Elimination (Informative)}
2 \label{dwarfcompressionandduplicateeliminationinformative}
4 % It seemed difficult to get close to the same layout and
5 % captioning as DWARF4 here with figures as they moved (floated)
6 % making it hard to follow. Hence this uses fewer figures.
9 \addtoindexx{DWARF compression}
11 \addtoindexx{DWARF duplicate elimination}
DWARF can use a lot of disk space.
14 This is especially true for \addtoindex{C++}, where the depth and complexity
15 of headers can mean that many, many (possibly thousands of)
16 declarations are repeated in every compilation unit. \addtoindex{C++}
17 templates can also mean that some functions and their DWARF
18 descriptions get duplicated.
20 This Appendix describes techniques for using the DWARF
21 representation in combination with features and characteristics
22 of some common object file representations to reduce redundancy
23 without losing information. It is worth emphasizing that none
24 of these techniques are necessary to provide a complete and
25 accurate DWARF description; they are solely concerned with
26 reducing the size of DWARF information.
28 The techniques described here depend more directly and more
29 obviously on object file concepts and linker mechanisms than
30 most other parts of DWARF. While the presentation tends to
31 use the vocabulary of specific systems, this is primarily to
32 aid in describing the techniques by appealing to well\dash known
33 terminology. These techniques can be employed on any system
that supports certain general functional capabilities.
38 \section{Using Compilation Units}
39 \label{app:usingcompilationunits}
42 The general approach is to break up the debug information of
43 a compilation into separate normal and partial compilation
44 units, each consisting of one or more sections. By arranging
45 that a sufficiently similar partitioning occurs in other
46 compilations, a suitable system linker can delete redundant
47 groups of sections when combining object files.
\textit{The following uses some traditional section names;
aside from the DWARF sections, the names are just meant
to suggest traditional contents as a way of explaining the
approach, not to be limiting.}
54 A traditional relocatable object output file
55 from a single compilation might contain sections
65 A relocatable object file from a compilation system
66 attempting duplicate DWARF elimination might
67 contain sections as in:
78 followed (or preceded, the order is not significant)
80 \addtoindexx{section group}
94 where each \addtoindex{section group} might or might not contain executable
95 code (\dottext{} sections) or data (\dotdata{} sections).
98 A \textit{\addtoindex{section group}} is a named set
99 of section contributions
100 within an object file with the property that the entire set
101 of section contributions must be retained or discarded as a
102 whole; no partial elimination is allowed. Section groups can
103 generally be handled by a linker in two ways:
104 \begin{enumerate}[1. ]
106 \item Given multiple identical (duplicate) section groups,
107 \addtoindexx{section group}
one of them is chosen to be kept and used, while the rest are discarded.
111 \item Given a \addtoindex{section group}
112 that is not referenced from any
section outside of the \addtoindex{section group}, the section group is discarded.
120 Which handling applies may be indicated by the
121 \addtoindex{section group}
122 itself and/or selection of certain linker options.
124 For example, if a linker determines that
\addtoindex{section group} 1 from A.o and
127 \addtoindex{section group} 3 from B.o are identical, it could
128 discard one group and arrange that all references in A.o and
129 B.o apply to the remaining one of the two identical section
130 groups. This saves space.
132 An important part of making it possible to \doublequote{redirect}
133 references to the surviving
134 \addtoindex{section group} is the use of
135 consistently chosen linker global symbols for referring to
136 locations within each
137 \addtoindex{section group}.
138 It follows that references
139 are simply to external names and the linker already knows
140 how to match up references and definitions.
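
The following minimal sketch, written in Python purely for illustration, models the first kind of handling together with the symbol-based redirection just described. The \texttt{SectionGroup} class and the \texttt{link\_groups} function are hypothetical names that belong neither to DWARF nor to any real linker, and the sketch assumes that two groups carrying the same name are identical.

```python
from dataclasses import dataclass, field


@dataclass
class SectionGroup:
    """Hypothetical model of a section group: a name plus the global symbols it defines."""
    name: str
    origin: str                                  # object file the group came from
    symbols: list = field(default_factory=list)  # linker global symbols defined in the group


def link_groups(groups):
    """Keep the first group seen for each name and bind every symbol to the survivor."""
    kept = {}      # group name -> surviving SectionGroup
    symtab = {}    # global symbol -> object file holding the surviving definition
    for group in groups:
        if group.name in kept:
            continue   # duplicate: the whole group is discarded, never just part of it
        kept[group.name] = group
        for sym in group.symbols:
            symtab[sym] = group.origin
    return kept, symtab


name = "my.compiler.company.cpp.wa.h.123456"
syms = ["DW.cpp.wa.h.123456.1", "DW.cpp.wa.h.123456.2"]
kept, symtab = link_groups([
    SectionGroup(name, "A.o", list(syms)),
    SectionGroup(name, "B.o", list(syms)),
])
print(len(kept), symtab["DW.cpp.wa.h.123456.2"])
```

Because references are made by external name, a reference from B.o to \texttt{DW.cpp.wa.h.123456.2} resolves to the copy kept from A.o without any special treatment.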
142 What is minimally needed from the object file format and system
143 linker (outside of DWARF itself, and normal object/linker
144 facilities such as simple relocations) are:
145 \begin{enumerate}[1. ]
147 \item A means to reference the \dotdebuginfo{} information
148 of one compilation unit from the \dotdebuginfo{} section of
149 another compilation unit (\DWFORMrefaddr{} provides this).
151 \item A means to combine multiple contributions to specific sections
152 (for example, \dotdebuginfo{}) into a single object file.
154 \item A means to identify a \addtoindex{section group}
157 \item A means to indicate which sections go together to make
158 up a \addtoindex{section group}, so that the group can be
159 treated as a unit (kept or discarded).
161 \item A means to indicate how each \addtoindex{section group}
162 should be processed by the linker.
166 \textit{The notion of section and section contribution used here
167 corresponds closely to the similarly named concepts in the
168 ELF object file representation.
169 The notion of \addtoindex{section group} is
an abstraction of common extensions of the ELF representation known as
172 \doublequote{\COMDAT{}s} or \doublequote{\COMDAT{} sections.} (Other
173 object file representations provide \COMDAT{}\dash style mechanisms as
174 well.) There are several variations in the \COMDAT{} schemes in
common use, any of which should be sufficient for the purposes of the
177 \addtoindexx{duplication elimination|see{DWARF duplicate elimination}}
178 DWARF duplicate elimination techniques described here.}
180 \subsection{Naming and Usage Considerations}
181 \label{app:namingandusageconsiderations}
183 A precise description of the means of deriving names usable
184 by the linker to access DWARF entities is not part of this
185 specification. Nonetheless, an outline of a usable approach
is given here to make this more understandable and to guide implementors.
189 Implementations should clearly document their naming conventions.
In the following, it will be helpful to refer to the examples in
Figure \ref{fig:duplicateeliminationexample1csource}
through
Figure \ref{fig:duplicateeliminationexample2companiondwarf}
of
Section \refersec{app:examples}.
199 \textbf{Section Group Names}
201 Section groups must have a \addtoindex{section group} name.
202 \addtoindexx{section group!name}
204 \addtoindex{C++} example, a name like
206 <producer-prefix>.<file-designator>.<gid-number>
212 \item [\textless producer\dash prefix\textgreater]
213 is some string specific to the
214 producer, which has a language\dash designation embedded in the
215 name when appropriate. (Alternatively, the language name
216 could be embedded in the
217 \textless gid\dash number\textgreater).
220 \item [\textless file\dash designator\textgreater]
221 names the file, such as wa.h in
225 \item [\textless gid\dash number\textgreater]
226 is a string generated to identify the
227 specific wa.h header file in such a way that
\item a `matching' output from another compile generates the same
233 \textless gid\dash number\textgreater,
236 \item a non\dash matching output (say because of \texttt{\#defines})
237 generates a different
238 \textless gid\dash number\textgreater.
243 \textit{It may be useful to think of a
\textless gid\dash number\textgreater\ as a kind
246 of \doublequote{digital signature} that allows a fast test for the
248 \addtoindexx{section group}
251 So, for example, the \addtoindex{section group}
252 corresponding to file wa.h
253 above is given the name \texttt{my.compiler.company.cpp.wa.h.123456}.
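
As a rough illustration of the naming scheme, the Python sketch below composes a section group name from its three parts. Deriving the gid number from a hash of the header's preprocessed text is only one possible choice and is an assumption of this sketch, as are the function names \texttt{gid\_number} and \texttt{section\_group\_name}; this document does not prescribe how the gid number is produced.

```python
import hashlib


def gid_number(preprocessed_text: str) -> str:
    """Hypothetical gid-number: a short hex digest of the header's preprocessed text."""
    return hashlib.md5(preprocessed_text.encode("utf-8")).hexdigest()[:8]


def section_group_name(producer_prefix: str, file_designator: str, text: str) -> str:
    """Compose <producer-prefix>.<file-designator>.<gid-number>."""
    return f"{producer_prefix}.{file_designator}.{gid_number(text)}"


prefix = "my.compiler.company.cpp"   # producer prefix with the language embedded
same_a = section_group_name(prefix, "wa.h", "struct A { int i; };")
same_b = section_group_name(prefix, "wa.h", "struct A { int i; };")
other = section_group_name(prefix, "wa.h", "struct A { long i; };")

# Matching compilations agree on the name; a changed header gets a different one.
print(same_a == same_b, same_a == other)
```

Hashing the text after preprocessing means that a header whose contents change under different \texttt{\#define}s is given a different name, so the linker does not treat the two outputs as duplicates.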
257 \textbf{Debugging Information Entry Names}
260 \addtoindexx{debugging information entry!ownership relation}
261 debugging information entries (the need for which is explained
262 below) within a \addtoindex{section group}
263 can be given names of the form
266 <prefix>.<file-designator>.<gid-number>.<die-number>
272 my.compiler.company.wa.h.123456.987
277 \item [\textless prefix\textgreater]
278 distinguishes this as a DWARF debug info name, and should identify the producer
279 and, when appropriate, the language.
280 \item [\textless file\dash designator\textgreater]
282 \texttt{\textless gid\dash number\textgreater}
285 \item [\textless die\dash number\textgreater]
286 could be a number sequentially assigned
287 to entities (tokens, perhaps) found
292 In general, every point in the
293 \addtoindexx{section group}
296 could be referenced from outside by \emph{any} compilation unit must
297 normally have an external name generated for it in the linker
298 symbol table, whether the current compilation references all
301 \textit{The completeness of the set of names generated is a
302 quality\dash of\dash implementation issue.}
304 It is up to the producer to ensure that if
305 \textless die\dash numbers\textgreater\
306 in separate compilations would not match properly then a
308 \textless gid\dash number\textgreater\
312 \addtoindexx{section group}
313 section groups that are designated as
314 duplicate\dash removal\dash applies actually require the
316 <prefix>.<file-designator>.<gid-number>.<die-number>
318 external labels for debugging information entries as all other
319 \addtoindex{section group} sections can use 'local' labels
320 (section\dash relative
323 (This is a consequence of separate compilation, not a rule
324 imposed by this document.)
326 \textit{Local labels use references with form \DWFORMreffour{}
329 (These are affected by relocations
334 normally not usable and
335 \DWFORMrefaddr{} is not necessary
339 \subsubsection{Use of \addtoindex{DW\_TAG\_compile\_unit} versus
340 \addtoindex{DW\_TAG\_partial\_unit}}
A \addtoindex{section group} compilation unit that uses
\DWTAGcompileunit{}
is like any other compilation unit, in that its contents
are evaluated by consumers as though it were an ordinary
compilation unit.
348 An \#include directive appearing outside any other
declarations is a good candidate to be represented using
\DWTAGcompileunit.
351 However, an \#include appearing inside
352 a \addtoindex{C++} namespace declaration or a function, for example, is
353 not a good candidate because the entities included are not
354 necessarily file level entities.
356 This also applies to \addtoindex{Fortran} INCLUDE lines when declarations
357 are included into a subprogram or module context.
359 Consequently a compiler must use \DWTAGpartialunit{} (instead
360 of \DWTAGcompileunit)
361 in a \addtoindex{section group}
362 whenever the section group
363 contents are not necessarily globally visible.
This directs consumers to ignore that compilation unit when scanning
366 top level declarations and definitions.
368 The \DWTAGpartialunit{} compilation unit will be referenced
369 from elsewhere and the referencing locations give the
appropriate context for interpreting the partial compilation
unit.
373 A \DWTAGpartialunit{} entry may have, as appropriate, any of
374 the attributes assigned to a \DWTAGcompileunit.
377 \subsubsection{Use of DW\_TAG\_imported\_unit}
379 A \DWTAGimportedunit{} debugging information entry has an
380 \DWATimport{} attribute referencing a \DWTAGcompileunit{} or
381 \DWTAGpartialunit{} debugging information entry.
A \DWTAGimportedunit{} debugging information entry refers to a
\DWTAGcompileunit{} or
\DWTAGpartialunit{} debugging
information entry to specify that the
\DWTAGcompileunit{} or
\DWTAGpartialunit{} contents logically appear at the point of the
\DWTAGimportedunit{} entry.
394 \subsubsection{Use of DW\_FORM\_ref\_addr}
Use \DWFORMrefaddr{} to reference from one compilation
unit's debugging information entries to those of another
compilation unit.
402 When referencing into a removable \addtoindex{section group}
404 from another \dotdebuginfo{} (from anywhere), the
406 <prefix>.<file-designator>.<gid-number>.<die-number>
408 name should be used for an external symbol and a relocation
409 generated based on that name.
412 \textit{When referencing into a
413 \addtoindexx{section group}
416 from another \dotdebuginfo{} (from anywhere)
418 still the form to be used, but a section\dash relative relocation
419 generated by use of a non-exported name (often called an
420 \doublequote{internal name}) may be used for references within the
423 \subsection{Examples}
426 This section provides several
427 \addtoindexx{DWARF duplicate elimination!examples}
428 examples in order to have a
429 concrete basis for discussion.
431 In these examples, the focus is on the arrangement of DWARF
information into sections (specifically the \dotdebuginfo{}
434 section) and the naming conventions used to achieve references
436 \addtoindexx{section group}
438 In practice, all of the examples that
follow involve DWARF sections other than just \dotdebuginfo{}
441 (for example, \dotdebugline{},
442 \dotdebugaranges{}, or others);
443 however, only the \dotdebuginfo{}
444 section is shown to keep the
445 examples compact and easier to read.
447 The grouping of sections into a named set is shown, but the means for achieving this in terms of
448 the underlying object language is not (and varies from system to system).
450 \subsubsection{C++ Example}
The \addtoindex{C++} source in
453 \addtoindexx{DWARF duplicate elimination!examples}
455 Figure \refersec{fig:duplicateeliminationexample1csource}
456 is used to illustrate the DWARF
457 representation intended to allow duplicate elimination.
461 \begin{lstlisting}[numbers=none]
467 \begin{lstlisting}[numbers=none]
475 \caption{Duplicate elimination example \#1: C++ Source}
476 \label{fig:duplicateeliminationexample1csource}
479 Figure \refersec{fig:duplicateeliminationexample1dwarfsectiongroup}
shows the \addtoindex{section group} corresponding to the included file
wa.h.
485 % FIXME: the DWFORMrefn could use rethinking
487 ==== Section group name:
488 my.compiler.company.cpp.wa.h.123456
489 == section \dotdebuginfo{}
490 DW.cpp.wa.h.123456.1: ! linker global symbol
492 \DWATlanguage(\DWLANGCplusplus)
493 ... ! other unit attributes
494 DW.cpp.wa.h.123456.2: ! linker global symbol
497 DW.cpp.wa.h.123456.3: ! linker global symbol
500 DW.cpp.wa.h.123456.4: ! linker global symbol
503 \DWATtype(\DWFORMrefn to DW.cpp.wa.h.123456.2)
504 ! (This is a local reference, so the more
505 ! compact form \DWFORMrefn
506 ! for n = 1,2,4, or 8 can be used)
510 \caption{Duplicate elimination example \#1: DWARF section group}
511 \label{fig:duplicateeliminationexample1dwarfsectiongroup}
514 Figure \refersec{fig:duplicateeliminationexample1primarycompilationunit}
515 shows the \doublequote{normal} DWARF sections, which are not part of
516 any \addtoindex{section group},
517 and how they make use of the information
518 in the \addtoindex{section group} shown above.
523 == section \dottext{}
524 [generated code for function f]
525 == section \dotdebuginfo{}
527 .L1: ! local (non-linker) symbol
529 \DWATtype(reference to DW.cpp.wa.h.123456.3)
532 \DWATtype(reference to DW.cpp.wa.h.123456.2)
535 \DWATtype(reference to .L1)
539 \caption{Duplicate elimination example \#1: primary compilation unit}
540 \label{fig:duplicateeliminationexample1primarycompilationunit}
544 This example uses \DWTAGcompileunit{}
545 for the \addtoindex{section group},
546 implying that the contents of the compilation unit are
547 globally visible (in accordance with
548 \addtoindex{C++} language rules).
\DWTAGpartialunit{} is not needed for the same reason.
553 \subsubsection{C Example}
555 The \addtoindex{C++} example
556 \addtoindexx{DWARF duplicate elimination!examples}
557 in this Section might appear to be equally
558 valid as a \addtoindex{C} example. However, for \addtoindex{C}
it is prudent to include a \DWTAGimportedunit{} in the primary unit
561 (see Figure \refersec{fig:duplicateeliminationexample1primarycompilationunit})
562 as well as an \DWATimport{} attribute that refers to the proper unit
563 in the \addtoindex{section group}.
566 \textit{The \addtoindex{C} rules for consistency of global (file scope) symbols
567 across compilations are less strict than for \addtoindex{C++}; inclusion
568 of the import unit attribute assures that the declarations of
569 the proper \addtoindex{section group} are considered before declarations
570 from other compilations.}
573 \subsubsection{Fortran Example}
For a \addtoindex{Fortran} example, consider
577 \addtoindexx{DWARF duplicate elimination!examples}
579 Figure \refersec{fig:duplicateeliminationexample2fortransource}.
582 \textit{File CommonStuff.f\hspace{1pt}h}
583 \addtoindexx{Fortran}
584 \begin{lstlisting}[numbers=none]
585 IMPLICIT INTEGER(A-Z)
586 COMMON /Common1/ C(100)
591 \begin{lstlisting}[numbers=none]
593 INCLUDE 'CommonStuff.fh'
598 \caption{Duplicate elimination example \#2: Fortran source}
599 \label{fig:duplicateeliminationexample2fortransource}
603 Figure \refersec{fig:duplicateeliminationexample2dwarfsectiongroup}
604 shows the \addtoindex{section group}
corresponding to the included file CommonStuff.fh.
606 \addtoindexx{Fortran example}
612 ==== Section group name:
614 my.f90.company.f90.CommonStuff.fh.654321
616 == section \dotdebuginfo{}
618 DW.myf90.CommonStuff.fh.654321.1: ! linker global symbol
620 ! ...compilation unit attributes, including...
621 \DWATlanguage(\DWLANGFortranninety)
622 \DWATidentifiercase(\DWIDcaseinsensitive)
624 DW.myf90.CommonStuff.fh.654321.2: ! linker global symbol
627 \DWATtype(reference to DW.f90.F90\$main.f.2)
630 \DWATtype(reference to DW.f90.F90\$main.f.2)
632 \DWATlowerbound(constant 1)
633 \DWATupperbound(constant 100)
635 DW.myf90.CommonStuff.fh.654321.3: ! linker global symbol
638 \DWATlocation(Address of common \nolink{block} Common1)
641 \DWATtype(reference to 3\$)
642 \DWATlocation(address of C)
644 DW.myf90.CommonStuff.fh.654321.4: ! linker global symbol
647 \DWATtype(reference to DW.f90.F90\$main.f.2)
649 \DWATconstvalue(constant 7)
652 \caption{Duplicate elimination example \#2: DWARF section group}
653 \label{fig:duplicateeliminationexample2dwarfsectiongroup}
656 Figure \refersec{fig:duplicateeliminationexample2primaryunit}
657 shows the sections for the primary compilation unit.
662 == section \dottext{}
663 [code for function Foo]
665 == section \dotdebuginfo{}
669 \DWATtype(reference to DW.f90.F90\$main.f.2)
672 \DWATimport(reference to
673 DW.myf90.CommonStuff.fh.654321.1)
674 \DWTAGcommoninclusion ! For Common1
675 \DWATcommonreference(reference to
676 DW.myf90.CommonStuff.fh.654321.3)
677 \DWTAGvariable ! For function result
679 \DWATtype(reference to DW.f90.F90\$main.f.2)
683 \caption{Duplicate elimination example \#2: primary unit}
684 \label{fig:duplicateeliminationexample2primaryunit}
687 A companion main program is shown in
Figure \refersec{fig:duplicateeliminationexample2companionsource}.
692 \begin{lstlisting}[numbers=none]
693 INCLUDE 'CommonStuff.fh'
695 PRINT *, 'Result = ', FOO(50 - SEVEN)
698 \caption{Duplicate elimination example \#2: companion source }
699 \label{fig:duplicateeliminationexample2companionsource}
That main program results in an object file that
contains a duplicate of the \addtoindex{section group} named
\texttt{my.f90.company.f90.CommonStuff.fh.654321}
corresponding to the
included file as well as the remainder of the main subprogram
as shown in
Figure \refersec{fig:duplicateeliminationexample2companiondwarf}.
714 == section \dotdebuginfo{}
719 \DWATencoding(\DWATEsigned)
724 ... ! other base types
726 \DWATname("F90\$main")
728 \DWATimport(reference to
729 DW.myf90.CommonStuff.fh.654321.1)
730 \DWTAGcommoninclusion ! for Common1
731 \DWATcommonreference(reference to
732 DW.myf90.CommonStuff.fh.654321.3)
736 \caption{Duplicate elimination example \#2: companion DWARF }
737 \label{fig:duplicateeliminationexample2companiondwarf}
740 This example uses \DWTAGpartialunit{} for the \addtoindex{section group}
741 because the included declarations are not independently
742 visible as global entities.
745 \section{Using Type Units}
746 \label{app:usingtypeunits}
748 A large portion of debug information is type information, and
749 in a typical compilation environment, many types are duplicated
750 many times. One method of controlling the amount of duplication
751 is separating each type into a separate
752 \COMDAT{} \dotdebuginfo{} section
753 and arranging for the linker to recognize and eliminate
754 duplicates at the individual type level.
756 Using this technique, each substantial type definition is
757 placed in its own individual section, while the remainder
758 of the DWARF information (non-type information, incomplete
759 type declarations, and definitions of trivial types) is
760 placed in the usual debug information section. In a typical
761 implementation, the relocatable object file may contain one
762 of each of these debug sections:
770 and any number of additional \COMDAT{} \dotdebuginfo{} sections
771 containing type units.
774 As discussed in the previous section
775 (Section \refersec{app:usingcompilationunits}),
777 linkers today support the concept of a \COMDAT{} group or
778 linkonce section. The general idea is that a \doublequote{key} can be
779 attached to a section or a group of sections, and the linker
780 will include only one copy of a \addtoindex{section group}
781 (or individual section) for any given key.
782 For \COMDAT{} \dotdebuginfo{} sections, the
783 key is the \addtoindex{type signature}
784 formed from the algorithm given in
785 Section \refersec{datarep:typesignaturecomputation}.
787 \subsection{Signature Computation Example}
788 \label{app:signaturecomputationexample}
791 \addtoindexx{type signature!example computation}
As an example, consider a \addtoindex{C++} header file
793 containing the type definitions shown
794 in Figure \refersec{fig:typesignatureexamplescsource}.
819 \caption{Type signature examples: C++ source}
820 \label{fig:typesignatureexamplescsource}
823 Next, consider one possible representation of the DWARF
information that describes the type \doublequote{struct C} as shown
in Figure
826 \refersec{fig:typesignaturecomputation1dwarfrepresentation}.
830 % We keep the : (colon) away from the attribute so tokenizing in the python tools
831 % does not result in adding : into the attribute name.
834 \DWATlanguage : \DWLANGCplusplus (4)
847 \DWATtype : reference to L2
848 \DWATdatamemberlocation : 0
853 \DWATtype : reference to L2
854 \DWATdatamemberlocation : 4
858 \DWATencoding : \DWATEsigned
862 \caption{Type signature computation \#1: DWARF representation}
863 \label{fig:typesignaturecomputation1dwarfrepresentation}
867 In computing a signature for the type \texttt{N::C}, flatten the type
868 \addtoindexx{type signature}
description into a byte stream according to the procedure
outlined in
871 Section \refersec{datarep:typesignaturecomputation}.
872 The result is shown in
873 Figure \refersec{fig:typesignaturecomputation1flattenedbytestream}.
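
To make the flattening concrete, the following Python sketch builds the first few items of such a byte stream. It is an illustration under stated assumptions rather than a reference implementation: the helper names (\texttt{uleb128}, \texttt{sleb128}, \texttt{context}, \texttt{entry}, \texttt{attr\_string}, \texttt{attr\_sdata}) are hypothetical, and only the handful of DWARF codes needed for the example are defined.

```python
# DWARF codes used in this sketch (tag, attribute and form encodings).
DW_TAG_structure_type = 0x13
DW_TAG_namespace = 0x39
DW_AT_name = 0x03
DW_AT_byte_size = 0x0B
DW_FORM_string = 0x08
DW_FORM_sdata = 0x0D


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7                   # arithmetic shift keeps the sign
        sign_bit = bool(byte & 0x40)
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def context(tag: int, name: str) -> bytes:
    """Step 2: 'C', the tag code, and the null-terminated name of an enclosing scope."""
    return b"C" + uleb128(tag) + name.encode("utf-8") + b"\x00"


def entry(tag: int) -> bytes:
    """Step 3: 'D' followed by the tag code of the entry itself."""
    return b"D" + uleb128(tag)


def attr_string(at: int, text: str) -> bytes:
    """Step 4: 'A', attribute code, DW_FORM_string, null-terminated value."""
    return b"A" + uleb128(at) + uleb128(DW_FORM_string) + text.encode("utf-8") + b"\x00"


def attr_sdata(at: int, value: int) -> bytes:
    """Step 4: 'A', attribute code, DW_FORM_sdata, SLEB128 value."""
    return b"A" + uleb128(at) + uleb128(DW_FORM_sdata) + sleb128(value)


stream = (context(DW_TAG_namespace, "N")
          + entry(DW_TAG_structure_type)
          + attr_string(DW_AT_name, "C")
          + attr_sdata(DW_AT_byte_size, 8))
print(" ".join(f"0x{b:02x}" for b in stream))
```

The bytes produced for the name attribute, \texttt{0x41 0x03 0x08 0x43 0x00}, agree with the corresponding line of the flattened byte stream figure.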
878 // Step 2: 'C' \DWTAGnamespace "N"
880 // Step 3: 'D' \DWTAGstructuretype
882 // Step 4: 'A' \DWATname \DWFORMstring "C"
883 0x41 0x03 0x08 0x43 0x00
884 // Step 4: 'A' \DWATbytesize \DWFORMsdata 8
886 // Step 7: First child ("x")
887 // Step 3: 'D' \DWTAGmember
889 // Step 4: 'A' \DWATname \DWFORMstring "x"
890 0x41 0x03 0x08 0x78 0x00
891 // Step 4: 'A' \DWATdatamemberlocation \DWFORMsdata 0
893 // Step 6: 'T' \DWATtype (type \#2)
895 // Step 3: 'D' \DWTAGbasetype
897 // Step 4: 'A' \DWATname \DWFORMstring "int"
898 0x41 0x03 0x08 0x69 0x6e 0x74 0x00
899 // Step 4: 'A' \DWATbytesize \DWFORMsdata 4
901 // Step 4: 'A' \DWATencoding \DWFORMsdata \DWATEsigned
903 // Step 7: End of \DWTAGbasetype "int"
905 // Step 7: End of \DWTAGmember "x"
907 // Step 7: Second child ("y")
908 // Step 3: 'D' \DWTAGmember
910 // Step 4: 'A' \DWATname \DWFORMstring "y"
0x41 0x03 0x08 0x79 0x00
912 // Step 4: 'A' \DWATdatamemberlocation \DWFORMsdata 4
914 // Step 6: 'R' \DWATtype (type \#2)
916 // Step 7: End of \DWTAGmember "y"
918 // Step 7: End of \DWTAGstructuretype "C"
922 \caption{Type signature computation \#1: flattened byte stream}
923 \label{fig:typesignaturecomputation1flattenedbytestream}
927 Running an \MDfive{} hash over this byte stream, and taking the
low\dash order 64 bits, yields the final signature: 0xd28081e8 dcf5070a.
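
The hashing step can be sketched in Python as follows. The function name \texttt{type\_signature} is hypothetical, and the sketch assumes that the low\dash order 64 bits are the last eight bytes of the 16\dash byte \MDfive{} digest, read here as a big\dash endian integer for display; the byte order a producer uses when storing the signature is not addressed by this sketch.

```python
import hashlib


def type_signature(flattened: bytes) -> int:
    """Return a 64-bit signature taken from the MD5 digest of the flattened stream.

    Assumption of this sketch: the low-order 64 bits are the last 8 bytes of the
    digest, interpreted as a big-endian integer.
    """
    digest = hashlib.md5(flattened).digest()   # always 16 bytes
    return int.from_bytes(digest[8:], "big")


# Demonstrated on an empty stream, whose MD5 digest is well known.
sig = type_signature(b"")
print(f"0x{sig >> 32:08x} {sig & 0xFFFFFFFF:08x}")   # prints 0xe9800998 ecf8427e
```

Passing the complete flattened byte stream for a type, rather than the empty stream used here, gives that type's signature under the same assumption.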
931 Next, consider a representation of the DWARF information that
932 describes the type \doublequote{class A} as shown in
933 Figure \refersec{fig:typesignaturecomputation2dwarfrepresentation}.
940 \DWATlanguage : \DWLANGCplusplus (4)
953 \DWATtype : reference to L2
954 \DWATdatamemberlocation : 0
955 \DWATaccessibility : \DWACCESSprivate
960 \DWATtype : reference to L3
961 \DWATdatamemberlocation : 4
962 \DWATaccessibility : \DWACCESSprivate
967 \DWATtype : reference to L4
968 \DWATdatamemberlocation : 8
969 \DWATaccessibility : \DWACCESSprivate
974 \DWATtype : 0xd28081e8 dcf5070a (signature for struct C)
975 \DWATdatamemberlocation : 12
976 \DWATaccessibility : \DWACCESSprivate
979 \caption{Type signature computation \#2: DWARF representation}
980 \label{fig:typesignaturecomputation2dwarfrepresentation}
993 \DWTAGformalparameter
994 \DWATtype : reference to L3
996 \DWTAGformalparameter
997 \DWATtype : reference to L2
1003 \DWATtype : reference to L2
1004 \DWATdeclaration : 1
1005 \DWTAGformalparameter
1006 \DWATtype : reference to L3
1011 \DWATencoding : \DWATEsigned
1015 \DWATtype : reference to L1
1018 \DWATtype : reference to L5
1024 \DWATdeclaration : 1
1029 Figure~\ref{fig:typesignaturecomputation2dwarfrepresentation}: Type signature computation \#2: DWARF representation \textit{(concluded)}
In this example, the types \texttt{N::A} and \texttt{N::C} have each
1034 been placed in separate
1035 \addtoindexx{type unit}
1036 type units. For \texttt{N::A}, the actual
1037 definition of the type begins at label L1. The definition
1038 involves references to the \texttt{int} base type and to two pointer
1039 types. The information for each of these referenced types is
1040 also included in this \addtoindex{type unit},
1041 since base types and pointer
1042 types are trivial types that are not worth the overhead of a
1043 separate \addtoindex{type unit}.
1044 The last pointer type contains a reference
1045 to an incomplete type \texttt{N::B}, which is also included here as
1046 a declaration, since the complete type is unknown and its
1047 signature is therefore unavailable. There is also a reference
1048 to \texttt{N::C}, using
1049 \DWFORMrefsigeight{} to
refer to the type signature for that type.
1051 \addtoindexx{type signature}
1057 % DWARF4 had a \DWATnamespace{} below,
1058 % but this error is fixed here to be \DWTAGnamespace.
1060 // Step 2: 'C' \DWTAGnamespace "N"
1062 // Step 3: 'D' \DWTAGclasstype
1064 // Step 4: 'A' \DWATname \DWFORMstring "A"
1065 0x41 0x03 0x08 0x41 0x00
1066 // Step 4: 'A' \DWATbytesize \DWFORMsdata 20
1068 // Step 7: First child ("v\_")
1069 // Step 3: 'D' \DWTAGmember
1071 // Step 4: 'A' \DWATname \DWFORMstring "v\_"
1072 0x41 0x03 0x08 0x76 0x5f 0x00
1073 // Step 4: 'A' \DWATaccessibility \DWFORMsdata \DWACCESSprivate
1075 // Step 4: 'A' \DWATdatamemberlocation \DWFORMsdata 0
1077 // Step 6: 'T' \DWATtype (type \#2)
1079 // Step 3: 'D' \DWTAGbasetype
1081 // Step 4: 'A' \DWATname \DWFORMstring "int"
1082 0x41 0x03 0x08 0x69 0x6e 0x74 0x00
1083 // Step 4: 'A' \DWATbytesize \DWFORMsdata 4
1085 // Step 4: 'A' \DWATencoding \DWFORMsdata \DWATEsigned
1087 // Step 7: End of \DWTAGbasetype "int"
1089 // Step 7: End of \DWTAGmember "v\_"
1091 // Step 7: Second child ("next")
1092 // Step 3: 'D' \DWTAGmember
1094 // Step 4: 'A' \DWATname \DWFORMstring "next"
1095 0x41 0x03 0x08 0x6e 0x65 0x78 0x74 0x00
1096 // Step 4: 'A' \DWATaccessibility \DWFORMsdata \DWACCESSprivate
1098 // Step 4: 'A' \DWATdatamemberlocation \DWFORMsdata 4
1102 \caption{Type signature example \#2: flattened byte stream}
1103 \label{fig:typesignatureexample2flattenedbytestream}
1110 // Step 6: 'T' \DWATtype (type \#3)
1112 // Step 3: 'D' \DWTAGpointertype
1114 // Step 5: 'N' \DWATtype
1116 // Step 5: 'C' \DWTAGnamespace "N" 'E'
1117 0x43 0x39 0x4e 0x00 0x45
1120 // Step 7: End of \DWTAGpointertype
1122 // Step 7: End of \DWTAGmember "next"
1124 // Step 7: Third child ("bp")
1125 // Step 3: 'D' \DWTAGmember
1127 // Step 4: 'A' \DWATname \DWFORMstring "bp"
1128 0x41 0x03 0x08 0x62 0x70 0x00
1129 // Step 4: 'A' \DWATaccessibility \DWFORMsdata \DWACCESSprivate
1131 // Step 4: 'A' \DWATdatamemberlocation \DWFORMsdata 8
1133 // Step 6: 'T' \DWATtype (type \#4)
1135 // Step 3: 'D' \DWTAGpointertype
1137 // Step 5: 'N' \DWATtype
1139 // Step 5: 'C' \DWTAGnamespace "N" 'E'
1140 0x43 0x39 0x4e 0x00 0x45
1143 // Step 7: End of \DWTAGpointertype
// Step 7: End of \DWTAGmember "bp"
1147 // Step 7: Fourth child ("c")
1148 // Step 3: 'D' \DWTAGmember
1150 // Step 4: 'A' \DWATname \DWFORMstring "c"
1151 0x41 0x03 0x08 0x63 0x00
1152 // Step 4: 'A' \DWATaccessibility \DWFORMsdata \DWACCESSprivate
1158 Figure~\ref{fig:typesignatureexample2flattenedbytestream}: Type signature example \#2: flattened byte stream \textit{(continued)}
1166 // Step 4: 'A' \DWATdatamemberlocation \DWFORMsdata 12
1168 // Step 6: 'T' \DWATtype (type \#5)
1170 // Step 2: 'C' \DWTAGnamespace "N"
1172 // Step 3: 'D' \DWTAGstructuretype
1174 // Step 4: 'A' \DWATname \DWFORMstring "C"
1175 0x41 0x03 0x08 0x43 0x00
1176 // Step 4: 'A' \DWATbytesize \DWFORMsdata 8
1178 // Step 7: First child ("x")
1179 // Step 3: 'D' \DWTAGmember
1181 // Step 4: 'A' \DWATname \DWFORMstring "x"
1182 0x41 0x03 0x08 0x78 0x00
1183 // Step 4: 'A' \DWATdatamemberlocation \DWFORMsdata 0
1185 // Step 6: 'R' \DWATtype (type \#2)
1187 // Step 7: End of \DWTAGmember "x"
1189 // Step 7: Second child ("y")
1190 // Step 3: 'D' \DWTAGmember
1192 // Step 4: 'A' \DWATname \DWFORMstring "y"
1193 0x41 0x03 0x08 0x79 0x00
1194 // Step 4: 'A' \DWATdatamemberlocation \DWFORMsdata 4
1196 // Step 6: 'R' \DWATtype (type \#2)
1198 // Step 7: End of \DWTAGmember "y"
1200 // Step 7: End of \DWTAGstructuretype "C"
1202 // Step 7: End of \DWTAGmember "c"
1204 // Step 7: Fifth child ("A")
1205 // Step 3: 'S' \DWTAGsubprogram "A"
1207 // Step 7: Sixth child ("v")
1208 // Step 3: 'S' \DWTAGsubprogram "v"
// Step 7: End of \DWTAGclasstype "A"
1216 Figure~\ref{fig:typesignatureexample2flattenedbytestream}: Type signature example \#2: flattened byte stream \textit{(concluded)}
1220 In computing a signature for the type \texttt{N::A}, flatten the type
description into a byte stream according to the procedure
outlined in
1223 Section \refersec{datarep:typesignaturecomputation}.
1224 The result is shown in
1225 Figure \refersec{fig:typesignatureexample2flattenedbytestream}.
1227 Running an \MDfive{} hash over this byte stream, and taking the
low\dash order 64 bits, yields the final signature: 0xd6d160f5 5589f6e9.
1231 A source file that includes this header file may declare a
variable of type \texttt{N::A}, and its DWARF information may look
like that shown in
1234 Figure \refersec{fig:typesignatureexampleusage}.
1245 \DWATtype : (signature) 0xd6d160f5 5589f6e9
1250 \caption{Type signature example usage}
1251 \label{fig:typesignatureexampleusage}
1254 \subsection{Type Signature Computation Grammar}
1255 \label{app:typesignaturecomputationgrammar}
1257 Figure \refersec{fig:typesignaturecomputationgrammar}
1258 \addtoindexx{type signature!computation grammar}
1259 presents a semi-formal grammar that may aid in understanding
1260 how the bytes of the flattened type description are formed
1261 during the type signature computation algorithm of
1262 Section \refersec{datarep:typesignaturecomputation}.
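
One point the grammar encodes is the choice between a \texttt{'T'} item, which embeds the flattened description of a type on first use, and an \texttt{'R'} item, which refers back to a type that has already been visited. The Python sketch below illustrates that choice only. The names \texttt{type\_reference} and \texttt{flatten\_stub} and the use of a dictionary for visit numbers are assumptions of the sketch, and the stub stands in for the real recursive flattening.

```python
DW_AT_type = 0x49


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def type_reference(visits: dict, type_id: str, flatten) -> bytes:
    """Emit 'R' plus a back-reference for a visited type, else 'T' plus its signature."""
    if type_id in visits:
        return b"R" + uleb128(DW_AT_type) + uleb128(visits[type_id])
    # Number the type before flattening it, so nested uses become back-references.
    visits[type_id] = len(visits) + 1
    return b"T" + uleb128(DW_AT_type) + flatten(type_id)


def flatten_stub(type_id: str) -> bytes:
    """Placeholder for the recursive flattening of the referenced type."""
    return b"<" + type_id.encode("utf-8") + b">"


visits = {"N::C": 1}                                   # the type being signed is type #1
first = type_reference(visits, "int", flatten_stub)    # member x: first use of int
second = type_reference(visits, "int", flatten_stub)   # member y: int already visited
print(first[:2].hex(), second.hex())
```

With \texttt{struct C} as type \#1, the first use of \texttt{int} makes it type \#2 and emits a \texttt{'T'} item, while the second use emits the three bytes \texttt{0x52 0x49 0x02}.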
1266 %FIXME: The index entries here with \addtoindexx are ineffective.
1269 : opt-context debug-entry attributes children
1270 opt-context // Step 2
1271 : 'C' tag-code string opt-context
1273 debug-entry // Step 3
1275 attributes // Steps 4, 5, 6
1276 : attribute attributes
1279 : 'A' at-code form-encoded-value // Normal attributes
1280 : 'N' at-code opt-context 'E' string // Reference to type by name
1281 : 'R' at-code back-ref // Back-reference to visited type
1282 : 'T' at-code signature // Recursive type
1287 : 'S' tag-code string
1294 : \DWFORMsdata value \addtoindexx{constant class}
1295 : \DWFORMflag value \addtoindexx{flag class}
1296 : \DWFORMstring string \addtoindexx{string class}
1297 : \DWFORMblock \nolink{block} \addtoindexx{block class}
1298 \DWFORMstring \addtoindexx{string class}
1300 \DWFORMblock \addtoindexx{block class}
1302 \DWFORMflag \addtoindexx{flag class}
1304 \DWFORMsdata \addtoindexx{constant class}
1309 : <ULEB128> <fixed-length-block> // The ULEB128 gives the length of the \nolink{block}
1313 : <null-terminated-string>
1318 \caption{Type signature computation grammar}
1319 \label{fig:typesignaturecomputationgrammar}
1323 \subsection{Declarations Completing Non-Defining Declarations}
1324 \label{app:declarationscompletingnondefiningdeclarations}
1325 Consider a compilation unit that contains a definition of the member
1326 function \texttt{N::A::v()} from
1327 Figure \refersec{fig:typesignatureexamplescsource}.
1328 A possible representation of the
1329 debug information for this function in the compilation unit is shown
1330 in Figure \refersec{fig:completingedeclarationofamemberfunctiondwarf}.
1340 \DWATdeclaration{} : true
1341 \DWATsignature{} : 0xd6d160f5 5589f6e9
1347 \DWATdeclline{} : 13
1348 \DWATtype{} : reference to L3
1349 \DWATdeclaration{} : 1
1350 \DWTAGformalparameter
1351 \DWATtype{} : reference to L4
1352 \DWATartificial{} : 1
1357 \DWATencoding{} : \DWATEsigned
1362 \DWATtype{} : reference to L1
1365 \DWATspecification{} : reference to L2
1367 \DWATdeclline{} : 25
1375 \caption{Completing declaration of a member function: DWARF \mbox{encoding}}
1376 \label{fig:completingedeclarationofamemberfunctiondwarf}
1381 \section{Summary of Compression Techniques}
1382 \label{app:summaryofcompressiontechniques}
1383 \subsection{\#include compression}
1384 \label{app:includecompression}
\addtoindex{C++} has a much greater
problem than
\addtoindex{C} with the number and
size of the headers included and the amount of data in each,
but even with \addtoindex{C}
there is substantial header file information
duplication.
1394 A reasonable approach is to put each header file in its own
1395 \addtoindex{section group}, using the naming rules mentioned above. The
1396 section groups are marked to ensure duplicate removal.
1398 All data instances and code instances (even if they came
1399 from the header files above) are put
1400 \addtoindexx{section group}
1401 into non-section group
1402 sections such as the base object file
1403 \dotdebuginfo{} section.
1405 \subsection{Eliminating function duplication}
1406 \label{app:eliminatingfunctionduplication}
Function templates (C++) result in code for the same template
instantiation being compiled into multiple archives or
relocatable object files. The linker needs to keep only one copy
of a given entity, so the DWARF description, and everything else
for this function, should be reduced to just a single copy.
For each such code group (function template in this example)
the compiler assigns a name for the group which will match
all other instantiations of this function but match nothing
else. The
\addtoindexx{section group}
section groups are marked to ensure duplicate
removal, so that the second and subsequent definitions seen
by the static linker are simply discarded.
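The effect of such marking can be modelled with a short Python
sketch. It is a simplified illustration, not a description of any
particular linker: the \texttt{SectionGroup} record, its fields, and
the \texttt{link\_groups} function are hypothetical names, and a real
linker operates on object file structures rather than on a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionGroup:
    name: str    # group name assigned by the compiler
    origin: str  # contributing object file (illustration only)


def link_groups(groups):
    """Keep the first group seen for each name; discard later ones."""
    kept = {}
    for group in groups:
        # setdefault leaves an existing entry untouched, so only the
        # first definition with a given name survives.
        kept.setdefault(group.name, group)
    return list(kept.values())


inputs = [
    SectionGroup("max<int>", "a.o"),
    SectionGroup("max<int>", "b.o"),     # duplicate instantiation
    SectionGroup("max<double>", "b.o"),
]
for group in link_groups(inputs):
    print(group.name, "from", group.origin)
```

In this model the instantiation contributed by \texttt{b.o} under the
name \texttt{max<int>} is dropped together with everything in its
group, including its DWARF description.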
References to other
\dotdebuginfo{} sections follow the approach
suggested above, but the naming rule is slightly
different in that the \texttt{\textless file-designator\textgreater}
should be interpreted as a \texttt{\textless function-designator\textgreater}.
1434 \subsection{Single-function-per-DWARF-compilation-unit}
1435 \label{app:singlefunctionperdwarfcompilationunit}
1437 Section groups can help make it easy for a linker to completely
1438 remove unused functions.
Such
\addtoindexx{section group}
section groups are not marked for duplicate removal,
since the functions are not duplicates of anything.
1445 Each function is given a compilation unit and a section
1446 group. Each such compilation unit is complete, with its own
1447 text, data, and DWARF sections.
There will also be a compilation unit that has the file\dash level
declarations and definitions. Other per\dash function compilation
unit DWARF information (\dotdebuginfo{}) points to this common
file\dash level compilation unit using
\DWTAGimportedunit.
1455 Section groups can use \DWFORMrefaddr{} and internal labels
1456 (section\dash relative relocations) to refer to the main object
1457 file sections, as the
1458 \addtoindexx{section group}
1459 section groups here are either deleted
1460 as unused or kept. There is no possibility (aside from error)
1461 of a group from some other compilation being used in place
1462 of one of these groups.
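One way to picture the removal of unused functions is as a
reachability walk over the per-function groups. The Python sketch
below is a simplified model under stated assumptions: the function
\texttt{live\_groups}, the reference map, and the group names are
hypothetical, and an actual linker would derive the references from
relocations and symbol tables.

```python
def live_groups(refs, roots):
    """Return the names of groups reachable from the root groups.

    refs maps a group name to the names of the groups it references.
    """
    live = set()
    pending = list(roots)
    while pending:
        name = pending.pop()
        if name in live:
            continue  # already visited
        live.add(name)
        pending.extend(refs.get(name, ()))
    return live


refs = {
    "main": ["helper"],
    "helper": [],
    "unused": ["helper"],  # nothing refers to this group
}
keep = live_groups(refs, ["main"])
dropped = sorted(set(refs) - keep)
print(sorted(keep), dropped)
```

Each group that is not reached is deleted as a whole, so its text,
data, and DWARF sections disappear together.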
1465 \subsection{Inlining and out-of-line-instances}
1466 \label{app:inliningandoutoflineinstances}
Abstract instances
\addtoindexx{abstract instance}
\addtoindexx{concrete out-of-line instance}
and concrete out-of-line instances may be
put in distinct compilation units using
\addtoindexx{section group}
section groups. This
makes possible some useful duplicate DWARF elimination.
\textit{No special provision for eliminating class duplication
resulting from template instantiation is made here, though
nothing prevents eliminating such duplicates using section
groups.}
1484 \subsection{Separate Type Units}
1485 \label{app:separatetypeunits}
1487 Each complete declaration of a globally-visible type can be
1488 \addtoindexx{type unit}
1489 placed in its own separate type section, with a group key
1490 derived from the type signature. The linker can then remove
1491 all duplicate type declarations based on the key.