\label{chap:introduction}

This document defines a format for describing programs to
facilitate user source level debugging. This description
can be generated by compilers, assemblers and linkage
editors. It can be used by debuggers and other tools.
The debugging information format does not favor the design of any
compiler or debugger.
Instead, the goal is to create a method
of communicating an accurate picture of the source program
to any debugger in a form that is extensible to different
languages while retaining compatibility.

The design of the
debugging information format is open-ended, allowing for
the addition of new debugging information to accommodate new
languages or debugger capabilities while remaining compatible
with other languages or different debuggers.
\section{Purpose and Scope}
The debugging information format described in this document is
designed to meet the symbolic, source-level debugging needs of
different languages in a unified fashion by requiring language
independent debugging information whenever possible.
Aspects
of individual languages, such as \addtoindex{C++} virtual functions or
\addtoindex{Fortran} common
\nolink{blocks}, are accommodated by creating attributes
that are used only for those languages.
This document is
believed to cover most debugging information needs of
\addtoindex{C}, \addtoindex{C++}, \addtoindex{COBOL},
and \addtoindex{Fortran}; it also covers the basic needs
of various other languages.

This document describes \DWARFVersionV,
the fifth generation
of debugging information based on the DWARF format.
\DWARFVersionV{} extends \DWARFVersionIV{}
in a compatible manner.

The intended audience for this document is the developers
of both producers and consumers of debugging information,
typically compilers, debuggers and other tools that need to
interpret a binary program in terms of its original source.

There are two major pieces to the description of the DWARF
format in this document. The first piece is the informational
content of the debugging entries. The second piece is the
way the debugging information is encoded and represented in
an object file.

The informational content is described in Chapters
\ref{chap:generaldescription} through
\ref{chap:otherdebugginginformation}. Chapter
\ref{chap:generaldescription}
describes the overall structure of the information
and attributes that are common to many or all of the different
debugging information entries. Chapters
\ref{chap:programscopeentries},
\ref{chap:dataobjectandobjectlistentries} and
\ref{chap:typeentries} describe
the specific debugging information entries and how they
communicate the necessary information about the source program
to a debugger. Chapter \ref{chap:otherdebugginginformation}
describes debugging information
contained outside of the debugging information entries. The
encoding of the DWARF information is presented in Chapter
\ref{datarep:datarepresentation}.

This organization closely follows that used in the
\DWARFVersionIV{} document. Except where needed to incorporate
new material or to correct errors, the \DWARFVersionIV{}
text is generally reused in this document with little or
no modification.

In the following sections, text in normal font describes
required aspects of the DWARF format. Text in \textit{italics} is
explanatory or supplementary material, and not part of the
format definition itself. The several appendices consist only
of explanatory or supplementary material, and are not part
of the formal definition.
\section{Objectives and Rationale}

DWARF has had a set of objectives since its inception which have
guided the design and evolution of the debugging format. A discussion
of these objectives and the rationale behind them may help with an
understanding of the DWARF Debugging Format.

Although DWARF Version 1 was developed in the late 1980s as a
format to support debugging C programs written for AT\&T hardware
running SVR4, \DWARFVersionII{} and later have evolved far beyond
this origin. One difference between DWARF and other formats
is that the latter are often specific to a particular language,
architecture, and/or operating system.
\subsection{Language Independence}
DWARF is applicable to a broad range of existing procedural
languages and is designed to be extensible to future languages.
These languages may be considered to be \doublequote{C-like} but the
characteristics of C are not incorporated into DWARF Version 2
and later, unlike DWARF Version 1 and other debugging formats.
DWARF abstracts concepts as much as possible so that the
description can be used to describe a program in any language.
As an example, the DWARF descriptions used to describe C functions,
Pascal subroutines, and Fortran subprograms are all the same,
with different attributes used to specify the differences between
these similar programming language features.

On occasion, there is a feature which is specific to one
particular language and which doesn't appear to have more
general application. For these, DWARF has a description
designed to meet the language requirements, although, to the
extent possible, an effort is made to generalize the attribute.
An example of this is the \DWTAGconditionNAME{}
debugging information entry,
used to describe \addtoindex{COBOL} level 88 conditions, which
is described in abstract terms rather than COBOL-specific terms.
Conceivably, this TAG might be used with a different language
which had similar functionality.
\subsection{Architecture Independence}
DWARF can be used with a wide range of processor architectures,
whether byte or word oriented, linear or segmented, with any
word or byte size. DWARF can be used with Von Neumann architectures,
using a single address space for both code and data; Harvard
architectures, with separate code and data address spaces; and
potentially for other architectures such as DSPs with their
idiosyncratic memory organizations. DWARF can be used with
common register-oriented architectures or with stack architectures.

DWARF assumes that memory has individual units (words or bytes)
which have unique addresses which are ordered. (Some architectures
like the i386 can represent the same physical machine location with
different segment and offset pairs. Identifying aliases is an
implementation issue.)
\subsection{Operating System Independence}
DWARF is widely associated with SVR4 Unix and similar operating
systems like BSD and Linux. DWARF fits well with the section
organization of the ELF object file format. Nonetheless, DWARF
attempts to be independent of either the OS or the object file
format. There have been implementations of DWARF debugging
data in COFF, Mach-O and other object file formats.

DWARF assumes that any object file format will be able to
distinguish the various DWARF data sections in some fashion.

DWARF makes a few assumptions about functionality provided by
the underlying operating system. DWARF data sections can be
read sequentially and independently.
Each DWARF data section is a sequence of 8-bit bytes,
numbered starting with zero. The presence of offsets from one
DWARF data section into other data sections does not imply that
the underlying OS must be able to position files randomly; a
data section could be read sequentially and indexed using the offset.
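
\textit{As an informal illustration of the model described above, the following
C sketch treats a data section that has already been read sequentially into
memory as an array of bytes numbered from zero, and uses an offset taken from
another section as a plain index. The names \texttt{struct section} and
\texttt{string\_at} are hypothetical and not part of DWARF, and the sketch
assumes that the referenced section holds null-terminated strings.}

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one DWARF data section after it has been read
 * sequentially into memory: a sequence of 8-bit bytes numbered from 0. */
struct section {
    const uint8_t *data;  /* byte at offset 0 */
    size_t size;          /* number of bytes in the section */
};

/* Return the null-terminated string that starts at `offset` in `strs`,
 * or NULL if the offset is out of range or the string is unterminated.
 * No file positioning is needed: the offset is only an array index. */
static const char *string_at(const struct section *strs, uint64_t offset)
{
    assert(strs != NULL);
    if (offset >= strs->size)
        return NULL;
    const uint8_t *start = strs->data + offset;
    size_t remaining = strs->size - (size_t)offset;
    if (memchr(start, '\0', remaining) == NULL)
        return NULL;  /* ran off the end of the section */
    return (const char *)start;
}
```

\textit{A consumer written this way never needs to seek within the object file;
it only needs the section contents and the offset.}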
\subsection{Compact Data Representation}
The DWARF description is designed to be a compact file-oriented
representation.

There are several encodings which achieve this goal, such as the
TAG and attribute abbreviations or the line number encoding.
References from one section to another, especially to refer to
strings, allow these sections to be compacted to eliminate
duplicate data.

There are multiple schemes for eliminating duplicate data or
reducing the size of the DWARF debug data associated with a
given file. These include COMDAT, used to eliminate duplicate
function or data definitions, the split DWARF object files
which allow a consumer to find DWARF data in files other than
the executable, or the type units, which allow similar type
definitions from multiple compilations to be combined.

In most cases, it is anticipated that DWARF
debug data will be read by a consumer (usually a debugger) and
converted into a more efficiently accessed internal representation.
For the most part, the DWARF data in a section is not the same as
this internal representation.
\subsection{Efficient Processing}
DWARF is designed to be processed efficiently, so that a
producer (a compiler) can generate the debug descriptions
incrementally and a consumer can read only the descriptions
which it needs at a given time. The data formats are designed
to be efficiently interpreted by a consumer.

As mentioned, there is a tension between this objective and
the preceding one. A DWARF data representation which resembles
an internal data representation may lead to faster processing,
but at the expense of larger data files. This may also constrain
the possible implementations.
\subsection{Implementation Independence}
DWARF attempts to allow developers the greatest flexibility
in designing implementations, without mandating any particular
design decisions. Issues which can be described as
quality-of-implementation are avoided.
\subsection{Explicit Rather Than Implicit Description}
DWARF describes the source to object translation explicitly
rather than using common practice or convention as an implicit
understanding between producer and consumer. For example, where
other debugging formats assume that a debugger knows how to
virtually unwind the stack, moving from one stack frame to the next using
implicit knowledge about the architecture or operating system,
DWARF makes this explicit in the Call Frame Information description.
\subsection{Avoid Duplication of Information}
DWARF has a goal of describing characteristics of a program once,
rather than repeating the same information multiple times. The
string sections can be compacted to eliminate duplicate strings,
for example. Other compaction schemes or references between
sections support this. Whether a particular implementation is
effective at eliminating duplicate data, or even attempts to,
is a quality-of-implementation issue.
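
\textit{One way a producer might eliminate duplicate strings is sketched below
in C: each distinct string is stored once in a pool, and every use refers to
it by its offset. This is an illustration under stated assumptions, not a
required design. The names \texttt{struct string\_pool} and
\texttt{pool\_intern}, the fixed capacity, and the linear search are all
choices made for this sketch only.}

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_CAPACITY 4096  /* arbitrary size chosen for this sketch */

/* Hypothetical string pool: null-terminated strings stored back to back. */
struct string_pool {
    char bytes[POOL_CAPACITY];
    size_t used;  /* number of bytes currently occupied */
};

/* Return the offset of `s` in the pool, appending it only if it is not
 * already present. Returns (size_t)-1 if the pool is full. */
static size_t pool_intern(struct string_pool *pool, const char *s)
{
    assert(pool != NULL && s != NULL);
    size_t len = strlen(s) + 1;  /* include the terminating null byte */

    /* Walk the existing entries looking for an identical string. */
    for (size_t off = 0; off < pool->used;
         off += strlen(pool->bytes + off) + 1) {
        if (strcmp(pool->bytes + off, s) == 0)
            return off;  /* duplicate: reuse the existing copy */
    }

    if (len > POOL_CAPACITY - pool->used)
        return (size_t)-1;
    size_t off = pool->used;
    memcpy(pool->bytes + off, s, len);
    pool->used += len;
    return off;
}
```

\textit{A real producer or linker would likely use a hash table rather than a
linear scan; how, or whether, duplicates are removed remains a
quality-of-implementation matter.}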
\subsection{Leverage Other Standards}
Where another standard exists which describes how to interpret
aspects of a program, DWARF defers to that standard rather than
attempting to duplicate the description. For example, C++ has
specific rules for deciding which function to call depending
on name, scope, argument types, and other factors. DWARF describes
the functions and arguments, but doesn't attempt to describe
how one would be selected by a consumer performing any particular
operation.
\subsection{Limited Dependence on Tools}
DWARF data is designed so that it can be processed by commonly
available assemblers, linkers, and other support programs,
without requiring additional functionality specifically to
support DWARF data. This may require the implementer to be
careful that they do not generate DWARF data which cannot be
processed by these programs. Conversely, an assembler which
can generate LEB128 (Little-Endian Base 128)
values may allow the compiler to generate
more compact descriptions, and a linker which understands the
format of string sections can merge these sections. Whether
or not an implementation includes these functions is a
quality-of-implementation issue, not mandated by the DWARF
standard.
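
\textit{To make the compactness benefit of LEB128 concrete, the following C
sketch encodes and decodes the unsigned variant: each byte carries seven bits
of the value, low-order bits first, and the high bit of a byte is set when
more bytes follow. The function names \texttt{uleb128\_encode} and
\texttt{uleb128\_decode} are hypothetical, and the sketch handles only values
that fit in 64 bits.}

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode `value` as unsigned LEB128 into `out` and return the number of
 * bytes written. `out` must have room for 10 bytes, which is enough for
 * any 64-bit value. */
static size_t uleb128_encode(uint64_t value, uint8_t *out)
{
    size_t n = 0;
    assert(out != NULL);
    do {
        uint8_t byte = value & 0x7f;  /* low-order seven bits */
        value >>= 7;
        if (value != 0)
            byte |= 0x80;             /* high bit: more bytes follow */
        out[n++] = byte;
    } while (value != 0);
    return n;
}

/* Decode an unsigned LEB128 value from at most `len` bytes of `in`.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than 10 bytes. Bits beyond 64 are silently discarded. */
static size_t uleb128_decode(const uint8_t *in, size_t len, uint64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;
    assert(in != NULL && value != NULL);
    for (size_t i = 0; i < len; i++) {
        if (shift >= 64)
            return 0;
        result |= (uint64_t)(in[i] & 0x7f) << shift;
        shift += 7;
        if ((in[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}
```

\textit{With this encoding the value 624485 occupies three bytes
(\texttt{0xe5 0x8e 0x26}) rather than the four or eight bytes of a fixed-size
integer.}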
\subsection{Separate Description From Implementation}
DWARF intends to describe the translation of a program from
source to object, while neither mandating any particular design
nor making any other design difficult. For example, DWARF
describes how the arguments and local variables in a function
are to be described, but doesn't specify how this data is
collected or organized by a producer. Where a particular DWARF
feature anticipates that it will be implemented in a certain
fashion, informative text will suggest but not require this design.
\subsection{Permissive Rather Than Prescriptive}
The DWARF Standard specifies the meaning of DWARF descriptions. It does not
specify in detail what a particular producer should
generate for any source to
object conversion. One producer may generate a more complete description
than another, it may describe features in a different order (unless the
standard explicitly requires a particular order), or it may use
different abbreviations or compression methods. Similarly, DWARF does not
specify exactly what a particular consumer should do with each part of the
description, although we believe that the potential uses for each description
should be evident.

DWARF is permissive, allowing different producers to generate different
descriptions for the same source to object conversion, and permitting
different consumers to provide more or less functionality or information
to the user. This may result in debugging information being larger or
smaller, compilers or debuggers which are faster or slower, and more or
less functional. These are described as differences in \doublequote{Quality of
Implementation}.

Each producer conforming to the DWARF standard must follow the format and
meaning as specified in the standard. As long as the DWARF description
generated follows this specification, the producer is generating valid DWARF.
For example, DWARF allows a producer to identify the end of a function
prologue in the Line Information so that a debugger can stop at this location.
A producer which does this is generating valid DWARF, as is another which
doesn't. As another example, one producer may generate descriptions
for variables which are moved from memory to a register in a certain range,
while another may only describe the variable's location in memory. Both are
valid DWARF descriptions, while a consumer using the former would be able
to provide more accurate values for the variable while executing in that
range than a consumer using the latter.

In this document, where the word \doublequote{may} is used, the producer has
the option to follow the description or not. Where the text says
\doublequote{may not}, this is prohibited. Where the text says \doublequote{should},
this is advice about best practice, but is not a requirement.
\subsection{Vendor Extensibility}
This document does not attempt to cover all interesting
languages or even to cover all of the possible debugging
information needs for its primary target languages.
Therefore,
the document provides vendors a way to define their own
debugging information tags, attributes, base type encodings,
location operations, language names, calling conventions and
call frame instructions by reserving a subset of the valid
values for these constructs for vendor specific additions
and defining related naming conventions.
Vendors may also use
debugging information entries and attributes defined here in
new situations.
Future versions of this document will not use
names or values reserved for vendor specific additions.
All names and values not reserved for vendor additions, however,
are reserved for future versions of this document.

Where this specification provides a means for
describing the source language, implementors are expected
to adhere to that specification.
For language features that
are not supported, implementors may use existing attributes
in novel ways or add vendor-defined attributes.
Implementors
who make extensions are strongly encouraged to design them
to be compatible with this specification in the absence of
those extensions.

The DWARF format is organized so that a consumer can skip over
data which it does not recognize. This may allow a consumer
to read and process files generated according to a later
version of this standard or which contain vendor extensions,
albeit possibly in a degraded manner.
\section{Changes From Version 4 to Version 5}
\addtoindexx{DWARF Version 5}
The following is a list of the major changes made to the
DWARF Debugging Information Format since Version 4 was published.
The list is not meant to be exhaustive.
\begin{itemize}
\item The \dotdebugtypes{}
%\addtoindexi{\texttt{.debug\_types}}{\texttt{.debug\_types} (Version 4)}
section introduced in \DWARFVersionIV{}
is eliminated and its contents are instead contained in \dotdebuginfo{} sections.
\item Add support for collecting common DWARF information
(debugging information entries and macro definitions)
across multiple executable and shared files and keeping it in a single
\addtoindex{supplementary object file}.
\item A new line number program header format
provides the ability to use an MD5 hash to validate
the source file version in use, allows pooling
of directory and file name strings and makes provision for vendor-defined
extensions. It also adds a string section specific to the line number table
to properly support the common practice of stripping all DWARF sections
except for line number information.
\item Add split object file and package representations to allow most
DWARF information to be kept separate from an executable
or shared image. This includes new sections
\dotdebugaddr, \dotdebugstroffsets, \dotdebugabbrevdwo, \dotdebuginfodwo,
\dotdebuglinedwo, \dotdebugloclistsdwo, \dotdebugmacrodwo, \dotdebugstrdwo,
\dotdebugstroffsetsdwo, \dotdebugcuindex{} and \dotdebugtuindex{}
together with new forms of attribute value for referencing these sections.
This enhances DWARF support by reducing executable program size and
by improving link times.
\item Replace the \dotdebugmacinfo{} macro information representation
with a \dotdebugmacro{} representation that can potentially be much more compact.
\item Replace the \dotdebugpubnames{} and \dotdebugpubtypes{} sections
with a single and more functional name index section, \dotdebugnames{}.
\item Replace the location list and range list sections (\texttt{.debug\_loc}
and \texttt{.debug\_ranges}, respectively) with new sections (\dotdebugloclists{}
and \dotdebugrnglists) and new representations that
save space and processing time by eliminating most related
object file relocations.
\item Add a new debugging information entry (\DWTAGcallsiteNAME), related
attributes and DWARF expression operators to describe call site information,
including identification of tail calls and tail recursion.
\item Add improved support for \addtoindex{Fortran} assumed rank arrays
(\DWTAGgenericsubrangeNAME), dynamic rank arrays (\DWATrankNAME)
and co-arrays (\DWTAGcoarraytypeNAME{}).
\item Add new operations that allow support for
a DWARF expression stack containing typed values.
\item Add improved support for \addtoindex{C++}:
\texttt{auto} return type, deleted member functions (\DWATdeletedNAME),
as well as defaulted constructors and destructors (\DWATdefaultedNAME).
\item Add a new attribute (\DWATnoreturnNAME{}) to identify
a subprogram that does not return to its caller.
\item Add language codes for C 2011, C++ 2003, C++ 2011, C++ 2014,
Dylan, Fortran 2003, Fortran 2008, Go, Haskell,
Julia, Modula-3, OCaml, OpenCL, Rust and Swift.
\item Numerous other more minor additions to improve functionality
and performance.
\end{itemize}
DWARF Version 5 is compatible with DWARF Version 4 except as follows:
\begin{itemize}
\item The compilation unit header (in the \dotdebuginfo{} section) has
a new \HFNunittype{} field.
\item New operand forms for attribute values are defined
(\DWFORMaddrxNAME, \DWFORMdatasixteenNAME, \DWFORMimplicitconstNAME,
\DWFORMloclistxNAME, \DWFORMrnglistxNAME,
\DWFORMrefsupNAME, \DWFORMstrpsupNAME{} and \DWFORMstrxNAME).

\textit{Because a pre-DWARF Version 5 consumer will not be able to interpret
these even to ignore and skip over them, new forms must be
considered incompatible additions.}
\item The line number table header is substantially revised.
\item A location list entry
with the address range \mbox{(0, \textit{maximum-address})} is defined
as the new default location list entry.
\item In a string type, the \DWATbytesizeNAME{} attribute is re-defined
to always describe the size of the string type.
(Previously it described the size of the optional string length data
field if the \DWATstringlengthNAME{} attribute was also present.)
\end{itemize}
While not strictly an incompatibility, the macro information
representation is completely new; further, producers
and consumers may optionally continue to support the older
representation. While the two representations cannot both be
used in the same compilation unit, they can co-exist in
executable or shared images.

Similar comments apply to replacement of the \dotdebugpubnames{}
and \dotdebugpubtypes{} sections with the new \dotdebugnames{}
section.
\section{Changes from Version 3 to Version 4}
\addtoindexx{DWARF Version 4}
The following is a list of the major changes made to the
DWARF Debugging Information Format since Version 3 was
published. The list is not meant to be exhaustive.
\begin{itemize}
\item Reformulate
Section 2.6 (Location Descriptions)
to better distinguish DWARF location descriptions, which
compute the location where a value is found (such as an
address in memory or a register name) from DWARF expressions,
which compute a final value (such as an array bound).
\item Add support for bundled instructions on machine architectures
where instructions do not occupy a whole number of bytes.
\item Add a new attribute form for section offsets,
\DWFORMsecoffsetNAME,\addtoindexx{section offset}
to replace the use of
\DWFORMdatafourNAME{} and \DWFORMdataeightNAME{} for section offsets.
\item Add an attribute, \DWATmainsubprogramNAME, to identify the main subprogram of a
program.
\item Define default array lower bound values for each supported language.
\item Add a new technique using separate type units, type signatures and \COMDAT{} sections to
improve compression and duplicate elimination of DWARF information.
\item Add support for new \addtoindex{C++} language constructs, including rvalue references, generalized
constant expressions, Unicode character types and template aliases.
\item Clarify and generalize support for packed arrays and structures.
\item Add new line number table support to facilitate profile based compiler optimization.
\item Add additional support for template parameters in instantiations.
\item Add support for strongly typed enumerations in languages (such as \addtoindex{C++}) that have two
kinds of enumeration declarations.
\end{itemize}
\addtoindex{DWARF Version 4} is compatible with
\addtoindex{DWARF Version 3} except as follows:
\begin{itemize}
\item DWARF attributes that use any of the new forms of attribute value representation (for
section offsets, flag compression, type signature references, and so on) cannot be read by
\addtoindex{DWARF Version 3}
consumers because the consumer will not know how to skip over the
unexpected form of data.
\item DWARF frame and line number table sections include additional fields that affect the location
and interpretation of other data in the section.
\end{itemize}
\section{Changes from Version 2 to Version 3}
\addtoindexx{DWARF Version 3}
The following is a list of the major differences between
Version 2 and Version 3 of the DWARF Debugging Information
Format. The list is not meant to be exhaustive.
\begin{itemize}
\item
Make provision for DWARF information files that are larger
than 4 GBytes.
\item
Allow attributes to refer to debugging information entries
in other shared libraries.
\item
Add support for \addtoindex{Fortran 90} modules as well as allocatable
array and pointer types.
\item
Add additional base types for \addtoindex{C} (as revised for 1999).
\item
Add support for \addtoindex{Java} and \addtoindex{COBOL}.
\item
Add namespace support for \addtoindex{C++}.
\item
Add an optional section for global type names (similar to
the global section for objects and functions).
\item
Adopt \addtoindex{UTF-8} as the preferred representation of program name strings.
\item
Add improved support for optimized code (discontiguous
scopes, end of prologue determination, multiple section
code placement).
\item Improve the ability to eliminate
duplicate DWARF information during linking.
\end{itemize}
\addtoindex{DWARF Version 3}
is compatible with
\addtoindex{DWARF Version 2} except as follows:
\begin{itemize}
\item
Certain very large values of the initial length fields that
begin DWARF sections as well as certain structures are reserved
to act as escape codes for future extension; one such extension
is defined to increase the possible size of DWARF descriptions
(see Section \refersec{datarep:32bitand64bitdwarfformats}).
\item
References that use the attribute form
are specified to be four bytes in the DWARF 32-bit format and
eight bytes in the DWARF 64-bit format, while
\addtoindex{DWARF Version 2}
specifies that such references have the same size as an
address on the target system (see Sections
\refersec{datarep:32bitand64bitdwarfformats} and
\refersec{datarep:attributeencodings}).
\item
The return\_address\_register field in a Common Information
Entry record for call frame information is changed to unsigned
LEB128 representation (see Section
\refersec{chap:structureofcallframeinformation}).
\end{itemize}
\section{Changes from Version 1 to Version 2}
\addtoindex{DWARF Version 2}
describes the second generation of debugging
information based on the DWARF format. While
\addtoindex{DWARF Version 2}
provides new debugging information not available in
Version 1, the primary focus of the changes for Version
2 is the representation of the information, rather than
the information content itself. The basic structure of
the Version 2 format remains as in Version 1: the debugging
information is represented as a series of debugging information
entries, each containing one or more attributes (name/value
pairs). The Version 2 representation, however, is much more
compact than the Version 1 representation. In some cases,
this greater density has been achieved at the expense of
additional complexity or greater difficulty in producing and
processing the DWARF information. The definers believe that the
reduction in I/O and in memory paging should more than make
up for any increase in processing time.

The representation
of information changed from Version 1 to Version 2, so that
Version 2 DWARF information is not binary compatible with
Version 1 information. To make it easier for consumers to
support both Version 1 and Version 2 DWARF information, the
Version 2 information has been moved to a different object
file section, \dotdebuginfo{}.

A summary of the major changes made in
\addtoindex{DWARF Version 2}
compared to DWARF Version 1 may be found in the
\addtoindex{DWARF Version 2}