I think your description of how ranges work is correct. I've made all your requested changes (I think). Attached is the latest patch for review. :-) -- Caroline cmtice@google.com gdb/ChangeLog: 2020-07-09 Caroline Tice * dwarf2/read.c (RNGLIST_HEADER_SIZE32) New constant definition. (RNGLIST_HEADER_SIZE64): New constant definition. (struct dwop_section_names): Add rnglists_dwo. (dwop_section_names): Add .debug_rnglists.dwo, .zdebug_rnglists.dwo. (struct dwo_sections): Add rnglists field. (read_attribut_reprocess): Add tag parameter. (dwarf2_ranges_read): Add tag parameter. (cu_debug_rnglists_section): New function (decl & definition). (dwarf2_locate_dwo_sections): Add code to read rnglists_dwo section. (dwarf2_rnglists_process): Add a dwarf_tag parameter, for the kind of die whose range is being checked; add code to read the rnglist section from the dwo file rather than from the main objfile, if appropriate. Add cases for DW_RLE_base_addressx, DW_RLE_startx_length, DW_RLE_startx_endx. Also, update to only add the base address to DW_RLE_offset_pairs (not to all ranges), moving test inside if-condition and updating complaint message. (dwarf2_ranges_process): Add dwarf tag parameter and pass it to dwarf2_rnglists_process. (dwarf2_ranges_read): Add dwarf tag parameter and pass it to dwarf2_ranges_process. (dwarf2_get_pc_bounds): Check for DW_FORM_rnglistx when setting need_ranges_base and update comment appropriately. Also pass die tag to dwarf2_ranges_read. (dwarf2_record_block_ranges): Check for DW_FORM_rnglistx when setting need_ranges_base and update comment appropriately. Also pass die tag to dwarf2_ranges_read. (read_full_die_1): Add code to read DW_AT_rnglists_base and assign to cu->ranges_base. Also pass die tag to read_attribute_reprocess. (partial_die_info::read): Check for DW_FORM_rnglistx when setting need_ranges_base and update comment appropriately. Also pass die tag to read_attribute_reprocess and dwarf2_ranges_read. (read_loclist_header): Rename function to read_loclists_rnglists_header, and update function comment appropriately. (read_loclist_index): Call read_loclists_rnglists_header instead of read_loclist_header. (read_rnglist_index): New function. (read_attribute_reprocess): Add tag parameter. Add code for DW_FORM_rnglistx, passing tag to read_rnglist_index. (read_attribute_value): Mark DW_FORM_rnglistx with need_reprocess. gdb/testsuite/ChangeLog: 2020-07-09 Caroline Tice * gdb.dwarf2/dw5-rnglist-test.cc: New file. * gdb.dwarf2/dw5-rnglist-test.exp: New file. On Fri, Jul 3, 2020 at 10:11 PM Simon Marchi wrote: > > On 2020-07-03 6:47 p.m., Caroline Tice wrote: > > " > > > > I have made most of your requested changes, which are in the attached > > patch. I disagree with one of your comments, and I had a question > > about another: > > > > First the disagreement. You said: > > > >>> @@ -18684,7 +18764,8 @@ partial_die_info::read (const struct die_reader_specs *reader, > >>> /* It would be nice to reuse dwarf2_get_pc_bounds here, > >>> but that requires a full DIE, so instead we just > >>> reimplement it. */ > >>> - int need_ranges_base = tag != DW_TAG_compile_unit; > >>> + int need_ranges_base = (tag != DW_TAG_compile_unit > >>> + && attr.form != DW_FORM_rnglistx); > >> > >> It looks like this fix is unrelated from the rest of the patch, which deals > >> about reading rnglists from dwo files. It's better to have on fix per patch. > >> So I think this fix would deserve its own patch, with a commit message that > >> clearly explains the issue and the fix. > >> > > This is actually a required fix for dealing with rnglists. Without > > it, when the DW_AT_ranges uses the form > > DW_FORM_rnglistx, the wrong value is calculated and the entire rnglist > > is ignored (if you turn on complaints this will happen). So without > > this fix, I can't test any of the other code for fixing rnglists, with > > or without .dwo files. > > Ah ok, I must have understood it wrong then. I find all of this really confusing. > > I'll summarize what I think I understand, please let me know if I say anything wrong. > We are dealing with two similar ways of dealing with address ranges, one is the > pre-standard GNU version here, which was using DW_AT_GNU_ranges_base: > > https://gcc.gnu.org/wiki/DebugFission > > and the version in DWARF 5. > > In the pre-standard version, only one range section was present, and it was in the > main linked file. DW_AT_ranges attributes in the dwo, the form DW_FORM_sec_offset, > are actually pointing at the section in the main file, at offset > > CU's DW_AT_GNU_ranges_base value (found in the skeleton, in the main file) + DIE's DW_AT_ranges > > If there is a DW_AT_ranges attribute in the skeleton CU, in the main linked file, > then it is absolute, it is _not_ added to the DW_AT_GNU_ranges_base value. Hence > the `needs_range_base` thingy. The `ranges_offset` variable in this function is > the direct offset to the range list to read. > > In DWARF 5, we have a ranges section in the main linked file and one in the dwo. DW_AT_ranges > attributes refer to the ranges section of the file they are in. The DW_AT_rnglists_base > points to the beginning of the range table for that CU. The actual value of DW_AT_ranges > attribute (when using the DW_FORM_rnglistx form) is an index in the offset table. However, > due to the reprocessing you added in this patch, the attribute value is now in fact the > offset with `DW_AT_rnglists_base` already added. And _that's_ the reason why > DW_FORM_rnglistx attributes should not have the range base applied. > > I think the comment above in the code is misleading and needs to be updated, this one: > > /* DW_AT_rnglists_base does not apply to DIEs from the DWO skeleton. > We take advantage of the fact that DW_AT_ranges does not appear > in DW_TAG_compile_unit of DWO files. */ > > In commit 18a8505e38fc ("Dwarf 5: Handle debug_str_offsets and indexed attributes > that have base offsets."), the it was changed like so: > > - /* DW_AT_ranges_base does not apply to DIEs from the DWO skeleton. > + /* DW_AT_rnglists_base does not apply to DIEs from the DWO skeleton. > > I think it was a wrong change, because it mixes two different things. DW_AT_rnglists_base > is a DWARF 5 thing, and the range base not applying to the values from the skeleton is a > pre-standard thing. I think it should say something like: > > /* DW_AT_GNU_ranges_base does not apply to DIEs from the DWO skeleton. > We take advantage of the fact that DW_AT_ranges does not appear > in DW_TAG_compile_unit of DWO files. > > Attributes of form DW_AT_rnglistx have already had their value changed > by read_rnglist_index and already include DW_AT_rnglists_base, so don't > add the ranges base either. */ > > > > >> I'm also not sure I understand it: when the form is DW_FORM_rnglistx, don't > >> we want to sometimes apply the base coming from DW_AT_rnglists_base? > > > > It all depends on what form is being used. If the DW_AT_ranges is > > using DW_FORM_sec_offset (which is what > > GCC generates), then you do need to add the base value.But if you > > are using the DW_FORM_rnglisx (which is what clang generates), the > > base value has already been considered/incorporated into the attribute > > so adding it in (again) causes the value to be incorrect (again, turn > > on complaints and see what happens if you try to add the base into > > something with DW_FORM_rnglistx). From the DWARF5 spec: "[A] rnglist > > is either represented as: An index into the .debug_rnglists section > > (DW_FORM_rnglistx). The ...operand identifies an offset location > > relative to the base of that section...[or] An offset into the > > .debug_rnglists section (DW_FORM_sec_offset). The operand consists of > > a byte offset from the beginning of the .debug_rnglists section." > > Note the DW_FORM_rnglistx is already relative to the base. > > Ok. What was not clear to me in the first pass was that read_rnglist_index > already has modified the attribute value to include DW_AT_rnglists_base. > > Now, about values of class rnglist and form DW_FORM_sec_offset in DWARF 5. > The spec says this about them: > > - An offset into the .debug_rnglists section (DW_FORM_sec_offset). > The operand consists of a byte offset from the beginning of the > .debug_rnglists section. It is relocatable in a relocatable object file, > and relocated in an executable or shared object file. In the 32-bit > DWARF format, this offset is a 4-byte unsigned value; in the 64-bit > DWARF format, it is an 8-byte unsigned value (see Section 7.4 on > page 196). > > Since it says that the value is already relocated in an executable, does it > mean we should not add the base for these, in DWARF 5? > > > Now, my question: I'm not quite clear as to whether you really want > > me to change the code in > > cu_debug_rnglists_section or not. I wrote the code mostly by copying > > the code for loclists as closely as possible, and making whatever > > changes to that were required to make it correct for rnglists. Is it > > better to have the code more consistent with the loclists code?If > > you really want me to make the changes there you suggest, then I will; > > I think the situation is a bit different than for `cu_debug_loc_section`: > when you have a dwo, I don't think there is the possibility of having an > attribute referring to a location list in the skeleton. So if you have a > dwo, you'll always want the loc section from the dwo. So there, the pattern > > if (there is a dwo) > return the dwo section; > > return the main file section; > > makes more sense. But in the case of range lists, we have attributes > in both the skeleton and the dwo that require range lists, and the right > section to return depends on where the attribute is. That's why, when I > see this pattern: > > if (there is a dwo) > return the dwo section; > > return the main file section; > > that pattern, I find it looks wrong, because the right section to return > does not depend (only) on whether there is dwo. As we saw, it only works > "by chance" because this is only called once for the skeleton (when > reading DW_AT_ranges) and the `cu->dwo_unit` pointer happens not to be set > up yet. It is also one refactor away from breaking, if for some reason we > reorder some code and the `cu->dwo_unit` pointer gets initialized earlier. > > > I just wanted confirmation that you really want that change (i.e. > > adding the dwarf tag and checking vs. DW_TAG_compile_unit, & calling > > it from dwarf2_rnglists_process). > > I think that checking DW_TAG_compile_unit is still a bit of a hack - ideally, > the caller could just tell us if the DIE comes from a dwo or not. But at > least it's logical and based on some DWARF knowledge. It would certainly > require a comment to explain it, because it wouldn't be obvious. > > > @@ -1379,11 +1382,16 @@ static void read_call_site_scope (struct die_info *die, struct dwarf2_cu *cu); > > static void read_variable (struct die_info *die, struct dwarf2_cu *cu); > > > > static int dwarf2_ranges_read (unsigned, CORE_ADDR *, CORE_ADDR *, > > - struct dwarf2_cu *, dwarf2_psymtab *); > > + struct dwarf2_cu *, dwarf2_psymtab *, > > + dwarf_tag); > > This declaration is unneeded and can simply be removed. Could you push an > obvious patch that does this?. > > > > @@ -13917,8 +13969,20 @@ dwarf2_rnglists_process (unsigned offset, struct dwarf2_cu *cu, > > if (range_beginning == range_end) > > continue; > > > > - range_beginning += *base; > > - range_end += *base; > > + /* Only DW_RLE_offset_pair needs the base address added. */ > > + if (rlet == DW_RLE_offset_pair) > > + { > > + if (!base.has_value ()) > > + { > > + /* We have no valid base address for the ranges > > + data. */ > > + complaint (_("Invalid .debug_rnglists data (no base address)")); > > The comment fits on a single line. But it can probably be changed to be more > specific to DW_RLE_offset_pair. Maybe the complaint could be more specific too, > and mention DW_RLE_offset_pair. It would be valid to have .debug_rnglists without > a base address (I think?), but here it's invalid because there is not base address > _and_ we have encountered a DW_RLE_offset_pair. > > > @@ -19094,6 +19167,57 @@ read_loclist_index (struct dwarf2_cu *cu, ULONGEST loclist_index) > > return bfd_get_64 (abfd, info_ptr) + loclist_base; > > } > > > > +/* Given a DW_FORM_rnglistx value rnglist_index, fetch the offset from the > > + array of offsets in the .debug_rnglists section. */ > > +static CORE_ADDR > > +read_rnglist_index (struct dwarf2_cu *cu, ULONGEST rnglist_index) > > +{ > > + struct dwarf2_per_objfile *dwarf2_per_objfile = cu->per_objfile; > > + struct objfile *objfile = dwarf2_per_objfile->objfile; > > + bfd *abfd = objfile->obfd; > > + ULONGEST rnglist_header_size = > > + (cu->header.initial_length_size == 4 ? LOCLIST_HEADER_SIZE32 > > + : LOCLIST_HEADER_SIZE64); > > Should this really refer to loclist header sizes? If they are the same > sizes as range list table headers, I suppose it's just a coincidence. > > > + ULONGEST rnglist_base = > > + (cu->dwo_unit) ? rnglist_header_size : cu->ranges_base; > > cu->dwo_unit != nullptr > > > + ULONGEST start_offset = > > + rnglist_base + rnglist_index * cu->header.offset_size; > > + > > + /* Get rnglists section. */ > > + struct dwarf2_section_info *section = cu_debug_rnglists_section (cu); > > + if (section == nullptr) > > + error (_("Cannot find .debug_rnglists section [in module %s]"), > > + objfile_name(objfile)); > > Space before paren. > > > + > > + /* Read the rnglists section content. */ > > + section->read (objfile); > > + if (section->buffer == NULL) > > In new code, I suggest using nullptr instead of NULL, just for consistency. > > > + error (_("DW_FORM_rnglistx used without .debug_rnglists section " > > + "[in module %s]"), > > + objfile_name (objfile)); > > + > > + /* Verify the rnglist header is valid (same as loclist header). */ > > + struct loclist_header header; > > + read_loclist_header (&header, section); > > Ok well, if we are going to take advantage that they are the same, the name > of the function and types should reflect it. `read_loclist_header` should > become `read_loclists_rnglists_header` (using the plural to match the section > names). > > > + if (rnglist_index >= header.offset_entry_count) > > + error (_("DW_FORM_rnglistx index pointing outside of " > > + ".debug_rnglists offset array [in module %s]"), > > + objfile_name(objfile)); > > Space before paren. > > > + > > + /* Validate that the offset is within the section's range. */ > > + if (start_offset >= section->size) > > + error (_("DW_FORM_rnglistx pointing outside of " > > + ".debug_rnglists section [in module %s]"), > > + objfile_name(objfile)); > > + > > + const gdb_byte *info_ptr = section->buffer + start_offset; > > + > > + if (cu->header.offset_size == 4) > > + return read_4_bytes (abfd, info_ptr) + rnglist_base; > > + else > > + return read_8_bytes (abfd, info_ptr) + rnglist_base; > > An idea for a subsequent cleanup, we could have a `read_offset` function > that does > > if (cu->header.offset_size == 4) > return read_4_bytes (abfd, info_ptr); > else > return read_8_bytes (abfd, info_ptr); > > There are a few spots that could use it. > > Simon