Why am I getting a precise bus fault exception (PRECISERR) on what looks like a perfectly fine aligned access (Cortex-M7)?

I'm getting a HardFault that is forced (escalated) from a precise bus fault, as indicated by the PRECISERR bit in the BFSR register, and I can't figure out why it is occurring. The exception occurs within vendor-supplied startup code that previously executed fine, and I can't see any alignment or memory-related issues.
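For anyone who wants to reproduce the decoding, here is a minimal host-side sketch of how I read these bits. It assumes the ARMv7-M layout, where BFSR is bits [15:8] of CFSR and FORCED is bit 30 of HFSR. The struct and function names are my own, not from the vendor code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions per the ARMv7-M fault status registers.
 * BFSR occupies byte 1 of CFSR, so BFSR bit n is CFSR bit n + 8. */
#define CFSR_PRECISERR   (1u << 9)   /* BFSR bit 1: precise data bus error */
#define CFSR_IMPRECISERR (1u << 10)  /* BFSR bit 2: imprecise data bus error */
#define CFSR_BFARVALID   (1u << 15)  /* BFSR bit 7: BFAR holds the fault address */
#define HFSR_FORCED      (1u << 30)  /* configurable fault escalated to HardFault */

/* Hypothetical helper type, for illustration only. */
typedef struct {
    bool forced;
    bool precise;
    bool imprecise;
    bool bfar_valid;
} bus_fault_info;

/* Decode raw register values. On target these would be read from
 * SCB->CFSR and SCB->HFSR (CMSIS names) inside the HardFault handler. */
static bus_fault_info decode_bus_fault(uint32_t cfsr, uint32_t hfsr)
{
    bus_fault_info info;
    info.forced     = (hfsr & HFSR_FORCED) != 0;
    info.precise    = (cfsr & CFSR_PRECISERR) != 0;
    info.imprecise  = (cfsr & CFSR_IMPRECISERR) != 0;
    info.bfar_valid = (cfsr & CFSR_BFARVALID) != 0;
    return info;
}
```

As an illustrative call (these are not my actual register values), decode_bus_fault(0x00008200u, 0x40000000u) reports a forced, precise fault with a valid BFAR.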

The offending instruction is ldrlt r0, [r1], #4 on the first iteration through the loop, where r1 holds 0x00040458.

The full instruction sequence is shown below; the values of the symbols loaded into r1, r2 and r3 are given in the comments.

/*     Loop to copy data from read only memory to RAM. The ranges
 *      of copy from/to are specified by following symbols evaluated in
 *      linker script.
 *      __etext: End of code section, i.e., begin of data sections to copy from.
 *      __data_start__/__data_end__: RAM address range that data should be
 *      copied to. Both must be aligned to 4 bytes boundary.
 *      __noncachedata_start__/__noncachedata_end__: non-cacheable region.  */

    ldr    r1, =__etext          /* equal to 0x00040458 */
    ldr    r2, =__data_start__   /* equal to 0x20000000 */
    ldr    r3, =__data_end__     /* equal to 0x20000224 */

.LC0:
    cmp     r2, r3
    ittt    lt
    ldrlt   r0, [r1], #4  /* <---- exception triggered here */
    strlt   r0, [r2], #4
    blt    .LC0

The offending address listed in BFAR is 0x00040458, which corresponds to the value in r1 and is a perfectly valid 32-bit aligned address within the ITCM region (0x0 --> 0x0007FFFF).
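The two checks I did by hand can be sketched as follows. This is a small host-side snippet, assuming the ITCM range quoted above; the macro and function names are hypothetical and not part of the vendor code.

```c
#include <stdbool.h>
#include <stdint.h>

/* ITCM range as I understand it for this part (assumption from the text). */
#define ITCM_BASE 0x00000000u
#define ITCM_LAST 0x0007FFFFu

/* True when the two low address bits are clear, i.e. 4-byte aligned. */
static bool is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0u;
}

/* True when addr lies in [ITCM_BASE, ITCM_LAST]. The unsigned
 * subtraction avoids a pointless "addr >= 0" comparison. */
static bool in_itcm(uint32_t addr)
{
    return (addr - ITCM_BASE) <= (ITCM_LAST - ITCM_BASE);
}
```

Both is_word_aligned(0x00040458u) and in_itcm(0x00040458u) evaluate to true, which is why the access looks valid to me.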

I'm not sure what else could be causing this exception if the memory access itself looks fine. The exception was introduced by expanding the region for the .text section (m_text) in my linker file, as shown below:

MEMORY
{
  m_interrupts  (RX)  : ORIGIN = 0x00000000, LENGTH = 0x00000400
  m_text        (RX)  : ORIGIN = 0x00000400, LENGTH = 0x00074000  /* changed from LENGTH = 0x0003FC00 */
  m_data        (RW)  : ORIGIN = 0x20000000, LENGTH = 0x00020000
  m_data2       (RW)  : ORIGIN = 0x20200000, LENGTH = 0x00020000
}
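To double-check the arithmetic on the m_text change, here is a short host-side sketch that computes where each version of the region ends. The type and function names are mine, for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* A MEMORY region described by its ORIGIN and LENGTH (hypothetical type). */
typedef struct {
    uint32_t origin;
    uint32_t length;
} mem_region;

/* Exclusive end address: the first byte past the region. */
static uint32_t region_end(mem_region r)
{
    return r.origin + r.length;
}

/* True when addr falls inside [origin, origin + length). */
static bool region_contains(mem_region r, uint32_t addr)
{
    return (addr - r.origin) < r.length;
}

/* m_text before and after the change in the linker file. */
static const mem_region m_text_old = { 0x00000400u, 0x0003FC00u };
static const mem_region m_text_new = { 0x00000400u, 0x00074000u };
```

With these values, region_end(m_text_old) is 0x00040000 and region_end(m_text_new) is 0x00074400, so the faulting address 0x00040458 sits inside the new m_text but 0x458 bytes past the end of the old one. I don't know whether that is significant.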

If it isn't an alignment issue, I'm not sure what it could be. 0x00040458 is definitely word-aligned, as is 0x0004045C, which results from the #4 offset in the ldr instruction.

Also, why is 0x0004045C not shown in BFAR? As I read the Cortex-M7 TRM, the ldr instruction applies the offset to the base register value before the memory access occurs.

Full exception registers shown below for completeness

SCB exception regs