Because there is no instruction that moves just half of a 64-bit FPR
to/from a GPR, we need to use scratch memory for this operation. This
code can theoretically use:
1. DMI_DATA, if it is memory mapped in the target.
2. DMI_PROGBUF, if it is writable in the target.
3. A user-configured address.
I have only tested this code very lightly. One reason is that gdb thinks
that on RV32 harts every register is 32 bits wide. Another is that this
is mostly proof-of-concept to satisfy the small program buffer code
review, which I don't want to drag out forever.
Existing tests don't realize that floating support was broken with
RV32D, and don't realize that it still doesn't work because of the gdb
problem mentioned above.
This change improves Issue #110 but there's more work to be done.
Change-Id: I99b8a36e5fea26f1d9e16e36cf99adc7be26b944