Some interesting reading about routine calls in ARM: http://infocenter.arm.com/help/topic/co ... /ATPCS.pdf
I haven't finished it yet, but as far as I understand, the job of preserving the SP and the registers is left to the called routine, not the calling one.
From the loader's point of view, the called routine is the startup assembler code that the linker script places at the start of the app. Since we already change the SP before calling it, I think we need to save the loader's state ourselves before the call.
R0-R3 do not need to be preserved by the called routine. So we need to do this from the loader:
1. Save R4-R11 to the stack
2. Save SP to a known IRAM position (just a pointer type variable should work).
3. Save a return address somewhere (exRAM or IRAM?).
4. Change SP
5. Call App
6. (The return address points to this instruction.) Load SP from the variable saved in step 2.
7. Pop R4-R11 from stack.
8. Reconfigure NVIC VTOR.
9. Enable interrupts
From the app, to return, we have to:
1. Disable interrupts.
2. Read the address saved in step 3 above.
3. Load it into the PC, which takes execution to step 6.
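To make the flow of the two lists easier to follow, here is a rough host-side analogy in C using setjmp/longjmp. It is only a sketch, not loader code: setjmp stands in for loader steps 1-3 (it saves the callee-saved registers, the SP and a resume point), and longjmp stands in for the app loading the saved address into the PC. All the names (loader_context, loader_run_app, app_return_to_loader, demo_app, loader_irqs_enabled) are made up for illustration. A real loader would also switch the SP before the call, keep the saved context at an agreed IRAM address that the app never overwrites, and reprogram VTOR on return.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical names throughout: a host-side analogy, not real loader code. */
static jmp_buf loader_context;      /* stands in for saved R4-R11, SP and return address */
static volatile int app_exit_code;  /* value the app hands back to the loader */
static int loader_irqs_enabled = 1; /* stands in for the global interrupt enable state */

/* App side: disable interrupts, then jump back through the saved context. */
void app_return_to_loader(int exit_code)
{
    loader_irqs_enabled = 0;        /* app step 1: disable interrupts */
    app_exit_code = exit_code;
    longjmp(loader_context, 1);     /* app steps 2-3: load saved context, "set the PC" */
}

/* Stand-in for the app entry point; it never returns normally. */
void demo_app(void)
{
    app_return_to_loader(42);
}

/* Loader side: save context, call the app, resume here when the app jumps back. */
int loader_run_app(void (*app_entry)(void))
{
    assert(app_entry != NULL);

    if (setjmp(loader_context) == 0) {  /* loader steps 1-3: save context */
        app_entry();                    /* loader steps 4-5 (a real loader changes SP first) */
        return -1;                      /* the app returned with a plain function return */
    }

    /* Loader steps 6-7: longjmp has restored the saved registers and SP. */
    /* Loader step 8 (reconfiguring VTOR) has no host equivalent here. */
    loader_irqs_enabled = 1;            /* loader step 9: enable interrupts */
    return app_exit_code;
}
```

Calling loader_run_app(demo_app) runs the stand-in app and comes back with the value the app passed to app_return_to_loader. Note that this only restores CPU registers; nothing here puts peripherals back into the state the loader left them in.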
The state of the peripherals will be indeterminate at that point, especially the USB peripheral and any other peripheral that uses a buffer, since the buffer address was probably changed, and other configuration registers too.
EDIT: The more I think about it, the more I think it may be better to reset the MCU to restart the program in flash when the app is finished, like Roger suggested; otherwise there are too many things to control.
Of course you can make sure the loader and the app do not use the same peripherals, so the loader's peripherals have not changed state on return, but that limits what you can do. It seems better to just reboot and let the loader pick a new app.
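For the reboot option, here is a minimal sketch of how the app could request a system reset. I am assuming a Cortex-M3 or later core (which the VTOR mention implies), where the reset is requested by writing the 0x05FA key together with the SYSRESETREQ bit to the AIRCR register at 0xE000ED0C. If CMSIS is available, NVIC_SystemReset() already does this and is the simpler choice. The function names below are hypothetical, and the register is passed in as a pointer only so the logic can be tried on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: Cortex-M3 or later. AIRCR lives in the System Control Block. */
#define AIRCR_ADDR          0xE000ED0CUL
#define AIRCR_VECTKEY       (0x05FAUL << 16) /* key required for the write to be accepted */
#define AIRCR_PRIGROUP_MASK (7UL << 8)       /* priority grouping bits, kept unchanged */
#define AIRCR_SYSRESETREQ   (1UL << 2)       /* system reset request bit */

/* Build the value to write to AIRCR, given the value currently read from it. */
uint32_t aircr_reset_request(uint32_t current)
{
    return AIRCR_VECTKEY | (current & AIRCR_PRIGROUP_MASK) | AIRCR_SYSRESETREQ;
}

/* Write the reset request to the register that aircr points to. */
void request_system_reset(volatile uint32_t *aircr)
{
    assert(aircr != NULL);
    *aircr = aircr_reset_request(*aircr);
}
```

On the target the call would be request_system_reset((volatile uint32_t *)AIRCR_ADDR), normally followed by a data synchronization barrier and an endless loop while the reset takes effect. After the reset the loader starts from flash with all peripherals in their reset state, so nothing has to be saved or restored.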