swift - Assembler on 64-bit iOS (A64)
I'm trying to replace certain methods with asm implementations. The target is arm64 on iOS (iPhone 5S or newer). I want to use a dedicated assembler file, as the inline assembler comes with additional overhead and is quite cumbersome to use with A64 memory offsets.
There is not much documentation on this on the Internet, so I'm kind of unsure whether the way I'm doing it is the way to go. Therefore, I'll describe the process I followed to move a function to asm.
The candidate function for this question is a 256-bit integer comparison function.
uint256.h
@import Foundation;

typedef struct {
    uint64_t value[4];
} uint256;

bool eq256(const uint256 *lhs, const uint256 *rhs);
bridging-header.h
#import "uint256.h"
Reference implementation (Swift)
let result = x.value.0 == y.value.0 && x.value.1 == y.value.1 && x.value.2 == y.value.2 && x.value.3 == y.value.3
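For comparison, the same limb-by-limb check can be sketched in plain C. This is only an illustration: the name eq256_ref is hypothetical and does not appear in the original post, and the struct is repeated so the snippet stands alone.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t value[4];
} uint256;

/* Hypothetical C counterpart of the Swift expression above:
   the two values are equal only if all four 64-bit limbs match. */
bool eq256_ref(const uint256 *lhs, const uint256 *rhs)
{
    return lhs->value[0] == rhs->value[0]
        && lhs->value[1] == rhs->value[1]
        && lhs->value[2] == rhs->value[2]
        && lhs->value[3] == rhs->value[3];
}
```

A version like this can serve as the baseline that the assembly routine is checked against.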
uint256.s
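.globl _eq256
.align 2
_eq256:
    ldp  x9, x10, [x0]          // lhs limbs 0 and 1
    ldp  x11, x12, [x1]         // rhs limbs 0 and 1
    cmp  x9, x11
    ccmp x10, x12, 0, eq        // compare only if still equal, else force "ne"
    ldp  x9, x10, [x0, 16]      // lhs limbs 2 and 3
    ldp  x11, x12, [x1, 16]     // rhs limbs 2 and 3
    ccmp x9, x11, 0, eq
    ccmp x10, x12, 0, eq
    cset x0, eq                 // x0 = 1 if all four limbs matched, else 0
    ret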
Resources I found
Section 5.1.1 of the Procedure Call Standard for the ARM 64-bit Architecture (AArch64) document explains the purpose of each register during procedure calls.
iOS-specific deviations.
iOS assembler directives.
Questions
- I've tested the code using XCTest, creating 2 random numbers, running both the Swift and asm implementations on them and verifying that both report the same result. The code seems to be correct.
- In the asm file: the .align seems to be there for optimization. Is it really necessary, and if yes, what is the correct value to align to?
- Is there any source that clearly explains what the calling convention for my specific function signature is?
  a. How can I know that the inputs are passed via x0 and x1?
  b. How can I know that it is correct to pass the output in x0?
  c. How can I know that it is safe to clobber x9-x12 and the status registers?
  d. Is the function called the same way when I call it from C instead of Swift?
- What does "indirect result location register" mean in the r8 register description in the ARM document?
- Do I need any other assembler directives besides .globl?
- When I set breakpoints, the debugger seems to get confused about where it actually is, showing incorrect lines etc. Am I doing something wrong?
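One caveat about the random test described in the first question: two independently generated 256-bit values are essentially never equal, so that test mostly exercises the "not equal" path. A minimal sketch of a harness that covers both outcomes might look like the following. All names here (eq256_fn, eq256_standin, random256, check_eq256) are hypothetical, and the memcmp-based stand-in exists only so the sketch compiles without the assembly file; on a device the real eq256 would be passed instead.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint64_t value[4];
} uint256;

typedef bool (*eq256_fn)(const uint256 *, const uint256 *);

/* Stand-in implementation (assumption): replace with the asm eq256 when testing on device. */
bool eq256_standin(const uint256 *lhs, const uint256 *rhs)
{
    return memcmp(lhs->value, rhs->value, sizeof lhs->value) == 0;
}

/* Fill all four limbs with pseudo-random bits; rand() yields at least 15 bits per call. */
void random256(uint256 *out)
{
    for (int i = 0; i < 4; i++) {
        uint64_t limb = 0;
        for (int j = 0; j < 5; j++)
            limb = (limb << 15) ^ (uint64_t)rand();
        out->value[i] = limb;
    }
}

/* Returns the number of failed checks; 0 means impl passed every round. */
unsigned check_eq256(eq256_fn impl, unsigned rounds)
{
    unsigned failures = 0;
    for (unsigned r = 0; r < rounds; r++) {
        uint256 a, b;
        random256(&a);
        b = a;
        if (!impl(&a, &b))
            failures++;                     /* identical values must compare equal */

        unsigned bit = (unsigned)rand() % 256;
        b.value[bit / 64] ^= (uint64_t)1 << (bit % 64);
        if (impl(&a, &b))
            failures++;                     /* one flipped bit must compare unequal */
    }
    return failures;
}
```

Calling check_eq256(eq256, 1000) and expecting 0 would exercise every limb position over time, because the flipped bit is chosen at random across all 256 bits.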
- The .align 2 directive is required for program correctness. A64 instructions need to be aligned on 32-bit boundaries, and with Apple's assembler the argument is a power of two, so .align 2 gives 4-byte alignment.
- The documentation you linked seems clear to me, and unfortunately this isn't the place to ask for recommendations.
- You can determine that the arguments lhs and rhs are passed in x0 and x1 by following the instructions given in section 5.4.2 (Parameter Passing Rules) of the Procedure Call Standard for the ARM 64-bit Architecture (AArch64) document you linked. Since the parameters are both pointers, the specific rule that applies is C.7.
- You can determine the register used for return values by following the instructions given in section 5.5 (Result Return). This has you follow the same rules as for parameters. Since the function returns an integer, rule C.7 applies and so the value is returned in x0.
- It's safe to change the values stored in registers x9 through x12 because they're listed as temporary registers in the table given in section 5.1.1 (General-purpose Registers).
- The question is whether the function is called the same way in Swift as in C. Both the Procedure Call Standard document and the Apple-specific exceptions document you linked are defined in terms of C and C++. Presumably Swift follows the same conventions, but I don't know if Apple has made that explicit anywhere.
- The purpose of r8 is described in section 5.5 (Result Return). It's used when the return value is too big to fit into the registers used for return values. In that case the caller creates a buffer for the return value and puts its address in r8. The function then copies its return value to the address held in this register.
- I don't believe you need anything else in your example assembly program.
- You've asked too many questions. You should post a separate and more detailed question describing your problem.
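To make the r8 point above more concrete, here is a small sketch in C. The function names make256 and make256_lowered are hypothetical. A 32-byte struct returned by value is too large for the result registers, so under the AArch64 procedure call standard the caller supplies a buffer and passes its address in x8 (r8). The second function only shows conceptually what that amounts to; in the real convention the address travels in x8 rather than as an ordinary first argument.

```c
#include <stdint.h>

typedef struct {
    uint64_t value[4];      /* 32 bytes: too large to come back in x0/x1 */
} uint256;

/* Returned by value: on AArch64 the caller allocates the result
   buffer and passes its address in x8. */
uint256 make256(uint64_t low)
{
    uint256 r = {{low, 0, 0, 0}};
    return r;
}

/* Conceptual equivalent with the hidden result pointer written out.
   Assumption for illustration only: here it is a normal parameter. */
void make256_lowered(uint256 *result, uint64_t low)
{
    result->value[0] = low;
    result->value[1] = 0;
    result->value[2] = 0;
    result->value[3] = 0;
}
```

The eq256 function in the question returns a bool, which fits in x0, so it never needs the indirect result register.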
I should note that one advantage of writing your code using inline assembly is that you wouldn't have to worry about any of this. Something like the following untested C code shouldn't be too unwieldy:
bool eq256(const uint256 *lhs, const uint256 *rhs)
{
    const __int128 *lv = (__int128 const *) lhs->value;
    const __int128 *rv = (__int128 const *) rhs->value;
    uint64_t l1, l2, r1, r2, ret;

    /* "=&r" (early clobber): the temporaries are written before all
       memory inputs have been read, so they must not share registers
       with the input addresses. "Ump" is a memory operand suitable
       for load/store pair instructions. */
    asm("ldp %1, %2, %5\n\t"
        "ldp %3, %4, %6\n\t"
        "cmp %1, %3\n\t"
        "ccmp %2, %4, 0, eq\n\t"
        "ldp %1, %2, %7\n\t"
        "ldp %3, %4, %8\n\t"
        "ccmp %1, %3, 0, eq\n\t"
        "ccmp %2, %4, 0, eq\n\t"
        "cset %0, eq"
        : "=r" (ret), "=&r" (l1), "=&r" (l2), "=&r" (r1), "=&r" (r2)
        : "Ump" (lv[0]), "Ump" (rv[0]), "Ump" (lv[1]), "Ump" (rv[1])
        : "cc");

    return ret;
}
OK, maybe it's a little unwieldy.