* configure.ac: Force -mlp64 in CFLAGS on the HP-UX ia64 port.
It is the only supported mode, and gcc is expected as the C
compiler.
* include/lightning.h, lib/jit_ia64-cpu.c, lib/jit_ia64.c:
Correct the ia64 port to work on HP-UX, which runs it in big
endian mode.
* check/varargs.tst: Correct misplaced .align directive
that was causing the double buffer to not be aligned to
8 bytes.
* lib/jit_ia64-cpu.c:
Properly implement abi for excess arguments passed on
stack.
Simplify load/store with immediate displacement argument
with zero value.
Simplify some calls to "subi" changing to "addi" with
a negative argument.
Remove some #if 0'ed code that could be useful in special
conditions. The most useful case would be to "optimize"
"static" jit functions but, for the sake of simplicity,
jit functions are implemented in a way that lets them be
passed back to C code as C function pointers.
Add an attribute to prototypes of several unused functions.
These functions are defined for the sake of implementing all
Itanium documented instructions, but a significant number of
them are not used by lightning.
* lib/jit_ia64-fpu.c: Simplify load/store with zero immediate
displacement and add unused attribute for functions not used
by lightning, but required to provide macros implementing all
Itanium documented instructions.
* lib/jit_ia64.c: Update for the properly implemented abi
for stack arguments.
* lib/lightning.c: Mark an unused function as such.
* lib/jit_ia64-cpu.c:
Correct immediate range check of integer comparisons when
inverting arguments.
Correct gei_u that was not decrementing immediate when
inverting arguments.
Correct b?add* and b?sub* that were not properly updating
the result register.
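The gei_u correction above relies on a simple identity: for a
non-zero immediate, the unsigned test "r >= im" equals the test
"im - 1 < r" with the operands swapped. The sketch below only
illustrates that arithmetic; it is not the port's code, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: evaluate the unsigned test "r >= im" using only
 * an "immediate < register" comparison, i.e. with arguments inverted. */
static int gei_u_inverted(uint64_t r, uint64_t im)
{
    if (im == 0)
        return 1;            /* every unsigned value is >= 0 */
    return (im - 1) < r;     /* r >= im  <=>  im - 1 < r, for im != 0 */
}

/* Hypothetical self-check: the inverted form must agree with the
 * direct comparison for the given operands. */
static void check_gei_u(uint64_t r, uint64_t im)
{
    assert(gei_u_inverted(r, im) == (r >= im));
}
```

Forgetting the decrement turns the test into "im < r", which gives
the wrong answer exactly when r equals im.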
* lib/jit_ia64-cpu.c: Correct wrong mapping of 2 instructions
in the "M-, stop, M-, stop" translation, which was ignoring the
last stop (implemented as a nop I- stop).
* lib/jit_ia64-fpu.c: Properly implement fnorm.s and fnorm.d,
as well as the proper integer to float or double conversion.
* lib/jit_ia64-cpu.c: Correct bogus implementation of ldr_T
for signed integers, which was using ld1.s, ld2.s and ld4.s.
The ".s" stands for speculative load, not sign extend.
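Since a plain ld1, ld2 or ld4 zero extends the loaded value, a
signed load needs a separate sign extension step (sxt1, sxt2 or
sxt4 on Itanium). The sketch below models the 1-byte case in C;
the function names are hypothetical and this is not the code the
port emits.

```c
#include <stdint.h>
#include <string.h>

/* Models a zero-extending 1-byte load, as a plain ld1 would do. */
static uint64_t load1_zero_extend(const void *p)
{
    uint8_t b;
    memcpy(&b, p, 1);
    return b;
}

/* Models the separate sign extension step, as sxt1 would do:
 * flip bit 7, then subtract 128 to propagate the sign. */
static int64_t sign_extend1(uint64_t v)
{
    return (int64_t)((v & 0xff) ^ 0x80) - 0x80;
}

/* Hypothetical model of a signed byte load: load, then extend. */
static int64_t ldr_c_model(const void *p)
{
    return sign_extend1(load1_zero_extend(p));
}
```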
* lib/jit_ia64-fpu.c: Correct bogus implementation of ldxr_T
for float and double. The third (actually, second) argument
is indeed added to the base register, but the base register
is modified. The actual M7 implementation was already correct;
only the ldxr_f and ldxr_d implementations had been kept in
a prototype state, misinterpreting what M7 does.
* lib/jit_ia64-cpu.c: Correct X2 pattern matching by preventing
it from requiring a stop between the L and the X
instruction; that is, check the registers and predicates
before emitting the L instruction, not after.
* lib/jit_ia64-fpu.c: Slightly simplify and correct the
divr_f and divr_d implementation.
* check/lightning.c: Add __ia64__ preprocessor define
on Itanium.
* check/alu.inc, check/clobber.tst, check/float.tst: Define
several macros conditionally on __ia64__. This is required
because __ia64__ jit generation can use far too much memory.
Instruction reordering to avoid "stops" as much as possible
is not implemented, which causes far too many nops to be
generated; in addition, division and remainder require
function calls, and float division requires significant
code to implement.
* include/lightning.h: Add new backend specific movr_w_d,
movr_d_w and movi_d_w codes as helpers for ia64 varargs
function arguments.
* lib/jit_ia64-cpu.c:
Correct wrong encoding of A5 small integers.
Correct define of "mux" instruction modifiers.
Correct ordering of arguments and predicates of cmp_xy
implementation with immediate arguments; like most other
codes with an immediate, the immediate is the second, not
the third argument.
* lib/jit_ia64-fpu.c: Actually implement the code to move
values between gpr and fpr, to implement the varargs abi.
* lib/jit_ia64.c: Make fpr argument registers not allocatable
as temporaries; there is no need for the extra checks when
there are plenty of registers.
* lib/jit_print.c, lib/lightning.c: Minor updates for the
new movr_w_d, movr_d_w and movi_d_w codes.
* lib/jit_ia64-cpu.c, lib/jit_ia64-fpu.c: Correct code to
also insert a stop to break an instruction group if a
register is written more than once in the same group.
This may happen if a register is both argument and result of
some lightning call (not a real instruction). The most
common case should be code in the pattern:
movl rn=largenum
...
mov rn=smallnum
where "rn" would end up holding "largenum".
The problem could possibly happen in other circumstances as well.
* include/lightning/jit_ia64.h, lib/jit_ia64-cpu.c,
lib/jit_ia64-fpu.c, lib/jit_ia64.c:
Relocate JIT_Rn registers to the local registers because, as
with float registers, div/rem and sqrt are implemented as
function calls, and may overwrite scratch registers that are
not saved.
Change patch_at to receive a jit_code_t instead of a
jit_node_t, so that it is easier to "inline" patches when
some instruction requires complex code to implement, e.g.
uneq and ltgt.
Correct arguments to FMA and FMA like instructions that,
due to a cut&paste error, were passing the wrong argument
to the related F- implementation function.
Rewrite ltgt to return the proper result if one (or both)
of the arguments is unordered.
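As an illustration of the ltgt semantics assumed here (true only
when the operands compare less than or greater than, and false
when either one is a NaN), a C model can use the quiet comparison
macros from <math.h>. The function name is hypothetical and this
is not the code generated by the port.

```c
#include <math.h>

/* Hypothetical model of ltgt: 1 if a < b or a > b, 0 if the operands
 * are equal or unordered. isless() and isgreater() return 0 whenever
 * an operand is a NaN, without raising a floating point exception. */
static int ltgt_model(double a, double b)
{
    return isless(a, b) || isgreater(a, b);
}
```

A naive "a != b" would not be equivalent, because it evaluates to
true when either operand is a NaN.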
* include/lightning/jit_ia64.h, include/lightning/jit_private.h,
lib/jit_ia64-cpu.c, lib/jit_ia64-fpu.c, lib/jit_ia64.c,
lib/lightning.c: Rework code to detect need of a "stop" to
also handle predicates, as if a predicate is written, it
cannot be read in the same instruction group.
Use a single jit_regset_t variable for all registers when
checking need for a stop (increment value by 128 for
float registers).
Correct wrong "subi" implementation, as the code executed
is r0=im-r1, not r0=r1-im.
Use the standard lightning 6 fpr registers, and rework to
use callee save float registers, which may be spilled and
reloaded in prolog/epilog. This is required because some jit
instruction implementations need to call functions
(currently integer div/mod and float sqrt), which may change
the value of scratch float registers.
Rework point of "sync" of branches that need to return a
patch'able address, because the need for a "stop" before a
predicate read causes all branches to be the instruction
in slot 0, as there is no template to "stop" and branch
in the same instruction "bundle".
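The idea of tracking all registers in one set, with float
registers offset by 128, can be sketched as below. This is a
simplified model under assumptions: the type and function names
are hypothetical, predicates are left out, and the real code uses
jit_regset_t rather than this ad hoc bitset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FPR_BASE    128   /* float register n is tracked as 128 + n */
#define REGSET_BITS 256

/* Hypothetical stand-in for jit_regset_t: one bit per register. */
typedef struct { uint64_t bits[REGSET_BITS / 64]; } regset_t;

static void regset_clear(regset_t *s) { memset(s, 0, sizeof(*s)); }

static int regset_test(const regset_t *s, int n)
{
    assert(n >= 0 && n < REGSET_BITS);
    return (int)((s->bits[n / 64] >> (n % 64)) & 1);
}

static void regset_set(regset_t *s, int n)
{
    assert(n >= 0 && n < REGSET_BITS);
    s->bits[n / 64] |= (uint64_t)1 << (n % 64);
}

/* Reading a register written in the current group needs a stop.
 * Returns 1 if a stop was inserted, which starts a new group. */
static int emit_read(regset_t *written, int reg)
{
    if (!regset_test(written, reg))
        return 0;
    regset_clear(written);
    return 1;
}

/* Writing a register already written in the current group also needs
 * a stop. The write is then recorded in the (possibly new) group. */
static int emit_write(regset_t *written, int reg)
{
    int stop = regset_test(written, reg);
    if (stop)
        regset_clear(written);
    regset_set(written, reg);
    return stop;
}
```

With this layout, integer register 4 and float register 4 map to
different bits (4 and FPR_BASE + 4), so they never conflict.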
* include/lightning/jit_ia64.h, lib/jit_ia64-cpu.c,
lib/jit_ia64-fpu.c, lib/jit_ia64.c: New files implementing
the basic infrastructure of an Itanium port. The code
compiles and can generate jit for basic hello world like
functions.
* check/lightning.c, configure.ac, include/lightning.h,
include/lightning/Makefile.am, include/lightning/jit_private.h,
lib/Makefile.am, lib/lightning.c: Update for the Itanium
port.
* lib/jit_mips-cpu.c, lib/jit_mips.c: Correct typo and
make the jit_carry register local to the jit_state_t.
This matches code reviewed in the Itanium port, which
should use the same base logic to handle carry/borrow.