Fill memory with a 16-bit value


By aoineko

Paragon (1138)

06-03-2023, 09:18

What is the quickest way to fill memory with a 16-bit value?
We can't use the ldir trick here, and I can't think of anything other than a simple loop with:

ld (HL),C
inc HL
ld (HL),B
inc HL

Any other way?
Self-modifying code is interesting, but it is out of scope for my use case.
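
For reference, the simple loop described above might be completed as follows. This is only a minimal sketch: the label `Fill16_Loop` and the register assignment for the word count (DE) are assumptions, not part of the original post.

```z80
; Fill memory with a 16-bit value using a plain loop.
; In:  HL = start address
;      BC = value to write
;      DE = number of words to write (must be non-zero)
; Changes: A, DE, HL
Fill16_Loop:
    ld (hl),c           ; low byte first (little-endian order)
    inc hl
    ld (hl),b           ; then the high byte
    inc hl
    dec de              ; 16-bit DEC does not set flags...
    ld a,d
    or e                ; ...so test DE for zero explicitly
    jr nz,Fill16_Loop
    ret
```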


By Micha

Expert (110)

06-03-2023, 09:36

You could try something like:

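ld [OldSP],sp   ; save the real stack pointer
ld sp,hl        ; HL = end of the area to fill
push bc         ; each push writes 2 bytes
push bc
push bc
push bc

ld sp,[OldSP]   ; restore the stack pointer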

but remember the pushes will fill memory from back to front.
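
Wrapped into a subroutine, the idea might look like the sketch below. The names `FillPush` and `FillPush_OldSP`, the use of A as a block counter and the 8-byte block size are assumptions made for illustration. It also assumes that interrupts were enabled on entry, that `FillPush_OldSP` lives in RAM, and that neither the routine nor that variable lies inside the filled area.

```z80
; Fill a buffer with a 16-bit value using PUSH.
; In:  HL = address just past the END of the buffer
;      BC = value to write
;      A  = number of 8-byte blocks (1..255)
FillPush:
    di                      ; an interrupt would push onto our "stack"
    ld (FillPush_OldSP),sp  ; save the real stack pointer
    ld sp,hl                ; SP = end of buffer
FillPush_Loop:
    push bc                 ; each PUSH writes 2 bytes, moving downwards
    push bc
    push bc
    push bc
    dec a
    jr nz,FillPush_Loop
    ld sp,(FillPush_OldSP)  ; restore the real stack pointer
    ei
    ret

FillPush_OldSP:
    dw 0
```

PUSH stores the high register at the higher address, so C still ends up at the lower address of each word, the same layout as the byte-by-byte loop.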

By bore

Master (182)

06-03-2023, 09:39

Why can't you use ldir?
Just place DE at HL+2 and go?

By Metalion

Paragon (1628)

06-03-2023, 10:46

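You can still use ldir:

ld hl,address
ld de,value
push hl          ; keep the start address
ld (hl),e        ; write the first word by hand
inc hl
ld (hl),d
inc hl
ex de,hl         ; DE = address+2 (destination)
pop hl           ; HL = address (source)
ld bc,length-2
ldir             ; propagate the word through the rest of the block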

By aoineko

Paragon (1138)

06-03-2023, 12:51

Cool. Many thanks!

By santiontanon

Paragon (1832)

06-03-2023, 15:53

Micha's push solution is the fastest, btw. It's also useful when you need to fill memory with an 8-bit value, where you would normally use ldir: the fastest solution is to load the 8-bit value into both halves of a 16-bit register and use "push" repeatedly. This was a classic trick used in many games in the past to clear/write to memory fast!
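
A minimal sketch of the 8-bit variant, under the same caveats as any SP-based fill (interrupts disabled while SP points into the buffer). The names `Fill8Push` and `Fill8Push_OldSP` and the choice of D as block counter are hypothetical; the routine assumes interrupts were enabled on entry and that the saved-SP variable is in RAM outside the filled area.

```z80
; Fill a buffer with an 8-bit value using PUSH.
; In:  HL = address just past the end of the buffer
;      A  = 8-bit fill value
;      D  = number of 8-byte blocks (1..255)
Fill8Push:
    ld b,a
    ld c,a                   ; BC = fill value in both bytes
    di
    ld (Fill8Push_OldSP),sp  ; save the real stack pointer
    ld sp,hl
Fill8Push_Loop:
    push bc                  ; 2 bytes per PUSH
    push bc
    push bc
    push bc
    dec d
    jr nz,Fill8Push_Loop
    ld sp,(Fill8Push_OldSP)  ; restore the real stack pointer
    ei
    ret

Fill8Push_OldSP:
    dw 0
```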

By ro

Scribe (5061)

06-03-2023, 16:07

You know, I've seen that "trick" with the stack pointer before. And just like other tricks, it has a downside: the code becomes harder to read. Sure, if you're a very experienced coder it'll be just fine, right?

Coding the Z80 is beautiful, and you want to write fast and efficient code, but at the cost of letting dirt in. I am a big fan of clean code. Clean code isn't always the fastest, but it's super efficient when you read the lines back and try to understand what is happening. No need for extra comments explaining what happens.

So, using the SP trick and pushing stuff into RAM is not my favourite way. Having said that, it's an old trick that works very well :)

2 cents.

By Grauw

Ascended (10821)

06-03-2023, 17:07

And remember, you can make LDIR significantly faster for large blocks like so.

If the block size is not a fixed amount, so that you would want to use FastLDIR, its self-modifying code could be eliminated (at some extra set-up cost) like so:

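    xor a
    sub c                       ; A = -C
    and 16 - 1                  ; A = (-C) and 15
    add a,a                     ; A = A * 2
    push hl                     ; save HL
    add a,FastLDIR_Loop & 0xFF
    ld l,a
    ld a,0                      ; LD does not affect the carry flag
    adc a,FastLDIR_Loop >> 8
    ld h,a                      ; HL = FastLDIR_Loop + A
    ex (sp),hl                  ; restore HL; computed address is now on the stack
    ; ...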

By bore

Master (182)

06-03-2023, 18:25

I don't think the SP-variant is particularly unreadable compared to any other assembly code.
Just put it in a subroutine with some comments and you're fine.

The problem with it is that you are hogging the SP, so any interrupt will use your destination as its stack if you don't disable interrupts.
Whether it is usable for you depends on whether you can accept interrupts being disabled for the duration of the fill.

OTOH you save something like 12 cycles per byte by using push instead of ldi so you can probably restore SP and enable interrupts every 8th byte and still save cycles by using the push-method.
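
One way this might be sketched is shown below. The names `FillPushChunked` and `FPC_OldSP` and the 8-byte chunk size are assumptions for illustration; the routine assumes interrupts were enabled on entry and that `FPC_OldSP` is in RAM outside the filled area.

```z80
; Push-fill in 8-byte chunks, letting interrupts in between chunks.
; In:  HL = address just past the end of the buffer
;      BC = value to write
;      A  = number of 8-byte chunks (1..255)
FillPushChunked:
    ld (FPC_OldSP),sp   ; the real SP is the same at every loop iteration
FPC_Loop:
    di
    ld sp,hl            ; SP = top of the current chunk
    push bc
    push bc
    push bc
    push bc             ; 8 bytes written
    ld hl,0
    add hl,sp           ; HL = top of the next chunk (old HL - 8)
    ld sp,(FPC_OldSP)   ; back on the real stack
    ei                  ; interrupts are accepted after the next instruction
    dec a
    jr nz,FPC_Loop
    ret

FPC_OldSP:
    dw 0
```

The per-chunk overhead of switching SP back and forth is considerable, so a larger chunk (more pushes per iteration) is likely to pay off better, at the cost of a longer window with interrupts disabled.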

By Prodatron

Paragon (1857)

06-03-2023, 19:04

@Grauw, very nice! I just wonder why you need a DI/EI here?

ld (FastLDIR_jumpOffset),a

By theNestruo

Champion (430)

06-03-2023, 19:44

Prodatron wrote:

@Grauw, very nice! I just wonder why you need a DI/EI here?
ld (FastLDIR_jumpOffset),a

This is a wild guess (it is not my code) but: ei enables interrupts after the next instruction, so the code within the interrupts disabled section is actually:

    ld (FastLDIR_jumpOffset),a
    jr nz,$  ; self modifying code

If interrupts were enabled, another call to this routine (e.g. from the interrupt handler) with a different value would overwrite the offset just written, making the first call copy an incorrect number of bytes.
