Discussion:
Recursion - is there a faster way?
(too old to reply)
Bob Latham
2018-01-20 11:29:02 UTC
Permalink
I'm playing with some assembler code to scan directory and sub directories
for unwanted files and to list files I do want. I do have working code but I
think it could be made faster and need advice.

Currently it uses GBPB10 to catalogue a directory. I look through the objects
one at a time looking for another directory. When found a directory is added
to the path, loops back to scan again. The code does all the work when R4=-1
and I've reached the end of the deepest directory. It is on the way back up
that things get done.

I think my code is a bit slow because I scan with R3=1 so I get one object at
a time. I do this because I'm unsure about buffer sizes and how to handle
large amounts of data with R3 any bigger.

I did wonder if there is a quicker way to find directories within directories
than looking at one object at a time to see what I get. I feel there should
be something similar to pathname wild cards so before the scan the OS knows I
only want directories but the PRM doesn't list that as far as I can see.

Thoughts anyone?

Thanks.

Bob.
--
Bob Latham
Stourbridge, West Midlands
Steve Fryatt
2018-01-20 12:22:52 UTC
Permalink
On 20 Jan, Bob Latham wrote in message
Post by Bob Latham
I'm playing with some assembler code to scan directory and sub directories
for unwanted files and to list files I do want. I do have working code but
I think it could be made faster and need advice.
Currently it uses GBPB10 to catalogue a directory. I look through the
objects one at a time looking for another directory. When found a
directory is added to the path, loops back to scan again. The code does
all the work when R4=-1 and I've reached the end of the deepest directory.
It is on the way back up that things get done.
I think my code is a bit slow because I scan with R3=1 so I get one object
at a time. I do this because I'm unsure about buffer sizes and how to
handle large amounts of data with R3 any bigger.
That's almost certainly your main problem. There's an overhead to calling
OS_GBPB, so you're far better off providing a "big" buffer and getting more
results back for each attempt.

The only issue is that with recursion, you need to start stacking your
buffers as well as everything else, since you need to be able to come back
and process any remaining items from a buffer after dropping into a
subdirectory and calling OS_GBPB on that with a new buffer. This is what
Locate 2 does, but being written in C it has the advantage of being able to
use an array of structs to handle the memory allocation headache.

The other thing that makes a difference, if you're multitasking, is getting
the timeslicing right. Locate 1 returned control to the Wimp at fixed points
in the scan cycle (whenever it dropped back out of a directory, IIRC).
Locate 2 returns after fixed time periods, processing as many directories as
it can each time.

Between these two optimisations, Locate 2 turned out to be significantly
faster than Locate 1.
Post by Bob Latham
I did wonder if there is a quicker way to find directories within
directories than looking at one object at a time to see what I get. I feel
there should be something similar to pathname wild cards so before the
scan the OS knows I only want directories but the PRM doesn't list that as
far as I can see.
No, you just grab everything and read through it, checking the object types
as you go.

One final random observation: stick with OS_GBPB 10. 11 and 12 look
interesting, but feedback from Locate users over the years showed that
several fairly common third-party image filing systems (which I won't name
and shame) don't support them.
--
Steve Fryatt - Leeds, England

http://www.stevefryatt.org.uk/
Bob Latham
2018-01-20 13:49:05 UTC
Permalink
Post by Steve Fryatt
On 20 Jan, Bob Latham wrote in message
Post by Bob Latham
I think my code is a bit slow because I scan with R3=1 so I get one object
at a time. I do this because I'm unsure about buffer sizes and how to
handle large amounts of data with R3 any bigger.
That's almost certainly your main problem. There's an overhead to calling
OS_GBPB, so you're far better off providing a "big" buffer and getting more
results back for each attempt.
The only issue is that with recursion, you need to start stacking your
buffers as well as everything else, since you need to be able to come back
and process any remaining items from a buffer after dropping into a
subdirectory and calling OS_GBPB on that with a new buffer. This is what
Locate 2 does, but being written in C it has the advantage of being able to
use an array of structs to handle the memory allocation headache.
The other thing that makes a difference, if you're multitasking, is getting
the timeslicing right. Locate 1 returned control to the Wimp at fixed points
in the scan cycle (whenever it dropped back out of a directory, IIRC).
Locate 2 returns after fixed time periods, processing as many directories as
it can each time.
Between these two optimisations, Locate 2 turned out to be significantly
faster than Locate 1.
Post by Bob Latham
I did wonder if there is a quicker way to find directories within
directories than looking at one object at a time to see what I get. I feel
there should be something similar to pathname wild cards so before the
scan the OS knows I only want directories but the PRM doesn't list that as
far as I can see.
No, you just grab everything and read through it, checking the object types
as you go.
One final random observation: stick with OS_GBPB 10. 11 and 12 look
interesting, but feedback from Locate users over the years showed that
several fairly common third-party image filing systems (which I won't name
and shame) don't support them.
I understand Steve thanks for your time and extra info.

It is funny though, I've been running this codes for a few years to scan my
music library on my NAS via Lanman98. Didn't have nay problems it just
worked. I'm trying to swap to Sunfish simply because it claims to be able to
do UTF-8 no other reason. But I have lots of issues with Sunfish I didn't
have before, I've cleared many but my current code gets slower and slower and
eventually kills the machine because the RMA is full, what is causing that (I
have no idea at all) and why does this not happen with Lanman98?

Puzzled.

Thanks again.

Cheers

Bob.
--
Bob Latham
Stourbridge, West Midlands
Steve Fryatt
2018-01-20 14:05:39 UTC
Permalink
On 20 Jan, Bob Latham wrote in message
Post by Bob Latham
It is funny though, I've been running this codes for a few years to scan
my music library on my NAS via Lanman98. Didn't have nay problems it just
worked. I'm trying to swap to Sunfish simply because it claims to be able
to do UTF-8 no other reason. But I have lots of issues with Sunfish I
didn't have before, I've cleared many but my current code gets slower and
slower and eventually kills the machine because the RMA is full, what is
causing that (I have no idea at all) and why does this not happen with
Lanman98?
Without more information, it's impossible to say. Locate and Sunfish get on
OK as far as I know (NFS is my main file storage, so Sunfish is in daily use
here), so I don't think it's an inherent Sunfish issue.

If you make your code available, maybe someone can have a look at it? Also,
is there any reason for using assembly language, where you have to sort out
so many simple issues yourself, instead of a higher level language -- it's
not as if scanning some MP3s over a network is going to be /that/ speed
critical?
--
Steve Fryatt - Leeds, England

http://www.stevefryatt.org.uk/
Bob Latham
2018-01-20 16:08:26 UTC
Permalink
Post by Steve Fryatt
On 20 Jan, Bob Latham wrote in message
Post by Bob Latham
It is funny though, I've been running this codes for a few years to scan
my music library on my NAS via Lanman98. Didn't have nay problems it just
worked. I'm trying to swap to Sunfish simply because it claims to be able
to do UTF-8 no other reason. But I have lots of issues with Sunfish I
didn't have before, I've cleared many but my current code gets slower and
slower and eventually kills the machine because the RMA is full, what is
causing that (I have no idea at all) and why does this not happen with
Lanman98?
Without more information, it's impossible to say.
I've just found that my code runs much better, faster and does not slow down
or crash the RMA on a Risc PC but my ARMiniX dies a death pretty quickly.

It will not be load store alignment issues as I've never attempted that in my
assembler code. So I'm thinking it is something to do with not being coded
for 32 bit properly. Something to do with the status register or the stack I
suspect.

I'll have to google for a reminder of what I can/ cannot do in 32bit.
Post by Steve Fryatt
Locate and Sunfish get on OK as far as I know (NFS is my main file storage,
so Sunfish is in daily use here), so I don't think it's an inherent Sunfish
issue.
No I wouldn't think so either, it is just odd how LM98 doesn't seem to mind.
Post by Steve Fryatt
If you make your code available, maybe someone can have a look at it?
I code for pleasure these days and it is quite amateurish I'm sure, can I try
a few other things first?
Post by Steve Fryatt
Also, is there any reason for using assembly language, where you have to
sort out so many simple issues yourself, instead of a higher level language
-- it's not as if scanning some MP3s over a network is going to be /that/
speed critical?
I wish I found C useable, it's too hard for a simple person like me. Not so
much C itself but how to do it on RO, I've tried a few times and just given
up after hitting brick walls. I did at one time write some simple C for the
PC which is easier and there are good step by step tutorials.

Mp3s, Mp3s!!! Shock horror! Curse the thought. Wouldn't be seen dead playing
audibly inferior lossy formats, it has to be flac the only sensible choice.
:-)

Thanks.

Cheers,

Bob.
--
Bob Latham
Stourbridge, West Midlands
Martin
2018-01-20 17:53:05 UTC
Permalink
Post by Bob Latham
Post by Steve Fryatt
On 20 Jan, Bob Latham wrote in message
Post by Bob Latham
It is funny though, I've been running this codes for a few years to
scan my music library on my NAS via Lanman98. Didn't have nay
problems it just worked. I'm trying to swap to Sunfish simply
because it claims to be able to do UTF-8 no other reason. But I
have lots of issues with Sunfish I didn't have before, I've cleared
many but my current code gets slower and slower and eventually
kills the machine because the RMA is full, what is causing that (I
have no idea at all) and why does this not happen with Lanman98?
Without more information, it's impossible to say.
I've just found that my code runs much better, faster and does not slow
down or crash the RMA on a Risc PC but my ARMiniX dies a death pretty
quickly.
That must be a major clue. I think.
Post by Bob Latham
So I'm thinking it is something to do with not being coded for 32 bit
properly. Something to do with the status register or the stack I
suspect.
I'll have to google for a reminder of what I can/ cannot do in 32bit.
Assuming you tested with the same version and settings of Sunfish, then
if you are running RISC OS 4.39 on the RPC, and some variety of 5.23 on
the ARMiniX, I would be more inclined to first suspect OS differences
myself, not 26/32 bit differences. Though I cannot think of any that fit
your symptoms!

Does using LM98 on both machines make the program behave the same
(allowing for the machine differences!)?
--
Martin Avison
Note that unfortunately this email address will become invalid
without notice if (when) any spam is received.
Bob Latham
2018-01-21 09:21:06 UTC
Permalink
Post by Martin
Post by Bob Latham
I've just found that my code runs much better, faster and does not slow
down or crash the RMA on a Risc PC but my ARMiniX dies a death pretty
quickly.
That must be a major clue. I think.
You'd think so wouldn't you.
Post by Martin
Post by Bob Latham
So I'm thinking it is something to do with not being coded for 32 bit
properly. Something to do with the status register or the stack I
suspect.
I'll have to google for a reminder of what I can/ cannot do in 32bit.
Assuming you tested with the same version and settings of Sunfish, then
if you are running RISC OS 4.39 on the RPC, and some variety of 5.23 on
the ARMiniX, I would be more inclined to first suspect OS differences
myself, not 26/32 bit differences. Though I cannot think of any that fit
your symptoms!
Interesting thought.
Post by Martin
Does using LM98 on both machines make the program behave the same
(allowing for the machine differences!)?
This is where it gets very odd. The program runs fine on both machines with
Lanman98, no crashes or anything and on the ARMiniX it flies, way faster than
the RPC.

Using Sunfish on the RPC makes little difference, it now works fine and
nothing noticeable speed wise.

Using sunfish on the ARMiniX and problems. It is very slow to start, gets
even slower and eventually crashes the RMA.

I don't know how to find what is doing for the RMA.

Thanks for your interest and help.


Bob.
--
Bob Latham
Stourbridge, West Midlands
Martin
2018-01-21 10:42:35 UTC
Permalink
Post by Bob Latham
Post by Bob Latham
I've just found that my code runs much better, faster and does not
slow down or crash the RMA on a Risc PC but my ARMiniX dies a death
pretty quickly.
[Snip]
Post by Bob Latham
This is where it gets very odd. The program runs fine on both machines
with Lanman98, no crashes or anything and on the ARMiniX it flies, way
faster than the RPC.
Using Sunfish on the RPC makes little difference, it now works fine and
nothing noticeable speed wise.
Using sunfish on the ARMiniX and problems. It is very slow to start,
gets even slower and eventually crashes the RMA.
Very strange - especially as I suspect the RO5 RMA will be able to be
larger than the RO4 RMA.

One difference is that a RO4 Heap will have block lengths which are
multiples of 8, but in RO5.18 it was changed to multiples of 4. Not that
I can see how that could affect your problem!
Post by Bob Latham
I don't know how to find what is doing for the RMA.
<plug mode>
If you were to install my Reporter, you may be able to find some more
information about what is happening.

It provides a command *ReportHeap, which by default will check the RMA
Heap for validity, and display something like...

Heap RMA=&20000000 Max=256M Size=15740K Free=28208 Largest=6268
Used=15712K (3642) Freed=27552 (76) Unused=656

... which shows the sizes of each of the areas of the Heap, with the
numbers of blocks involved in brackets.

If a parm of L is added, it will list each of the used, freed and unused
blocks, with address, length and type. If Lnn is used, the first nn bytes
of each blocks are also listed in hex and character. These options can
generate a lot of output, but it may be worthwhile to get a better idea
of what sort of blocks are being created, and even their contents.

There is also included with Reporter an example BASIC program which will
run the *ReportHeap command at intervals, and it is easy to change the
interval or the parameters. See Reporter Help -> Commands -> ReportHeap
for details.

If the command finds the Heap has been corrupted, it will display the
error found and the contents of the Heap around the error, which is
useful where something is ovewriting memory.

PM me (see Help) if you have any specific questions.
</plug mode>


Martin
--
Martin Avison
Note that unfortunately this email address will become invalid
without notice if (when) any spam is received.
Bob Latham
2018-01-24 10:21:17 UTC
Permalink
Post by Martin
If you were to install my Reporter, you may be able to find some more
information about what is happening.
Thanks Martin, it is my intention to do just that but I don't have the
time just at the moment, hopefully soon.

Cheers,

Bob.

Loading...