For Loops discussion

Discostew

Something I just found out.

If you have a loop with a high iteration count and want to know why it runs slowly, check whether any variables are being created inside it. For instance...

for i = 0 to 20000 loop
repeat

This could be done in about 1.8ms. But, the moment you add a line such as this into it...

for i = 0 to 20000 loop
  j = 1
repeat

This number jumps to 19.3ms, causing the program to dip in frame rate. This is because the loop creates a local scope for itself, so any variable used inside it that doesn't already exist outside it has to be created. Variable creation takes time. Now, what if you were to create the variable outside of the loop?

j = 0
for i = 0 to 20000 loop
  j = 1
repeat

This code drops processing time from 19.3ms to about 4.4ms, all because the variable is created 1 time rather than 20000 times.

Nisse5

@Discostew Great optimizing tips, thanks! These kinds of things are always valuable to find out.

Martin

The loop with j = 1 inside it creates 20,000 new variables, while the one with j created beforehand creates 1 and assigns a value to it 20,000 times. It’s as simple as that. And yes, there is overhead to creating new variables instead of assigning values to existing ones.

Zero Division

@Martin I'm having a little trouble understanding why it would work this way. It seems the first loop should create one local variable once and then update its value 19,999 times.

Does the variable go out of scope upon reaching repeat? If so, why? It seems like it shouldn't go out of scope until the loop exits.

Discostew

@Zero-Division Look at it from the perspective of a C-like block and FOR loop

{
   int j = 1;
}
--------------------------------------------
for( int i = 0; i < 20000; i++ )
{
   int j = 1;
}

Everything within the block is in a scope local to that block. At the end of the block, all variables initialized within it get wiped from the stack. The FOR statement technically only iterates the statement that immediately follows it, but here that statement is a block, so everything within it is grouped as one from the FOR statement's perspective. As stated before, everything within a block is local relative to what's outside it, and that includes the FOR statement. So the end of the block wipes those variables, and it still does so even when execution wraps back around to the FOR statement.

FUZE puts together the FOR loop a little differently, but the principle remains the same. Look at it like this. You have the FOR statement, the LOOP that begins the block, and REPEAT that ends the block. Treat everything between LOOP and REPEAT as its own local scope.
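To make the C comparison concrete, here's a minimal sketch of the two placements. The function names are made up for illustration, and this only shows C's block-scope rules; it says nothing about how FUZE implements LOOP and REPEAT internally.

```c
/* j is declared inside the loop body, so each pass gets a fresh j. */
int sum_with_block_local(int iterations)
{
    int total = 0;
    for (int i = 0; i < iterations; i++) {
        int j = 0;   /* created and initialized again on every pass */
        j += 1;      /* always 1 here: nothing survives from the previous pass */
        total += j;
    }
    return total;    /* equals iterations */
}

/* j is declared once, outside the loop, so it persists across passes. */
int sum_with_outer_variable(int iterations)
{
    int total = 0;
    int j = 0;       /* created once, before the loop starts */
    for (int i = 0; i < iterations; i++) {
        j += 1;      /* 1, 2, 3, ... */
        total += j;
    }
    return total;    /* equals iterations * (iterations + 1) / 2 */
}
```

Calling sum_with_block_local(4) gives 4, while sum_with_outer_variable(4) gives 10, because only the outer j keeps its value from one pass to the next.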

Zero Division

I get what you're saying is happening, but to my mind the lexical context shouldn't be discarded until the loop ends. At best, we're missing a VM optimization here.

If Smalltalk -- a dynamic language that has to render its entire IDE from high-level dynamic code each and every frame -- can get decent performance out of a loop like this, a language like FUZE which isn't (currently, at least) bogged down by object-oriented message dispatch should be able to do so as well.

Zero Division

Maybe I should also say: it makes sense that there would be a performance hit if the type of the variable "j" were to change inside of the loop (making that go fast involves a VM optimization which Nintendo might not allow, namely polymorphic inline caching). But if the variable has already been allocated within the loop's lexical context and its type doesn't change in the program, it should just be assigned to, rather than being re-initialized.

Jongjungbu

That’s not how a for loop usually works. Each iteration is unique. What you suggest is that the interpreter (or compiler, in other cases) treat the first iteration as a special one where variables are declared and initialized. On every later iteration it would have to skip creating new variables and instead check whether each one already exists from a previous iteration. That would be nice, but it's contrary to how iterations are normally handled.

Zero Division

I don't see the point of doing it the way it is currently unless you want programs to run more slowly. If I'm guessing right, we aren't really talking about one variable. It sounds like the entire block context is being discarded and rebuilt every time through the loop. That doesn't seem right to me, at least not in a dynamic language. There's clearly an available optimization here, because this:

0 to: 19999 do: [ :i |
    | j |
    j := 1
]

takes about the same amount of time to run as

| j |
j := 1
0 to: 19999 do: [ :i |
    j := 1
]

...in my Smalltalk environment, and both are just garden-variety Smalltalk for loops. So I know I'm not crazy. Both consistently execute in between 0.00005 and 0.0001 seconds on my machine; the variance is about the same for both and is mostly caused by other things vying for cycles, like my OS.

Jongjungbu

What you don’t see behind the scenes (and what is more obvious in other languages) is that the body of the “for loop” effectively contains “int j = 1”.
On the second iteration of the “for loop”, this would be an error, because you’ve already declared and initialized this variable as an integer equal to 1. You can’t do it again unless you treat each iteration of the loop as a new scope.
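For anyone who wants to play with the two viewpoints in this thread, here's a rough toy model in C. It's only a sketch: Scope, find_var, create_var and count_creations are all invented for illustration and are not how FUZE (or any real interpreter) is actually built. It counts how often "j" has to be created when the body scope is thrown away on every pass, and when it is kept alive between passes, as suggested above.

```c
#include <string.h>

#define MAX_VARS 8

/* One scope: a tiny name -> value table (hypothetical, for illustration). */
typedef struct {
    const char *names[MAX_VARS];
    int values[MAX_VARS];
    int count;
} Scope;

/* Returns a pointer to the variable's slot, or NULL if it is not in this scope. */
static int *find_var(Scope *scope, const char *name)
{
    for (int k = 0; k < scope->count; k++) {
        if (strcmp(scope->names[k], name) == 0) {
            return &scope->values[k];
        }
    }
    return NULL;
}

/* Adds a variable to the scope; returns 1 on success, 0 if the table is full. */
static int create_var(Scope *scope, const char *name, int value)
{
    if (scope->count >= MAX_VARS) {
        return 0;
    }
    scope->names[scope->count] = name;
    scope->values[scope->count] = value;
    scope->count++;
    return 1;
}

/*
 * Models running "j = 1" inside a loop body `iterations` times and returns
 * how many times j had to be created.
 *   declare_outside:  non-zero means "j = 0" ran before the loop.
 *   reuse_body_scope: non-zero keeps the body scope alive between passes
 *                     (the optimization proposed in this thread);
 *                     zero discards it at the end of every pass.
 */
long count_creations(int iterations, int declare_outside, int reuse_body_scope)
{
    Scope outer = {0};
    Scope body = {0};
    long creations = 0;

    if (declare_outside) {
        create_var(&outer, "j", 0);
        creations++;
    }

    for (int i = 0; i < iterations; i++) {
        if (!reuse_body_scope) {
            body.count = 0;              /* start the pass with an empty local scope */
        }

        /* "j = 1": assign if j is visible, otherwise create it locally. */
        int *slot = find_var(&body, "j");
        if (slot == NULL) {
            slot = find_var(&outer, "j");
        }
        if (slot != NULL) {
            *slot = 1;                   /* plain assignment, no creation */
        } else {
            create_var(&body, "j", 1);
            creations++;
        }
    }
    return creations;
}
```

In this model, count_creations(20000, 0, 0) returns 20000, count_creations(20000, 1, 0) returns 1, and count_creations(20000, 0, 1) also returns 1. The first two match the counts Martin gave; the third is the behaviour Zero Division is asking for.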

faz808

This is very fast - but can it be improved on?
BTW, see if you can spot the two syntax errors...

for y =  1 to 10 loop
 clear()
 for I = 1 to 100000 loop
  point1 = {random(gwidth()),random(gheight()}
  point2 = {random(gwidth()),random(gheight()}
  col = {random(101)/100,random(101)/100,random(101)/100,1}
  line(point1,point2,col)
 repeat
 update()
repeat
sleep(2)

I'm having loads of fun with FUZE. Thanks.

Discostew

@faz808 Missing the close parenthesis on your use of random on the 2 points.

As far as speeding it up, you currently have your point and color variables created on each iteration of I. Just move their creation outside of the loops.

pianofire

@Discostew Just timed this and it improves performance from 5.6 secs to 3.4

faz808

In the mid-eighties Elite for the Spectrum home computer was released. This very successful space game used a simple line routine to draw a dozen or so space ships. I can remember it being a bit jerky, but as it incorporated hidden line removal and only took up about 40 kB of Z80 code this was a bit of an achievement. The secret to hidden line removal is to define all your 3D points in a clockwise direction. The rear-facing polygons will appear anticlockwise, and a simple calculation will enable you to jump over the drawing routine and only plot the clockwise faces.
I've already written a simple spinning wire cube in qbBasic and it runs ok. How easy it will be to convert the qb code to FUZE I'm about to find out. But it will certainly keep me out of mischief for a while!

pobtastic

@faz808 Check out this Pico-8 game video

At some point I might try and make a demo similar to this, I'm sure if Pico-8 can do it then we certainly can!

Spacemario

@pobtastic Holy buckets is that cool! I love the faux reflection effect. I totally bet Fuze can do something like this.