Computed GOTO, in MS BASIC!

Jumping for fun with MS / Commodore / AppleSoft BASIC.

Some BASICs have a really nice feature: computed GOTO, where we can use any numeric value for a jump target, not just a constant. However, these seem to be on the rarer side of things, and MS BASIC, or any of its descendants, like Commodore BASIC or AppleSoft BASIC, is certainly not among them. Computed GOTO is not for the hoi polloi. Or is it?

Here, we’re going to explore ways to modify BASIC by BASIC, and by BASIC alone, using some self-modifying code trickery. The beauty of this is that it should be a rather general solution and much less platform-dependent than, say, a machine language implementation. Endianness may be an issue to be aware of, but otherwise… Here, we do it for a 6502-based system, which is little-endian, but we’ll point out where this matters.

Computed GOTO — What it’s All About

You’re probably familiar with the ways of GOTO and GOSUB: the command is followed by a line number, which serves as a label, and as this executes (boom!) we jump there.

10 PRINT " BASIC IS GREAT! ";
20 GOTO 10

The problem is that it’s always line 10 we’re jumping to. There is ON … GOTO (and ON … GOSUB), which provides a kind of jump table, but this is also rather limiting, as the value we’re jumping on must be a 1-based integer index into a list of consecutive entries.
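For comparison, this is roughly what such a jump table looks like. The line numbers and the variable C are arbitrary choices for illustration; mind how the index can only select from the list of targets spelled out in the program text.

```basic
10 INPUT "CHOICE (1-3)";C
20 ON C GOTO 100,200,300
30 PRINT "NO SUCH CHOICE":GOTO 10
100 PRINT "ONE":END
200 PRINT "TWO":END
300 PRINT "THREE":END
```

If C is 0 or exceeds the number of entries (say, 4), execution simply falls through to line 30, while a negative value stops the program with an error.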

Now have a look at the cornucopia of wonders, which is computed GOTO, where the target can be any numeric value:

10 A=20
20 PRINT "COMPUTED GOTO IS ";
30 GOSUB 100+INT(RND(1)*5)*10
40 PRINT "! ";
50 GOTO A
100 PRINT "GREAT";:RETURN
110 PRINT "FANTASTIC";:RETURN
120 PRINT "WONDERFUL";:RETURN
130 PRINT "FABULOUS";:RETURN
140 PRINT "COOL";:RETURN

Implementing Computed GOTO in BASIC

As we have mentioned rather frequently, MS BASIC stores a tokenized BASIC program as a linked list of the following form:

[Screenshot: a simple program in Commodore BASIC, '10 GOTO 100', with a disassembly of the program in memory (see below).]

In other words, our line of BASIC is stored like this (here on a Commodore PET):

addr  code                semantics

0401  0B 04               link: $040B
0403  0A 00               line# 10
0405  89                  token GOTO
0406  20 31 30 30         ascii « 100»
040A  00                  -EOL-
040B  00 00               -EOT- (link = null)
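
To see these bytes for yourself, a loop over PEEK() in direct mode will do. This is a minimal sketch, assuming a PET with just the one-line program “10 GOTO 100” in memory; 1025 and 1036 are the decimal equivalents of 0x0401 and 0x040C.

```basic
FOR I=1025 TO 1036:PRINT PEEK(I);:NEXT
```

This should list the bytes of the dump above in decimal, i.e., 11, 4, 10, 0, 137, 32, 49, 48, 48, 0, 0, 0.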

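Some crucial observations, all of which can be read off this dump: the link and the line number are stored as 2-byte binary values, low byte first; the keyword GOTO is stored as a single-byte token (0x89, or decimal 137); the jump target is stored as plain ASCII text, including the blank that separates it from the keyword; and a zero byte terminates the line.

Meaning, as there’s little magic about jump targets (they are just plain ASCII text), we should be able to manipulate them in memory. All we have to know is the location of that constant in memory and, maybe, the specific token for GOTO, in order to locate it.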

Mind that we’ll be doing this here for Commodore BASIC, but the solution should be a more general one, provided that the same observations hold for the BASIC in question.

If we enter the same line in AppleSoft BASIC (which starts its programs in memory at 0x801 or decimal 2049, just like the C64) and read it out using PEEK(), we’ll get:

Mind how the blank separating “GOTO” from the target line number has not been stored. Similar to Commodore BASIC the leading blank just after the line number has been ignored, as well, but lists as two separating blanks. Still, there are sufficiently close similarities to work with.

Rewriting Target Line Numbers in Memory

Knowing this, and assuming for the time being that we also know the address in memory, here given in variable “LA”, as in line address (all our variables will start with the prefix “L”), we may rewrite the target line number as provided in variable “LN” by a construct like this:

10 LN$=STR$(LN):LL=LEN(LN$)
20 FOR LI=2 TO LL:POKE LA+LI,ASC(MID$(LN$,LI,1)):NEXT
30 POKE LA+LI,58:REM ":"
40 GOTO00000:

All we have to do is to reserve a few bytes for the target line number, here by “00000:”, which should be enough to hold any legitimate line number (which can’t exceed 5 digits). By dropping a terminating colon immediately after the digits, we don’t have to care about the remainder of the line, as this will be ignored anyway.

E.g., if LN holds the value 20, line #40 will become:

40 GOTO20:00:
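
To get a feeling for what the patched line looks like for targets of various lengths, we may mimic the operation on a string. This is only a sketch: T$ is a hypothetical stand-in for the six reserved bytes, the values in the DATA line are arbitrary, and nothing is poked into memory here.

```basic
10 T$="00000:":REM THE RESERVED BYTES
20 FOR I=1 TO 3:READ LN
30 LN$=STR$(LN):LL=LEN(LN$)
40 PRINT LN;"-> GOTO";MID$(LN$,2);":";MID$(T$,LL+1)
50 NEXT
60 DATA 20,100,63900
```

For these three values, the patched statement reads GOTO20:00:, GOTO100:0:, and GOTO63900:, respectively. In the last case, the new colon lands exactly on the colon that was there before.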

Mind that MID$() starts at a 1-based index and has the general form of,

MID$(<string>, <index>, <length>)

and that “STR$()” will include a leading blank in sign position for any positive numbers.
Hence, our read-poke loop starts at index 2, where we find the first digit:

A$=STR$(100)

READY.
? CHR$(34);A$;CHR$(34)
" 100"

READY.
? LEN(A$)
 4

READY.

? MID$("ABC",2,1)
B

READY.

? MID$(A$,2,1)
1

READY.
█

Adding some measures for sanity and to prevent any intrusions into other bits of code (since we skip the first character, we won’t have to worry about signs), we can pack this into just two lines of BASIC:

1 LN$=STR$(INT(LN)):LL=LEN(LN$):IFLL>6THENLL=6
2 FORLI=2TOLL:POKELA+LI,ASC(MID$(LN$,LI,1)):NEXT:POKELA+LI,58:GOTO00000:

Since BASIC starts any lookup for line numbers at the very beginning, we put this at the beginning, as well, for optimal performance.

Now, all that’s left to do is to find the location of our “GOTO” token in memory. Variable LA will hold the address of this token less one, to compensate for our loop index starting at 2.

Finding a Specific Token in Memory

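For this we need to know a few things as a precondition: the start address of the BASIC program in memory, the number of the line that holds our GOTO (here, line 2), and the token value for GOTO (137 in Commodore BASIC).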

First, we’ll find the line in question, progressing along the line links, just like BASIC does. In order to stay out of the way of any other code, we’ll put this at the very top of the available line numbers (as this will only be run once, the search penalty isn’t really a concern here):

63899 REM INITIALIZE COMPUTED GOTO
63900 LN=2
63905 LA=4*256+1 :REM $401 (PET)
63910 FOR LI=0 TO 32767
63915 LL=PEEK(LA)+PEEK(LA+1)*256
63920 IF LL=0 THEN PRINT "ERROR: LINE";LN;" NOT FOUND.":END
63925 IF PEEK(LA+2)+PEEK(LA+3)*256=LN THEN 63935
63930 LA=LL:NEXT
63935 PRINT "FOUND LINE";LN;" AT";LA

First of all, we’ll be using the same variables as above, namely LA, LI, LL, and LN (by this also allocating these variables at the very beginning of the variable list, which provides a small speed boost to our GOTO routine).

Hopefully, we found our line, thus arriving at line 63935. Now it’s a matter of finding our token, for which we’ll reuse variable LL, while variable LA will be reused as a pointer into the program text of this line, which starts at offset 4. Variable LN now serves as the program byte to be inspected.

63935 LL=137:FOR LA=LA+4 TO 32767
63940 LN=PEEK(LA):IF LN=LL THEN LA=LA-1:RETURN :REM FOUND THE TOKEN
63945 IF LN>0 THEN NEXT
63950 PRINT "ERROR: 'GOTO' NOT FOUND.":END

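Step-by-step: the first loop follows the line links from the start of BASIC, comparing the 2-byte line number at offset 2 to LN, and bails out with an error if it meets a null link (the end of the program). The second loop then scans the text of the line found, starting at offset 4, until it meets either the token for GOTO (137), in which case we return with LA set to the address of the token less one, or the zero byte at the end of the line, which is an error.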

Knowing the Start of BASIC

As already mentioned, a BASIC program starts in memory at 0x0401 on any Commodore PET, at 0x0801 on the C64, and generally at 0x0801 on the Apple ][ with AppleSoft BASIC, as well. (I think there’s also an option to load a BASIC program to 0x6001 in high memory on the Apple ][, but I’m really not savvy enough about this system to provide serious advice.)

However, there are also systems where the start of BASIC varies depending on the memory configuration, as is the case with the VIC-20. How can we deal with this?

MS BASIC has a system pointer, “TXTTAB”, which points at the very start of the BASIC program text in memory. This is found in Commodore BASIC at the following locations:

Version   BASIC 1               BASIC 2.0–4.0   BASIC V. 2.0   BASIC V. 2.0
System    PET 2001, “Old ROM”   all PETs        VIC-20         C64
TXTTAB    0x7A–0x7B             0x28–0x29       0x2B–0x2C      0x2B–0x2C
decimal   122–123               40–41           43–44          43–44

Therefore, we can determine the start of BASIC at runtime, as in:

LA=PEEK(43)+PEEK(44)*256:REM COMMODORE BASIC V.2

(However, as this may introduce further complications, compare the PET, we go with a hard-coded address here, assuming that you already know the system configuration this is for.)
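
Should you prefer the pointer over the hard-coded address, the first line of the initialization routine in the complete listings might be changed along these lines. This is a sketch for Commodore BASIC V.2 (VIC-20 and C64) only; going by the table above, a PET with BASIC 2.0 to 4.0 would read locations 40 and 41 instead.

```basic
63900 LN=2:LA=PEEK(43)+PEEK(44)*256:LN$=""
```

On a C64 this evaluates to 2049 (0x0801), the same value as the hard-coded 8*256+1.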

Putting it All Together

Putting these parts together, we arrive at all we need. We simply jump to line 1 and this will forward us to whatever line number happens to be in variable “LN” (as in line number), which works for both GOSUB and GOTO (and GO TO, as well).

Here for the PET:

0 GOSUB63900:GOTO3
1 LN$=STR$(INT(LN)):LL=LEN(LN$):IFLL>6THENLL=6
2 FORLI=2TOLL:POKELA+LI,ASC(MID$(LN$,LI,1)):NEXT:POKELA+LI,58:GOTO00000:
3 REM START HERE

10 PRINT ".COMPUTED GOTO INITIALIZED."
15 PRINT
20 FOR LN=100 TO 120 STEP 10
30 PRINT "BASIC IS ";:GOSUB 1:PRINT "! ";
40 NEXT
50 LN=20:GOTO 1
100 PRINT "GREAT";:RETURN
110 PRINT "FANTASTIC";:RETURN
120 PRINT "COOL";:RETURN

63899 REM INITIALIZE COMPUTED GOTO
63900 LN=2:LA=4*256+1:LN$=""
63905 FORLI=0TO32767:LL=PEEK(LA)+PEEK(LA+1)*256
63910 IFLL=0THENPRINT"ERROR: LINE";LN;" NOT FOUND.":END
63915 IFPEEK(LA+2)+PEEK(LA+3)*256=LNTHEN63925
63920 LA=LL:NEXT
63925 LL=137:FORLA=LA+4TO32767:LN=PEEK(LA):IFLN=LLTHENLA=LA-1:RETURN
63930 IFLN>0THEN NEXT
63935 PRINT"ERROR: 'GOTO' NOT FOUND.":END
[Screenshot: our program running on an emulated PET.]
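
A quick way to check that the initialization found the right spot is to have the program report what LA points at. The following replacement for line 3 is merely a sketch for such a check: since LA holds the address of the token less one, the byte at LA+1 should read 137 on a Commodore machine.

```basic
3 PRINT "GOTO TOKEN AT";LA+1;"HOLDS";PEEK(LA+1)
```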

This may be used as a template: simply replace the demo program at lines 10 to 120 with your own code.
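
As a sketch of how the template might be put to use, here is a small menu dispatcher that could replace the demo at lines 10 to 120. The variable I, the prompt, and the mapping of choices to lines 100 to 120 are assumptions for illustration; lines 0 to 3 and the initialization routine from line 63899 onwards stay as they are.

```basic
10 INPUT "CHOICE (1-3)";I
20 IF I<1 OR I>3 THEN 10 :REM ONLY JUMP TO LINES THAT EXIST
30 LN=100+(INT(I)-1)*10:GOSUB 1 :REM COMPUTED GOSUB
40 PRINT:GOTO 10
100 PRINT "GREAT";:RETURN
110 PRINT "FANTASTIC";:RETURN
120 PRINT "COOL";:RETURN
```

The range check in line 20 matters, since a computed jump to a line that doesn’t exist will stop the program with an error, just like a regular GOTO would.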

And here is a ready-to-use version for the C64:

0 GOSUB63900:GOTO3
1 LN$=STR$(INT(LN)):LL=LEN(LN$):IFLL>6THENLL=6
2 FORLI=2TOLL:POKELA+LI,ASC(MID$(LN$,LI,1)):NEXT:POKELA+LI,58:GOTO00000:
3 REM START HERE
10 PRINT ".COMPUTED GOTO INITIALIZED."
15 PRINT
20 FOR LN=100 TO 120 STEP 10
30 PRINT "BASIC IS ";:GOSUB 1:PRINT "! ";
40 NEXT
50 LN=20:GOTO 1
100 PRINT "GREAT";:RETURN
110 PRINT "FANTASTIC";:RETURN
120 PRINT "COOL";:RETURN
63899 REM INITIALIZE COMPUTED GOTO
63900 LN=2:LA=8*256+1:LN$=""
63905 FORLI=0TO32767:LL=PEEK(LA)+PEEK(LA+1)*256
63910 IFLL=0THENPRINT"ERROR: LINE";LN;" NOT FOUND.":END
63915 IFPEEK(LA+2)+PEEK(LA+3)*256=LNTHEN63925
63920 LA=LL:NEXT
63925 LL=137:FORLA=LA+4TO32767:LN=PEEK(LA):IFLN=LLTHENLA=LA-1:RETURN
63930 IFLN>0THEN NEXT
63935 PRINT"ERROR: 'GOTO' NOT FOUND.":END

Or in lower-case, ready to be pasted to VICE:

0 gosub63900:goto3
1 ln$=str$(int(ln)):ll=len(ln$):ifll>6thenll=6
2 forli=2toll:pokela+li,asc(mid$(ln$,li,1)):next:pokela+li,58:goto00000:
3 rem start here
10 print ".computed goto initialized."
15 print
20 for ln=100 to 120 step 10
30 print "basic is ";:gosub 1:print "! ";
40 next
50 ln=20:goto 1
100 print "great";:return
110 print "fantastic";:return
120 print "cool";:return
63899 rem initialize computed goto
63900 ln=2:la=8*256+1:ln$=""
63905 forli=0to32767:ll=peek(la)+peek(la+1)*256
63910 ifll=0thenprint"error: line";ln;" not found.":end
63915 ifpeek(la+2)+peek(la+3)*256=lnthen63925
63920 la=ll:next
63925 ll=137:forla=la+4to32767:ln=peek(la):ifln=llthenla=la-1:return
63930 ifln>0then next
63935 print"error: 'goto' not found.":end
[Screenshot: our program running on an emulated C64 (VICE).]

And this version (program start at 0x801, GOTO = 171) should work for AppleSoft BASIC (untested):

0 GOSUB63900:GOTO3
1 LN$=STR$(INT(LN)):LL=LEN(LN$):IFLL>6THENLL=6
2 FORLI=2TOLL:POKELA+LI,ASC(MID$(LN$,LI,1)):NEXT:POKELA+LI,58:GOTO00000:
3 REM START HERE
10 PRINT ".COMPUTED GOTO INITIALIZED."
15 PRINT
20 FOR LN=100 TO 120 STEP 10
30 PRINT "BASIC IS ";:GOSUB 1:PRINT "! ";
40 NEXT
50 LN=20:GOTO 1
100 PRINT "GREAT";:RETURN
110 PRINT "FANTASTIC";:RETURN
120 PRINT "COOL";:RETURN
63899 REM INITIALIZE COMPUTED GOTO
63900 LN=2:LA=8*256+1:LN$=""
63905 FORLI=0TO32767:LL=PEEK(LA)+PEEK(LA+1)*256
63910 IFLL=0THENPRINT"ERROR: LINE";LN;" NOT FOUND.":END
63915 IFPEEK(LA+2)+PEEK(LA+3)*256=LNTHEN63925
63920 LA=LL:NEXT
63925 LL=171:FORLA=LA+4TO32767:LN=PEEK(LA):IFLN=LLTHENLA=LA-1:RETURN
63930 IFLN>0THEN NEXT
63935 PRINT"ERROR: 'GOTO' NOT FOUND.":END

Other BASICs — beyond Microsoft

Similar solutions should be possible for other versions of BASIC which store their programs in a mix of tokenized commands and plain ASCII. However, some adjustments must be considered. E.g., Apple’s own INTEGER BASIC stores lines more like an array, with a single byte in front giving the offset length of the line (including this offset byte) where MS BASIC has the 2-byte link address, but still followed by a 2-byte binary integer representation of the line number. BUSINESS BASIC for the Apple /// is similar, but where INTEGER BASIC terminates a line by the value 0x01 as an end-of-line marker, BUSINESS BASIC uses 0x00, like MS BASIC. Still, this will allow us to traverse the code in search of our line number, and we should be able to identify the token for GOTO, which is really all we need.

offset  INTEGER BASIC   BUSINESS BASIC

0       offset length   offset length
1–2     line number     line number
3…      BASIC payload   BASIC payload
EOL     0x01            0x00

We’ll just have to accommodate for a read of just a single byte for the offset, and we’ll add this to the address of the current line as we advance to the next line. Something along the lines of (here assuming a little-endian system and that a line offset of zero marks the end of the program),

63905 FORLI=0TO32767:LL=PEEK(LA)
63910 IFLL=0THENPRINT"ERROR: LINE";LN;" NOT FOUND.":END
63915 IFPEEK(LA+1)+PEEK(LA+2)*256=LNTHEN63925
63920 LA=LA+LL:NEXT

and in line #63925 (assuming that the valid memory range for a BASIC program doesn’t exceed 0x7FFF or decimal 32767):

FOR LA=LA+3 TO 32767

Some BASICs, like INTEGER BASIC, may use subscripted strings, instead of MID$(), or use a lower limit for the maximum line number, as they treat line numbers as signed integers (like BBC BASIC).

Moreover, using integer variables for any of this may improve performance on systems that make proper use of them (unlike Commodore BASIC, where we would experience an actual speed penalty, as these values would have to be converted to floating point, back and forth).

However, unlike the versions of MS BASIC we just inspected, other BASICs may allow for a variable start address in memory. Figuring this out is left as an exercise to the reader.

— And by this, we close for today. —