;************************************************************************** ;* * ;* Fast Z-80 Unsqueezer * ;* V1.0 1/10/86 * ;* -Steven Greenberg * ;************************************************************************** ;(FOR CP/M 2.2+, Z-80 only) ; ; This program unsqueezes standard (Greenlaw style) squeezed programs. ;This is the original (no frills) version; simply specify the filename ;to be unsqueezed on the command line. The result filename will be gen- ;erated automatically and written to the default drive. ; ; The difference between this unsqueezer and others lies in its unique ;architecture. This results in speed increases ranging from a factor of ;about 2 (compared to fastest assembly coded programs previously avail- ;able) on up to factors greater than 5 (compared to standard C-coded ver- ;sions). Also note that the current .COM file is less than 1K in length. ; ; The architecture, very briefly, is as follows. First the "dictionary" ;info contained in the squeezed file is "compiled" into a decoding pro- ;gram, which sits in memory just above the .COM file. Since each squeezed ;file generates its own unique program, the program is a highly efficient ;subroutine for performing the actual unsqueeze operation. ; ; Thanks to Jeff D. Wilson for many worthwhile suggestions. ; ; Note: This version requires a minimum TPA of about 28k, but input & ; output buffer sizes could be easily changed by changing the ; equ's for "ibufsz" & "obufsz" at the end of this prog & re- ; assembling it. The next version will perform a dynamic as- ; signment of these buffers based on available memory. I have not ; done tests on the relative speed / memory tradeoffs. This ver- ; sion has a 2.25k input buffer & a 16k output buffer. It also ; automatically allocates a worst case size of 8K for "codtbl"; ; the next version will probably allocate only the exact amount ; needed for the file being un-squeezed. ; ; This program was assembled on an Z80ASM SLR SuperFast Relocating Macro ; Assembler (and they mean fast! like 6000 lines/minute!). No macros are ; used however, or anything else too assembler-specific. ;_________________________________________________________________________ ; ; SQUEEZED FILE FORMAT: Here is the format of the mysterious "squeezed" ;file, originally defined by Richard Greenlaw back around '79. ; ; BYTES 0,1: 76,FF always. This "identifies" the file as squeezed. ; ; BYTES 2,3: 16-bit checksum, lo-byte first. It is the sum of all ; bytes as they appear in the GENERATED file, modulo 2^16. ; ; BYTES 4, ... 4+n: Variable length filename field, containing the name ; of the original file before it was squeezed. The end ; of this field is defined by a zero byte. The "." char ; separating the filename from its .ext is included as ; as a char, and its position may float. ; ; BYTE 5+n: Zero, as mentioned above ; ; BYTE 6+n,7+n: The length of the "dictionary", which follows. The ; length of the dictionary is given as the total num- ; ber of "nodes". Each node is comprised of two pairs ; of two bytes each. Thus the dictonary consists of ; the next (4 * l) bytes, where "l" was the length ; specified. According to Huffman theory, the maximum ; number of nodes ever needed will be twice the #of ; different chars represented, in this case (257 * 2). ; [256 unique bytes plus a special end-of-file char]. ; A further description of what these 2 pairs of bytes ; (nodes) mean is provided later. ; ; BYTES 8+n, ... (8+n)+(4*l) -1 : The dictionary, as described above. ; ; REMAINING BYTES: Squeezed code; to be thought of as a continuous bit ; stream where byte boundaries are insignificant. Chars ; are defined by codes of varying lengths of bits (The ; more frequent the char the shorter the code). The ; bit stream is LS bit thru MS bit, then move on to ; next byte. The end of the file cannont be detected ; until the special end-of-file code "SPEOF" is decoded. ; ; ADDITIONAL NOTE: Multiple contiguous occurrences of the identical char ; are encoded by the 3-byte sequence: ; ; 90H ; ; When "char" is first decoded, it is outputed normally. ; When the 90H is decoded, a flag is set, the next char ; is decoded for a count val, and the first char is re- ; peated an additional "count-1" times. Encoding the byte ; 90H itself is achieved by the sequence 90H 00H; i.e. ; when the count is 0 send a 90H, the byte before the ; 90H was irrelevant. ;___________________________________________________________________________ ; ;--- Opcode equates --- ; jpop: equ 0C3h ;opcode for "jp" instruction ;--- Ascii equates --- ; cr equ 0DH lf equ 0AH ;--- CP/M address equates --- ; dfcb equ 5CH ;default file control block ddma equ 80H ;default DMA address cpm equ 0000H ;warm boot jump address bdos equ 0005H ;BDOS entrypoint ;--- BDOS function equates --- ; prtstr: equ 9 ;print string to console open: equ 15 ;open file close: equ 16 ;close file srchf: equ 17 ;search for file erase: equ 19 ;erase file read: equ 20 ;read file (sequential) write: equ 21 ;write file (sequential) make: equ 22 ;make file setdma: equ 26 ;set DMA address org 100h ;Program entry addr entry: jp start db 'Copyright (c) Steven Greenberg 1/10/86 201-670-8724; ' db 'may be freely reproduced for non-profit applications.' start: ld hl,(bdos+1) ;get top of memory ld a,h ;for sizing (below) ld sp,hl ;set sp scf ;set up for Z-80 / 8085 test (8080 ???) jr c,$+6 ;if processor is a Z-80, inst. will branch ;8085 will perf a RIM, then an 04 ($+6 becomes 04) ; which is just an INC B, then continue with jp notz80 ; this jp instr. nop ;so the z-80 branch will be correct sub a,eobfhi+2 ;(this is "$+6") jr nc,enough ;mem size test ld de,lakmem ;"Not enough memory..." jp fatal ;(fatal error) notz80: ld de,wrngup ;"Program requires Z-80 processor" jp fatal ;(equally fatal) enough: ld de,logo ;version#, etc call mesage ; ld de,dfcb ;open input file spec'd on command line ld c,open ; call bdosav ; or a ; jp z,clone ;br if successful ld de,err1 ;else, "Input file not found" jp fatal ; ;_________________________________________________________________________ ; ;Before going too much further, take this opportunity ; to "clone" a 16-byte template of code into memory ; 512 times. This forms the skeleton for the compiled ; block of code "codtbl". Various specific instructions ; and data will overwrite sections of this template ; after the dictionary info is read. clone: ;code to duplicate the 16-byte template 512 times ld hl,tmplat ;first xfer one copy to the beg of "codtbl" ld de,codtbl ; ld bc,16 ; ldir ; ld hl,codtbl ;now copy it 511 more times. ld bc,511*16 ;(note de already points to "codtbl+16") ldir ;that does it! ;_________________________________________________________________________ ; call reload ;Now load up the input buffer. The input buffer, ; the output buffer, and "codtbl" are all page ; aligned and of page multiple lengths. There are ; no other criteria for the lengths of the 2 ; bfrs, except that the input buffer should have a ; minimum length of 2K plus 1 more page. This ; guarantees that the entire dictionary (plus misc. ; header info) will be read in on the 1st pass, ; simplifying the program. ld a,(HL) ;"reload" leaves hl pointing at beg of "ibuf" cp 76H ;check for "Squeezed File Header" code 76H,FFH jr nz,ntsqzd ;br if not a squeezed file inc L ;note buffer starts on a page boundry inc (HL) ;Chk for FF [clobber it along the way] ntsqzd: ld de,nsqmsg ;meanwhile , prep for poss err msg jp nz,fatal ;fatal "not squeezed" condition ld de,arrow ;"---->" call mesage ;ok, print an arrow ld hl,(ibuf+2) ;get the 16 bit checksum and save it for later ld (chksum),hl ;goes there ld hl,ibuf+3 ;init pointer past 76FF and 2 byte checksum (-1) ld ix,ofcb+1 ;init pointer to filename area of output FCB eatlp: inc l ;filename << 256 chars! ld a,(hl) ;eat up the file name or a ;a zero byte indicates end of filename jr z,ateit ;br when that is encountered cp '.' ;check for name / ext division char jr z,isdot ;br when encountered ld (ix+0),a ;else copy filename char to output FCB inc ix ;and incr that pointer jp eatlp ;continue isdot: ld ix,ofcb+9 ;when "." is encountered, skip to the ext bytes ;of the output FCB. (any remaining non-ext bytes ;were init'd to blank). Do not copy the "." to ;the output FCB. jp eatlp ;continue ateit: ld (hl),'$' ;clobber the zero with a '$' for blw func call push hl ;but first save current pointer ld de,ibuf+4 ;beg of file name call mesage ;print the filename to the console ld de,crlf ;a cr/lf sequence call mesage ; pop hl ;now back to business... ld de,ddma ;"srchf" func outputs stuff to the DMA addr! ld c,setdma ; redirect this to the unused default DMA area call bdosav ; ld de,ofcb ;see if output file already exists ld c,srchf ; call bdosav ; or a ; jp m,ok3 ;br if not ld c,erase ;if so, erase the old one (*** prompt?? ***) call bdosav ; ok3: ld c,make ;in either case, make the new file call bdosav ; ld c,open ;and open it call bdosav ; or a ; jp p,maktbl ;err cond check ld de,err2 ;"File open error" jp fatal ; ;________________________________________________________________________ ; ; Now create "codtbl" by overwriting certain sections of the template ;created above. The dictionary contains 4 byte nodes- these are converted ;into 16 byte code segments. The max length of the original dictionary is ;(257*4) bytes or about 2K; corresponding to a max "codtbl" length of 8K. ; ; NODE DEFINITION: As mentioned above, a "node" consists of two pairs ; of bytes. The first pair corresponds to a zero bit, the latter ; to a "1". To decode a character, we start at node #0. A bit is ; pulled off the bit stream. We then use the 1st or 2nd byte pair ; depending on the bit value. The byte pair takes on 1 of 2 forms- ; "nn FF" or "xx x0". The "nn FF" type is a terminal node- it means ; we have our next output value- that value specifically being the ; 1's complement (makes it more mysterious!) of "nn". If the node ; is of the second type, it is a pointer to another node (an absol- ; ute offset from the beg of the dictionary in terms of node#, must ; be multiplied by 4 for a byte offset. It has a max value of 513, ; and is expressed as a 16-bit #, lo-byte first). In this case we ; go to that node, pull another bit off the input stream, and con- ; tinue the process. ; ; There is actually a 3rd node type, which just comes up once. Its ; form is "FF FE"; it is a special end-of-file marker called "speof". ; ; HOW IT WORKS: Each node is converted into a 16-byte (actually 13 plus ; 3 nop's) series of instructions. These instructions actually per- ; orm the unsqueezing operation. The whole block of code starts at ; "codtbl" (which is also the entrypoint). Each 13 byte "node code" ; consists of a 7 byte header. The header shifts out the next bit ; from reg "b", then conditionally branches to the first or second ; half of the remaining code (3 bytes per half, 7+3+3 = 13). The 3 ; bytes in each half are either the 2 instructions "ld a," fol- ; by "ret" (terminal node) or the single instruction "jp ". ; All calls to "codtbl" eventually hit a terminal node and perf a ; normal return. The only exception is the special end-of-file node ; which compiles to "jp speof"; on this particular return the stack ; is manually adjusted to compensate for the lack of a "ret" instr. ; maktbl: ; --- Create the "node code" table --- inc l ;now points one past the filename eof char ld e,(hl) ;get #of nodes (next word after filename eof) inc l ; ld d,(hl) ;hi byte of same inc l ; ld a,d ; sub 3 ;an additional file validity check- although the ; #of nodes could be in theory as high as 0201 or ; so, it should never approach 0300 or higher jp nc,ntsqzd ;if this happens, assume it is not a squeezed file ;hl indexes thru source dictionary (already ; initialized), & HL is current dest pointer ; (indexes thru "codtbl"). de is initialized ; to the # of nodes ; and is decr'd to 0. exx ;init some constants ld DE,10 ;used for incrementing HL ld HL,codtbl+7 ;init HL itself exx ; ;remember, the whole "header" code and some other ; instructions are there already (from when ; "template" was duplicated). Only specific details ; need now be filled in. nodelp: call makhaf ;make the first (0) half-node call makhaf ;2nd (1) half-node exx ;source pointer has already been incr'd 4 ;times, as desired. Dest pointer has only ;been incr'd 6 times, however. add HL,DE ;incr by 10 more, taking care of that exx ; dec de ;loop counter ld a,d ; or e ; jp nz,nodelp ;continue till done with all defined nodes ;_____ ; jp run ;go run! hl is is ready, pointing to the first ; byte of squeezed code ;_________________________________________________________________________ ; makhaf: ;Create a "half-node" ld c,(hl) ; } inc hl ; } ld a,(hl) ; } get a byte pair from the dictionary inc hl ; } or a ;if it is negative, it is "terminal" jp m,termod ;br if that is the case ;else create code for one half-node of the ;non-terminal variety (byte pair in a,c) ;multiply pointer by 16 (bytes/node in "codtbl") sla c ; } rla ; } sla c ; } rla ; } mult by 16 sla c ; } rla ; } sla c ; } rla ; } add cdtblh ;add offset to beg of "codtbl", (page aligned) ld b,a ;now bc has the jmp addr altent: push bc ;save it exx ;switch to dest pointers ld (HL),jpop ;insert the "jp" opcode inc L ;remember "codtbl" is page aligned pop BC ;get addr back ld (HL),C ;jump addr, lo inc L ; ld (HL),B ;jump addr, hi inc L ; exx ;back to source pointers ret ;thats all ;----- termod: ;create a half-node of the terminal variety cp a,0FEH ;check for special EOF terminal node jr z,speof ;br for that unique case ld a,c ;else this byte is the complement of rtn'd val cpl ; exx ;switch to dest pointers inc L ;only the 2nd of 3 bytes need be inserted ld (HL),a ;put it in inc L ; inc L ;but still make sure HL gets incr'd 3 times exx ; ret ;that's all speof: ld bc,done ;special eof returns to a special address in ;mainline code, rather than using "ret". ;stack is adjusted accordingly there. jp altent ;use convenient code subsection from above; ;___________________________________________________________________ ; tmplat: ;The following 3 instructions form the header code ;for every node. They are identical for every ;node (since the jump is relative). srl b ;shift out next bit call z,refill ;refill reg when empty jr c,bitis1 ;if bit is "1", execute 2nd half of node ;After the header code gets executed, one of two ;halves of the remainder of the node gets exec- ;uted. Which half depends on the bit shifted out ;above ("0" for the first half, "1" for the 2nd). ; ;Each half-node has two possible forms. The term- ;inal form loads an appropriate value and returns. ;The non-terminal form jumps to the header of another ;node. This 16-byte "node-code template" assumes the ;former case by default, since 2 of 3 bytes in that ;case are fixed (only the value need be inserted). ;If it turns out to be the latter case, all 3 bytes ;will be overwritten with a "jmp" opcode plus an addr. bitis0: ld a,00H ;(00H gets replaced with actual val to be rtn'd) ret bitis1: ld a,00H ;2nd half of the node, likewise ret nop ; } nop ; } so the template is exactly 16 bytes long nop ; } ;_________________________________________________________________________ ; refill: ;code to refill reg b with the next byte inc l ; jr z,posrld ;if l is zero, we may be at end of buffer cont: ld b,(hl) ;else get next byte ;Now pre-shift out the next bit, shifting in ;a "1" from the left. Since the leftmost bit ;in the reg is a guaranteed "1", testing the ;zero stat of the reg is a necesssary and suf- ;ficient condition for determining that all the ;bits in the reg have been used up (see header ;code for "tmplate"). ; ;The only things to be careful of is that the ;the last bit is NOT used, and that the bit now ;in the C flag IS used upon return from this subr. ; scf ;to shift in the flag bit rr b ;do that; also shift out real bit as described ret ;that's it posrld: ;inc hi-byte... inc h ;chk if time to reload w/ additional pages of data ld a,eibfhi ;(hi byte of the loc 1 past the end) cp h ; call z,reload ;reload if necessary (resets hl appropriately) jp cont ; ;------------------------------------------------------------------------- ; run: ;main loop to perform the unsqueeze ; exx ;-- initializations --- ld HL,obuf ;the alt regs are now used for output pointers ld BC,0 ;HL is the output pointer; C always has a copy exx ;of the prev char output, B is a "repeat flag" dec hl ;init to point to first byte of squeezed code-1 ld de,0 ;init running checksum accumulator to zero ld b,d ;and init b to zero so first call will immed- ;iately call in the actual first byte mainlp: call codtbl ;unsqueeze a char call send ;output it to the output buffer jp mainlp ;and repeat "forever" (see "speof" above") ;------------------------------------------------------------------------- ; reload: push af ;relaod the input bfr; reset hl to the beg of it push bc ;this routine assumes an input buffer starting push de ; on a page boundry & of page multiple length ld b,ibufsz ;loop counter- length of input buffer in pages ld d,ibufhi ;beg of buffer (hi) rldlp: ld e,0 ;lo byte of current dma call rdsec ;read in 128 bytes (1/2 page) jr nz,rldrtn ;(return if eof enecountered) ld e,80H ;to read in the next half page call rdsec ;do that jr nz,rldrtn ;as above inc d ;next page djnz rldlp ;loop till done rldrtn: pop de ;restore regs pop bc ; pop af ; ld hl,ibuf ;reset input pointer back to the beg... ret ;... and return ;----- rdsec: ;subr for abv- rd 128 bytes to mem starting at hl push de ;save dma addr before clobbering it with an fcb ld c,setdma ;set dma func call bdosav ; ld de,dfcb ;input FCB ld c,read ; call bdosav ;readd a sector pop de ;restore dma to orig dma addr or a ;set non-zero status if end-of-file encountered ret ; ;------------------------------------ done: inc sp ;when "speof" is encountered, a jump to here is inc sp ; made to exit "codtbl", rather then the normal ; return instr; adjust the stack immediately. ld a,(chksum+0) ;make sure the checksum checks out cp e ;lo-byte jr nz,nfg ;br if nfg ld a,(chksum+1) ;likewise cp d ; jr z,cksmok ;ok! nfg: ld de,chkerr ;"Checksum error detected" call mesage ; report the warning, but let the guy have his ; anyway, for what's its worth. cksmok: exx ;switch to output set of registers and a ;clear carry ld DE,obuf-7FH ;the total #of bytes generated should always be a ; multiple of 128. This assumption is not made, ; however, as it may not be true if the file was ; squeezed on non-CP/M systems. sbc HL,DE ;compute #of bytes to write- specifically current ; pointer value minus start of buffer addr. To ; this we add 7FH, (done by adding to the start ; addr abv). If the byte count was in fact a mult- ; iple of 128, this has no effect; otherwise it ; makes sure the final sector gets written sla L ; rl H ;divide #of bytes by 128 to determine #of sectors ld B,H ;use b as the counter call wrtout ;will write "b" sectors to the output file exx ;back to normal set of regs ld de,ofcb ;close the output file ld c,close ; call bdosav ; ld de,dfcb ;likewise the input file ld c,close ; call bdosav ; jp cpm ;Voila! ;------------------------------------------------------------------- ; wrtout: ;Write "b" 128 byte sectors to the output file ld de,obuf ;init dma addr to the beg of the output bfr wrtlp: ld c,setdma ;set dma to there call bdosav ; push de ;save that addr ld de,ofcb ;spec the output file ld c,write ;write a sec call bdosav ; pop de ;addr as saved abv djnz nexsec ;decr counter & continue if not done ret ; nexsec: ld e,80H ;else incr by 1/2 page (bfrs page aligned) ld c,setdma ; call bdosav ; push de ;save dma pntr ld de,ofcb ;output FCB ld c,write ;write another sec call bdosav ; pop de ;get back pointer inc d ;inc hi-byte, 0 the lo to effect another 80H incr ld e,0 ; djnz wrtlp ;decr sector counter; continue till done ret ; ;------------------------------------------------------------------- ; bdosav: push bc ;bdos call w/ all regs saved (including alts) push de ; [necessary?]; ix,ix and af are not saved push hl exx push bc push de push hl exx call bdos exx pop hl pop de pop bc exx pop hl pop de pop bc ret ;___________________________________________________________________ ; fatal: ld bc,cpm ;for fatal errors- put 0000 on the stack, push bc ; then the ret below will return to CP/M mesage: push bc ;de is pointing at a relevant message ld c,prtstr ;print string call bdosav ; pop bc ; ret ; ;___________________________________________________________________ ; send: ;snd char to the output bfr, plus related processing exx ;alt regs used for all output processing srl B ;if reg is "1", repeat flag is set (note- ; clears itself automatically) jr c,repeat ;go perf the repeat cp 90H ;else see if char is the repeat spec jr z,setrpt ;br if so ld C,a ;else nothing special- but always keep call out ;else just output the char; exx ;back to normal regs ret ; setrpt: inc B ;set repeat flag; count val will come as next byte exx ;(note- don't clobber C with the "90H"; it still ret ; has the prev char- the one to be repeated) repeat: or a ;repeat flag was previously set- current jr z,snd90h ; byte in a is a count val. A zero count is dec a ; a special case which means send 90H itself. ld B,a ;count-1 goes in b (-1 since orig occurrence ld a,C ; of the byte was processed). Now put byte in a. again: ; } call out ; } repeat B occurrences of byte in a djnz again ; } (note- lvs B, the rpt flag, 0 as desired) exx ;restore regs & rtn ret ; snd90h: ld a,90H ;special case code to send the byte 90H itself call out ; exx ; ret ;(90H "squeezes" int 3 bytes!) out: ;output char in a directly to the output buffer exx ; } ld c,a ; } add e ; } ld e,a ; } accumulate the 16 bit checksum jr nc,nocary ; } inc d ; } nocary: ld a,c ; } exx ; } ld (HL),a ;put it into the next avail position inc L ;incr pointer ret nz ;ret if not passing a page boundry inc H ;incr pointer high byte, and check limit ld L,a ;use L, which we know is 0, for temp storage ld a,eobfhi ;limit cp H ;check ld a,L ;but first restore regs ld L,0 ; ret nz ;ret if limit not reached push af push bc ld b,obufsz*2 ;number of 128 byte secs to write call wrtout pop bc pop af ld hl,obuf ret ;_________________________________________________________________________ ; logo: db cr,lf,'FU Fast Unsqueezer v 1.0$' err1: db cr,lf,'Input file not found.$' err2: db cr,lf,'File open error.$' arrow: db cr,lf,'----> $' lakmem: db cr,lf,'Not enough memory for this version.$' nsqmsg: db cr,lf,'Not a squeezed file.$' chkerr: db cr,lf,'Checksum error detected.$' wrngup: db cr,lf,'Program requires Z-80 processor.$' crlf: db cr,lf,'$' ofcb: db 0,' ',0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 chksum: ds 2 pff equ $+00FFH ;to get 1st available page (on a page codtbl equ pff and 0FF00H ; boundry) after the prog, add FFH to ; the last addr and "and" rslt w/ FF00H ibufsz equ 9 ;(pages) ;minimum input buffer size is 9 pages to ; guarantee that the max possible diction- ; ary size (512 bytes plus overhead) will ; be read be read in the first pass. obufsz equ 64 ;(pages) ;output buffer can be any page multiple ; length (note it too is also page ; aligned). Increases beyond 64 pages ; (16k) may do little to improve perf. ibuf equ codtbl+(16*512) ; } define the beginning of the 2 buffers obuf equ ibuf+(ibufsz*100H) ; } mentioned abv (& the end of the 2nd) eobuf equ obuf+(obufsz*100H) ; } using the equ's from just above cdtblh equ high codtbl ; } ibufhi equ high ibuf ; } high bytes of the beginning addresses eibfhi equ high obuf ; } of the buffers just defined eobfhi equ high eobuf ; } end ative, it is "terminal" jp m,termod ;br if that ret nz ;ret if limit not reached push af push bc ld b,obufsz*2 ;nu