The only way I've been able to get any kind of readable ASCII text
from the .tif files is to do the following for each one:
convert -density 1200 -resize 40% xaaa.tif -density 1 xaaa120040.tif
Then OCR it with IrfanView with the KADMOS plugin installed.
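Since the conversion has to be repeated for every page, it could be scripted. Here is a minimal sketch in Python, assuming ImageMagick's convert is on the PATH and the scans are in the current directory. The helper names (output_name, build_command, convert_all) are made up for illustration, and the argument order simply mirrors the command above. The OCR step itself is still done by hand.

```python
import subprocess
from pathlib import Path

DENSITY = "1200"   # same values as the manual command
RESIZE = "40%"
SUFFIX = "120040"  # tag appended to the output name, as in xaaa120040.tif


def output_name(src):
    """Map xaaa.tif to xaaa120040.tif, the naming used in the manual command."""
    p = Path(src)
    return str(p.with_name(p.stem + SUFFIX + p.suffix))


def build_command(src):
    """Return the convert invocation for one scan as an argument list."""
    return ["convert", "-density", DENSITY, "-resize", RESIZE,
            str(src), "-density", "1", output_name(src)]


def convert_all(directory=".", dry_run=True):
    """Process every .tif in directory; with dry_run=True only print the commands."""
    for src in sorted(Path(directory).glob("*.tif")):
        if src.stem.endswith(SUFFIX):
            continue  # skip files produced by an earlier run
        cmd = build_command(src)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    convert_all(dry_run=True)
```

Running it as is only prints the commands; calling convert_all(dry_run=False) would actually run them.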
For the first page I get the following ASCII:
CHAR 000000RG CHCR 000216R CHlF 000224R
CHRTAB = ****** G CHSPC 000232R CH05 000014R
CH1 000024R CH15 000062R CH2 000070R
-
CH23 000110R CH24 000134R CH25 000140R
CH26 000144R CH3 000150R CH4 000164R
_~__~___~_____~
CH5 080212R CR = 000015 CTLP = 000020
)DAC0 = ****** G DACl = ****** G DAC2 = ****** G
DUM = 000000 INC 000240R INC6 000242R
INC8 080244R LF = 000012 PC =%000007
R0 =%000000 R1 =%000001 R2 =%000002
R3 ~%000003 R4 =%000004 R5 =%000~~5
~___~~___~~
R6 =%000006 R7 =%000007 SP =%000006
SPACE = 000048 . = 000246R
END ?
;*****************************************************
,.
... , . .
;
; CHARACTER DISPLAY
~ VERSION 3C
;
; NOV 15,1974
,; _~~._ ~
; R0=PTR TO BUFFER OF CHARS-FIRST WORD #OF B~TES
; R1=BIT TEST ROTATING MASK
~ R2=CHARACTER INCERMENT-DETERMINES CHARACTER SIZE
; R3=POINTER AT CHAR DOT DATA
. ; R4=X POSITION OF FIRST CHAR
; R5=Y POSlTlON OF FTRST CHAR
;
000000 ~ R0=%0
000001 R1=%1
000002 R2=%2
?
000003 R3=%3
080004 R4=%4
000005 R5=%5
000006 R6=%6
000007 R7=%7
000007 PC=R7
000006 SP=R6
000020 CTLP=20
000040 SPACE=40
000015 CR=15
000012 LF=12 '
000000 DUM=0
.TITLE .CHAR
.GLOBL CHAR,DAC0,DAC1,0AC2,CHRTAB
000000 .CSECT ,
000000 012046 CHRR: MO~ (R0)+,-(SP) ;GET CHAR COUNT
000002 016702 MOV INC,R2 ~SET CHARACTER SIZE
000232
000006 012737~ MOV #-2048 .~#OAC2 ;TURN DOT OFF JUST IN CASE
1740~0
000000
000014 005316 CH05: DEC (SP> ;IS THERE MORE CHARS?
000016 002002 BGE CH1 ;~ES-GO DRAW THEM
000820 005726 TST (SP)+ ;NO-POP OLD CTR
000022 000207 RTS PC ;RLL
DONE!!!!!~!!~!!!!!1~!!!!11!
000024 112AA7 CW?l? MnWR (P0~~...P-e? .nrT r?uc,o
It's nowhere near 50% accurate, but it's the best I've got so far.
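One rough way to put a number on the damage: the location and code columns of the listing are six-digit octal, so any 8, 9 or letter in those fields has to be an OCR error. The sketch below (Python; the function names and the "six characters, at least four digits" heuristic are my own assumptions) flags such fields. It only gives a lower bound, since a wrong but still octal digit passes unnoticed, and fields with a stray trailing character are skipped.

```python
def looks_like_octal_field(token):
    """Heuristic (an assumption): six characters, at least four of them digits."""
    return len(token) == 6 and sum(c.isdigit() for c in token) >= 4


def bad_octal_fields(line):
    """Return the leading fields of a listing line that cannot be valid octal."""
    return [t for t in line.split()[:2]
            if looks_like_octal_field(t) and any(c not in "01234567" for c in t)]


def count_suspect_lines(text):
    """Count lines with at least one impossible location or code field."""
    return sum(1 for line in text.splitlines() if bad_octal_fields(line))
```

For example, bad_octal_fields("000820 005726 TST (SP)+") flags "000820", because 8 is not an octal digit.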
Page 2 is:
\ ~~~~1~
000042 001470 BEQ CHLF ;YES-GO LF ---
---~
000044 122703 CMPB #SPACE,R3 ;NO-IS THIS A SPACE?
000040
000850 001470 BEQ CHSPC ;~ES-GO SPACE
000052 003003 BGT 0H15 ;NO-IF LESS
THAN.SRAC~-B~~-~*RR
PAGE 001.
000054 122703 CMPB #137~R3 ;IS IT GREAYER TH8N-1~~2~
-----
000137
000060 002003 BGE CH2 ;NO-GOOD CHAR~
~--~~-~---~-----?
000062 012703~CH15: MOV #CHRTAB,R3 ~YES-BAD CHAR
0001300 ,
?
000066 000410 BR CH23
000070 162703 CH2: SUB #37,R3 ;ZERO FOR FIRST CHAR
IN~TABLE
000037
0000~ 010301 MOV R3,Rl ~SAVE VALE TEMP
~--~--~~-----
000076 006303 ASL R3 ;R3=R3*2
000100 060103 ADD R1~R3 ;R3=R3*3 -
000102 006303 ASL R3 ;R3=R3*6 (FOR 3 WORDS)
0801~ 062703~ ADD #CHRTAB,R3 ~~ P~ AT FIR~ CHAR~~ATR ~~~E
000000
000110 012701 CH23? MnV #20A,R1 ~SET TEST BIT
-~-~--~-_~-~
000200
000114 010546 MOV R5,-(SP> ;SAVE lNITIAL ~ POSIT~ON
000116 010467 MO~ R4,CH24+2 ;SA~E INITIAL X
80001.4 ?
00~122 010367 MOV R3,CH25+2 ;SAVE INITIAL CHAR PTR
08881
4
??,..___ ,.~ ~.,,~,_,,__?,.,
000126 005137~ COM @#DAC2 ;TURN DOT ON INTO CHAR
008008 ~
000132 000404 BR CH26 ;SKIP
000134 012704 CH24: MOV #DUM,R4 ;RESET PTR
000000
0AA14A A127A~ CH25: MOV #DU~ R~ ;RESET X
000000
008144 012780 CH26: MOV #6,R0 ;SET B~TE CTR '
000006
000150 130123 CH3: BITB Rl,(R3)+ ~IS THE BIT ON?
000152 001404 BEQ CH4 ;NO-DONT MOVE IN DACS
000154 010437' MnV R4,~#OAC0 ~MO~ X ~ --~- ~
000000 ~
000160 010537~ MO~ R5,@#DAC1 ;MOVE Y ~__-___ -
---
000000
000164 060204 CH4: ADD R2~R4 ;INCREMENT X
000166 005300 DEC R0 ;DONE ALL THE B~TES? _
000170 003367 BGT CH~ ;NO-SO FINISH -~~
000172 160205 SUB R2,R5 ;'DECREMENT y
000174 000241 ' CLC
000176 106001 RORB R1 ;MOVE TEST BIT DOWN
ONE ?
000200 001355 BNE CH24 ~IF NOT ZERO DRAW NEXT ROW
000202 060204 ADD R2,R4 ;ZERO-SETUP FOR NEXT CHAR
000284 8126A5 MOV (SP~+,R5 ~REP~ACE INITIRL Y
000206 005137~ COM @#DAC2 ;TURN DOT OFF OUT OF CHAR
000008
000212 012600 CH5: MOV (SP)+,R0 ;REPLACE PTR
000214 000677 BR CH05 ;GO TO IT AGRIN
000216 012704 CHCR: MOV #-2048.,R4 ;RESET X TO FAR LEFT
174000
l~3~~~2 ~~~77~ ~~
~u~ .,_~~
I don't know if it's even worth continuing. Some of the pages are so
dark that they don't scan at all. The best ones aren't anything to brag about.
Ideas? Suggestions?
Thanks.
Larry