The U.K. Atari Computer Owners Club Newsletter, or an issue with issuu


Atari_Ace


Recently I've been reading the Tyne & Wear Atari User's Group Newsletters, which I mostly got from atarimania.com. They are missing a couple of issues, so I was relieved to find another set of scans at https://www.strotmann.de/~cas/Infothek/ to fill in the gaps. Yay!

Anyhow, there is a long series in that newsletter by Keith Mayhew called "Cracking the Code", which teaches assembly language. I'm always an advocate for more assembly tutorials, and this one apparently is a reprint from another newsletter I'd not heard of before, The U.K. Atari Computer Owners Club Newsletter, or Monitor. So I went looking for that newsletter and found http://www.page6.org/ukacoc/archive/archive.htm, a complete archive of the newsletter, and the newsletter appears to be professionally published (minus the first few issues). Yes!!

I followed the link, and landed on the site issuu.com, where the content is hosted. And started to weep... Issuu.com provides the newsletters, but in a web viewer that won't export the data no matter how nicely I ask it (it wants me there to show me endless ads, of course). I was hoping to use these scans to patch up some of the listings that are difficult to read in the TWAUG scans, but if I can't get them out of the browser, they're useless to me.

Necessity being the mother of invention, I rolled my sleeves up and tried to figure out how issuu.com delivered the content.

First, let's open a newsletter in any browser that monitors network traffic (I used Microsoft Edge), and look at the requests made while paging through the newsletter. This showed me that the pages of the newsletter were in files called page_1.bin, page_2.bin, all with a similar complex URL from layers.isu.pub for a given newsletter.

Now, let's grab one of these binary files with wget and analyze it. The files start like so:

page_01.bin

 0  1f 8b 08 00 00 00 00 00-00 ff 9c bb 7b 38 93 7f   ............{8..
10  fc 3f 7e af c9 88 48 a9-c8 98 b2 52 92 1c 3a 6c   .?~...H....R..:l

If you search for "1f 8b 08 …" you'll get hits indicating this is a gzip file. OK, let's gunzip and dump it again:

page_01.xxx

 0  08 01 10 02 1a 03 31 38-30 20 a5 08 28 dc 0b 30   ......180 ..(..0
10  01 4a 08 0a 06 10 a5 08-18 dc 0b 5a 90 90 1a 0a   .J.........Z....
20  86 90 1a ff d8 ff e0 00-10 4a 46 49 46 00 01 01   .........JFIF...
30  00 00 01 00 01 00 00 ff-db 00 43 00 08 06 06 07   ..........C.....
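
The magic-number check and the decompression step can also be sketched in Python (not the Perl used elsewhere in this post). This is only a minimal sketch: the helper name gunzip_if_needed is made up for illustration, and it relies on nothing beyond what the first dump shows, namely that the downloaded file begins with the gzip signature 1f 8b.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def gunzip_if_needed(data: bytes) -> bytes:
    """Return the decompressed payload if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


if __name__ == "__main__":
    # Stand-in payload; a real page_N.bin would be read from disk instead.
    sample = gzip.compress(b"stand-in for a downloaded page")
    print(sample[:2].hex(" "))                 # 1f 8b
    print(gunzip_if_needed(sample))
```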

OK, that JFIF signature is familiar: it means this is a JPEG file. The first 0x23 bytes don't look right to me, though (the JPEG data proper starts with the ff d8 at offset 0x23), so I need to rip those off. Fortunately, I wrote a script for that ages ago.

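# cut.pl: remove SIZE bytes starting at OFFSET from FILE, in place.
# Usage: cut.pl FILE OFFSET SIZE
# OFFSET and SIZE may be decimal or 0x-prefixed hex; a negative OFFSET counts from the end.
use strict;
use warnings;

sub cut_file {
  my ($file, $offset, $size) = @_;

  # Accept hex arguments such as 0x23 or -0x10.
  $offset = hex $offset if $offset =~ /^0x/;
  $offset = -hex substr($offset, 1) if $offset =~ /^\-0x/;
  $size = hex $size if $size =~ /^0x/;
  my $buff;
  my $fileSize = -s $file;
  $offset = $fileSize + $offset if $offset < 0;
  open my $fh, '+<', $file or die "Cannot open $file: $!";
  binmode $fh;
  # Read everything after the cut region, then write it back over the region.
  seek $fh, $offset + $size, 0;
  read $fh, $buff, $fileSize - $offset - $size;
  seek $fh, $offset, 0;
  print $fh $buff;
  truncate $fh, $fileSize - $size;
  close $fh;
}

cut_file(@ARGV);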

If we put that in a script called cut.pl, we can fix each file with a command such as cut.pl page_01.jpg 0 0x23, which removes 0x23 bytes starting at offset 0.

And sure enough, once I remove the first 0x23 bytes, the file parses and displays as an ordinary JPEG file.
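
Rather than hard-coding 0x23, one could search for the JPEG start-of-image marker (ff d8 ff), which in the dump above sits exactly at offset 0x23. Here is a minimal Python sketch of that idea (Python rather than the Perl of cut.pl). The name strip_to_jpeg is a hypothetical helper, and the approach assumes the leading header never happens to contain that three-byte sequence itself. The sample bytes are copied from the hex dump above.

```python
JPEG_SOI = b"\xff\xd8\xff"  # start-of-image marker followed by the next marker's 0xff

# The first 48 bytes of the decompressed page, as shown in the dump above.
DUMP = bytes.fromhex(
    "08 01 10 02 1a 03 31 38 30 20 a5 08 28 dc 0b 30 "
    "01 4a 08 0a 06 10 a5 08 18 dc 0b 5a 90 90 1a 0a "
    "86 90 1a ff d8 ff e0 00 10 4a 46 49 46 00 01 01"
)


def strip_to_jpeg(data: bytes) -> bytes:
    """Drop everything before the first JPEG start-of-image marker."""
    start = data.find(JPEG_SOI)
    if start < 0:
        raise ValueError("no JPEG start-of-image marker found")
    return data[start:]


if __name__ == "__main__":
    print(hex(DUMP.find(JPEG_SOI)))        # 0x23
    print(strip_to_jpeg(DUMP)[:4].hex(" "))  # ff d8 ff e0
```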

So for a given issue, you can generate a cbz (Comic Book viewer file) by doing something like so at a Windows command prompt:

for /l %i in (1,1,60) do wget --no-check-certificate https://layers.isu.pub/<...>/page_%i.bin
for /l %i in (1,1,9) do ren page_%i.bin page_0%i.bin
ren *.bin *.jpg.gz
gunzip *.gz
for %i in (*.jpg) do cut.pl "%i" 0 0x23
7z -tzip a ukacoc.cbz *.jpg

where <...> stands for the rest of the URL to the content (the part seen in the browser's network traffic for that issue), and 60 is just a number large enough to get all the pages (early issues had ~30 pages, later ones in the mid-50s).
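
The steps after the download (zero-padded names, gunzip, header removal, zipping) could also be done in one pass. The following is a rough Python sketch, not the method used above: build_cbz and page_number are hypothetical names, it assumes the page_N.bin files are already sitting in a local directory, and it finds the JPEG data by searching for the ff d8 ff marker instead of cutting a fixed 0x23 bytes.

```python
import gzip
import re
import zipfile
from pathlib import Path

JPEG_SOI = b"\xff\xd8\xff"
PAGE_NAME = re.compile(r"page_(\d+)\.bin")


def page_number(path: Path) -> int:
    """Extract N from a file named page_N.bin."""
    return int(PAGE_NAME.fullmatch(path.name).group(1))


def build_cbz(src_dir: str, cbz_path: str) -> int:
    """Pack every page_N.bin found in src_dir into cbz_path; return the page count."""
    pages = sorted(
        (p for p in Path(src_dir).glob("page_*.bin") if PAGE_NAME.fullmatch(p.name)),
        key=page_number,
    )
    count = 0
    with zipfile.ZipFile(cbz_path, "w", zipfile.ZIP_STORED) as cbz:
        for page in pages:
            data = page.read_bytes()
            if data[:2] == b"\x1f\x8b":      # gzip signature
                data = gzip.decompress(data)
            start = data.find(JPEG_SOI)
            if start < 0:
                continue                      # not a page image, skip it
            # Zero-pad the number so viewers sort the pages correctly.
            cbz.writestr(f"page_{page_number(page):02d}.jpg", data[start:])
            count += 1
    return count
```

With the downloaded files in the current directory, calling build_cbz(".", "ukacoc.cbz") would produce the archive. JPEG data is already compressed, so the sketch stores the pages without further compression.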

Now to get to work transcribing some of these...
