IOS communication
We left off with the simulator and UNICOS just starting up. I’ve gotten them as far as the first packet exchange between the OS and the IO subsystem. However ‘exciting’ that sounds (well, it was for me, but I’m not exactly sane, as I’m sure you’ve realized already), that’s not exactly the be-all and end-all of what an operating system does.
My excitement was due to several reasons. Getting this far shows that the mainframe simulation is more or less functional. It also shows that my IO simulator is capable of capturing packets coming from the mainframe. Finally, it shows that the bootstrap process is what I expected it to be and, even more importantly, that the parameter file is properly processed. This last one needs a little explanation: how do I know? The number of CPUs is set in the parameter file. Since the kernel took only one processor out of reset, it must have used the parameter from the config file.
The next and current task is to start parsing and interpreting the packets that I receive from the mainframe.
So what does the first packet look like? Here’s the hex-dump of the data that arrived:
    0xB408200020070100 - .. . ...
    0x4F00070000000000 - O.......
    0x0000000000000000 - ........
    0x0000000000000000 - ........
    0x0000000000000000 - ........
    0x0000000000000000 - ........
    0x0000000000000000 - ........
    0x0000000000000000 - ........
    0xB408200020070100 - .. . ...
Not a lot to look at, but not nothing either. To interpret the information, I need to know the layout. Luckily, I’ve already found the old (X-MP style) and new (Y-MP and newer style) packet layouts in the usr/include/sys directory. The packet seems to match the layout in epack.h:
    #define EPAK_MAXLEN     69      /* maximum length of IOS-E packets */
    #define EPAK_MAGIC      45      /* packet magic number (octal 55 ) */

    /*
     * Packet structure for IOS-E packets
     * Data sent to IOS-E begins and ends with packet header (Epacket)
     */
    struct Epacket {
        uint    Ep_magic:6;     /* Magic number for verification */
        uint    Ep_length:10;   /* Length (words) of packet */
        uint    Ep_kernid:3;    /* Owning kernel's id + 1 */
        uint    Ep_source:4;    /* Location of code which sent packet */
    #ifdef CRAYELS
        uint    Ep_cluster:4;   /* IOS cluster */
        uint    Ep_proc:5;      /* Location of code to process request */
    #else
        uint    Ep_cluster:5;   /* IOS cluster */
        uint    Ep_proc:4;      /* Location of code to process request */
    #endif /* CRAYELS */
        uint    Ep_flags:8;     /* Processing options */
        uint    Ep_lpath:8;     /* Logical path of code to process req */
        uint    Ep_seq:8;       /* Packet sequence number to validate */
        uint    Ep_ackseq:8;    /* Seq number of last valid packet */
    };

    /*
     * Packet structure for variable length packets to/from IOS-E
     * This structure is supplied by the device drivers to epackout
     */
    struct Ep_t {
        struct Epacket  Epacket;
        uint    Ep_type :8,     /* Packet type */
                        :24,
                        :32;
        uint    Ep_data [EPAK_MAXLEN-2]; /* Variable length packet */
        uint    Ep_trailer;     /* Repeat header for validation*/
    };
So, let’s try to parse this packet! (Note: the machine is big-endian, so the byte order is reversed from x86.) The first word is a packet header with some reasonable sequence numbers (0 and 1). I have no idea what the logical path is (I hope I won’t need it). Flags are set to 0, whatever that means. The process is set to 2 and the cluster is zeroed out. The source is 0 and the kernel ID is 1, which makes sense. Then comes the length, which is set to 8. This is one less than the number of words received, but that might be OK: according to the Ep_t struct, the last word is a repetition of the first one, so it might not be counted in the packet length. The remaining field is the ‘magic’, and its value is 45, just as the header specifies.
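A minimal sketch of that by-hand parse in C. It assumes the non-CRAYELS layout and that bit-fields are packed from the most significant bit of the 64-bit word downwards, which the magic and length values in the dump suggest but which is not confirmed here. The names epacket_fields, take_bits and decode_epacket are made up for illustration; they are not part of epack.h.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of the 64-bit Epacket header word (non-CRAYELS layout). */
struct epacket_fields {
    unsigned magic, length, kernid, source, cluster, proc;
    unsigned flags, lpath, seq, ackseq;
};

/* Take `width` bits from the top of the not-yet-consumed part of the word.
 * `*pos` counts how many bits have been consumed so far, MSB first. */
static unsigned take_bits(uint64_t word, unsigned *pos, unsigned width)
{
    unsigned shift = 64u - *pos - width;
    *pos += width;
    return (unsigned)((word >> shift) & ((1ull << width) - 1u));
}

/* ASSUMPTION: fields are allocated from the most significant bit down. */
struct epacket_fields decode_epacket(uint64_t word)
{
    struct epacket_fields f;
    unsigned pos = 0;

    f.magic   = take_bits(word, &pos, 6);
    f.length  = take_bits(word, &pos, 10);
    f.kernid  = take_bits(word, &pos, 3);
    f.source  = take_bits(word, &pos, 4);
    f.cluster = take_bits(word, &pos, 5);
    f.proc    = take_bits(word, &pos, 4);
    f.flags   = take_bits(word, &pos, 8);
    f.lpath   = take_bits(word, &pos, 8);
    f.seq     = take_bits(word, &pos, 8);
    f.ackseq  = take_bits(word, &pos, 8);

    assert(pos == 64); /* the ten fields must cover the whole word */
    return f;
}
```

Fed the first word of the dump (0xB408200020070100), this yields magic 45, length 8, kernel ID 1, and sequence numbers 1 and 0. The bits in between (Ep_proc, Ep_flags, Ep_lpath) depend entirely on the packing assumption, so they are worth double-checking against a by-hand reading.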
All looks good; this definitely looks like a valid ‘E’ packet. Now on to the second word. The most important field here is Ep_type, which is set to ‘O’. That, according to the header, is not a valid value. Or is it? It actually falls between PKG_BEG_OUT and PKT_END_OUT; it’s just not listed as one of the known packet types.
Here’s where prior knowledge comes in handy. The way COS handled these packets was this: requests from the mainframe used capital letters as request codes, and responses from the IOS used the lower-case version of the same letter. So, for example, a disk request would go out with request code ‘D’ and the response would come back with code ‘d’. In that context, the packet makes total sense: this is an OWS request (code ‘O’), to which a reply with code ‘o’ has to be sent. This response code (‘o’) is indeed listed in epack.h. In fact, the list only seems to contain response codes.
So far so good, but what’s inside the packet?
It’s lucky that most tools list files in a directory in alphabetical order. Right next to epack.h, there is a set of files, one of which is epacko.h. It contains the ‘o’ packet layout, the one we’re looking at.
    /* USMID @(#)uts/c1/sys/epacko.h 100.1 04/16/98 12:49:37 */

    /*      COPYRIGHT CRAY RESEARCH, INC.
     *      UNPUBLISHED -- ALL RIGHTS RESERVED UNDER
     *      THE COPYRIGHT LAWS OF THE UNITED STATES.
     */

    #ifndef __SYS_EPACKO_H_
    #define __SYS_EPACKO_H_

    /*
     * "O" packet definitions
     */

    #define OWS_LVLA        1       /* OWS level that has new O packets */
    #define OWS_LVLB        2       /* OWS level that has vital O packet */

    #define OP_BUF          63      /* length of the 'path' buffer */

    struct op_hdr {                 /* O packet header */
        /* cray word 0 */
        struct Epacket w0;          /* common word 0 header */
        /* cray word 1 */
        uint    op_typ  : 8;        /* packet type = O/o */
        uint    op_lvl  : 8;        /* Unicos run level */
        uint    op_req  : 8;        /* request code */
        uint    op_rsp  : 8;        /* response code (unused) */
        uint    op_cls  : 8;        /* cluster */
        uint    op_iop  : 8;        /* iop */
        uint            : 16;
        /* cray word 2 */
        uint    op_watchaddr : 32;  /* address to monitor for activity */
        uint    op_drvrseq   : 32;  /* driver sequence number */
    };

    struct op_t {
        struct op_hdr hdr;
        /* cray word 3 */
        char    op_buf[OP_BUF * 8]; /* data */
    };

    #define OP_PATH         7       /* logical path for O packet */

    /*
     * request codes
     */
    #define OP_BOOT         1       /* boot the specified IOP */
    #define OP_DOWN         2       /* mark the specified IOP as down */
    #define OP_UP           3       /* mark the specified IOP as up */
    #define OP_KILL         4       /* master clear and down the specified IOP */
    #define OP_OBIT         5       /* obituary notice for the specified IOP */
    #define OP_ALIVE        6       /* existence notice for the specified IOP */
    #define OP_CLOCK        7       /* send current clock */
    #define OP_PANIC        8       /* CPU panic message */
    #define OP_DUMP         9       /* dump given IOP */
    #define OP_ISTOP        10      /* inform CPU of IOP stop */
    #define OP_IRESTART     11      /* inform CPU of IOP restart */
    #define OP_CHGLVL       12      /* Unicos run level changed notice */
    #define OP_RUNLVL       13      /* change the Unicos run level */
    #define OP_SYSHALT      14      /* halt the Unicos system */
    #define OP_IABORT       15      /* inform CPU of IOP abort */
    #define OP_VITAL        16      /* check health of UNICOS */
    #define OP_FPANIC       17      /* force a panic */
    #define OP_PREPANIC     18      /* CPU started panic, may flush */
    #define OP_ERRLOG       19      /* make errlog entry */

    /*
     * response codes
     */
    #define OP_OK           0       /* request completed successfully */
    #define OP_ERROR        1       /* request completed with an error */
    #define OP_NOFILE       2       /* boot request specified a bad path */
    #define OP_OOPS         3       /* invalid request */

    #endif /* __SYS_EPACKO_H_ */
This allows us to further crack the structure:
Request code is 7, which is OP_CLOCK and all other fields are set to 0.
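As a rough illustration, word 1 of the ‘O’ packet can be cracked with a few shifts. The sketch below assumes the same most-significant-bit-first packing as before; op_word1, byte_at and decode_op_word1 are hypothetical helper names, and only OP_CLOCK is taken from epacko.h.

```c
#include <assert.h>
#include <stdint.h>

#define OP_CLOCK 7 /* from epacko.h: send current clock */

/* The named byte-wide fields of "cray word 1" in struct op_hdr. */
struct op_word1 {
    unsigned typ, lvl, req, rsp, cls, iop;
};

/* Return byte `index` of a 64-bit word, counting from the most significant byte. */
static unsigned byte_at(uint64_t word, unsigned index)
{
    assert(index < 8);
    return (unsigned)((word >> (56u - 8u * index)) & 0xFFu);
}

/* ASSUMPTION: op_typ occupies the top byte, followed by the other fields in order. */
struct op_word1 decode_op_word1(uint64_t word)
{
    struct op_word1 w;

    w.typ = byte_at(word, 0); /* packet type, 'O' or 'o' */
    w.lvl = byte_at(word, 1); /* Unicos run level */
    w.req = byte_at(word, 2); /* request code */
    w.rsp = byte_at(word, 3); /* response code (unused) */
    w.cls = byte_at(word, 4); /* cluster */
    w.iop = byte_at(word, 5); /* iop */
    return w;
}
```

For the second word of the dump (0x4F00070000000000) this gives a type of ‘O’, a request code of 7 (OP_CLOCK), and zero everywhere else.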
This is fantastic! This is how the old COS OS-to-IOS communication started too, though the details are different: by synchronizing the clocks between the IOS and the mainframe. This makes me fairly certain that I’m on the right path. So, how do I answer this request?
From what we’ve seen so far in the IOSD code for the X-MP, communication is packetized, and requests and responses use the same layout; only the upper-case request code is swapped for its lower-case counterpart.
From epacko.h it’s a bit unclear how big the response should be (op_t is clearly larger than the packet we’ve just received). Another trick I’ve seen in the IOSD code, though, is that responses often reuse the buffer of the request. They patch up what needs to be changed and fill in what needs to be filled in, but otherwise keep the buffer the same.
If a similar mechanism is at play in the current implementation, the response should have the same length as the request – why else would they include all the 0 fields?
There are new fields to consider though, especially Ep_seq and Ep_ackseq in the header. It seems the response should acknowledge the reception of the request. My current hypothesis is that both sides (OS and IOS) keep a running count of messages sent. They increment this number and put it in the Ep_seq field whenever they send a packet. They also copy the Ep_seq of the latest (successfully) received packet into their Ep_ackseq field, letting the other side know how far they’ve gotten in receiving messages. What happens if the received sequence numbers skip? I don’t know, but I don’t really have to worry about it unless I see that happening. And of course, for now, I have a sample of 1.
At any rate, this is a simple-enough logic to code up:
Copy the request into the response, increment the internal Ep_seq counter and store it in the response’s Ep_seq, copy the request’s Ep_seq into Ep_ackseq, change the message code from ‘O’ to ‘o’ and send it back.
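That recipe could look something like the sketch below. It is only an illustration of the hypothesis: ios_link and build_reply are made-up names, the counter’s starting value is a guess, and updating the trailer to match the new header is an assumption based on the Ep_t comment that the trailer repeats the header.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simulator-side state: running count of packets sent so far. */
struct ios_link {
    uint8_t tx_seq;
};

/* Build a reply in `rsp` from `req`; both hold `nwords` 64-bit words,
 * trailer included. Word 0 is the Epacket header, word 1 starts with the type. */
void build_reply(struct ios_link *link, const uint64_t *req,
                 uint64_t *rsp, size_t nwords)
{
    assert(nwords >= 2);
    memcpy(rsp, req, nwords * sizeof *req);

    /* Ep_seq is the second-lowest byte of the header, Ep_ackseq the lowest. */
    uint8_t req_seq = (uint8_t)((req[0] >> 8) & 0xFFu);
    link->tx_seq++;
    rsp[0] = (req[0] & ~0xFFFFull)
           | ((uint64_t)link->tx_seq << 8) /* our own sequence number */
           | req_seq;                      /* acknowledge what we received */

    /* Swap the packet type (top byte of word 1) to lower case: 'O' -> 'o'. */
    int type = (int)((req[1] >> 56) & 0xFFu);
    rsp[1] = (req[1] & ~(0xFFull << 56)) | ((uint64_t)tolower(type) << 56);

    /* ASSUMPTION: the trailer has to repeat the updated header word. */
    rsp[nwords - 1] = rsp[0];
}
```

Applied to the nine-word clock request above with a fresh counter, this produces a header of 0xB408200020070101 and a second word of 0x6F00070000000000, with the rest of the buffer untouched.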
Will it work though? I’ve been asked for the time, so I presume some of the response fields should contain the current time, date or both. But in what format? Is it a UNIX-style epoch? Or some string (like in COS)? Or a tm struct? All of these make some level of sense… One way to figure this out is to put a special value into the response (something that’s easy to search for), generate an instruction trace and see how the value gets parsed and processed. I did this once for COS, and I’m not looking forward to a repeat exercise. Setting the time properly is not terribly high on my priority list either, so I decided to just zero out the response and call it good. If the parsing code is unhappy with this, I’ll know soon enough, and only then will I have to dig deeper.
After patching up the simulator to generate a response and push it back into the channel, I re-ran the simulation. The request-response went as expected, and I received my next packet:
    0xB40B3A0020000100 - ..:. ...
    0x5A07040021001200 - Z...!...
    0x30303A30303A3030 - 00:00:00
    0x28474D5429207574 - (GMT) ut
    0x732F63312F6D642F - s/c1/md/
    0x696370752E632D30 - icpu.c-0
    0x323A20494E464F20 - 2: INFO 
    0x46696E616C20494F - Final IO
    0x415641494C5F6D61 - AVAIL_ma
    0x736B206973203031 - sk is 01
    0x0D0A200020000100 - .. . ...
    0xB40B3A0020000100 - ..:. ...
String! I have strings! Something clearly meant for human consumption. And the packet type is (drum-roll) ‘Z’. Quick, what could that be? epack.h and epackz.h to the rescue: this is a terminal packet. The request code is set to 4, which is ZP_DATA.
    struct zp_t {
    #ifdef MOTO
        uint            : 6,
                zp_len  : 10,
                        : 16;
        uint    fill;
    #else
        struct Epacket zp_ehdr;
    #endif
        uint    zp_typ  : 8,    /* packet type = Z */
                zp_flags: 8,    /* see flags below */
                zp_req  : 8,    /* request code */
                        : 8,    /* response code (unused) */
                zp_cc   : 9,    /* character count */
                zp_dev  : 7,    /* device identification */
                zp_seq  : 8,    /* sequence number of this packet */
                zp_ack  : 8;    /* seq. number of last packet received */
        char    zp_buf[ZP_BUF]; /* data */
    };

    /*
     * request codes
     */
    #define ZP_CON_RQ       1       /* connect request */
    #define ZP_CON_RP       2       /* connect reply */
    #define ZP_DIS          3       /* disconnect */
    #define ZP_DATA         4       /* data with piggybacked acknowledgement */
    #define ZP_WINCH        5       /* window size change information */
That’s a bit weird! I would have expected a ZP_CON_RQ first… I guess establishing a connection is implicit in sending data to a new terminal. Or maybe the main terminal is always assumed to exist?
Well, that again will have to wait a little; there’s more work to be done. I have to whip up a small terminal packet interpreter that can also generate the proper responses. Speaking of which… There’s something curious going on here: this packet contains a second set of sequence and acknowledge number fields (zp_seq and zp_ack). Why do we need a second set? And how are these handled? For now, I’ll assume that these work in a similar way to the packet header ones and see how far that gets me.
With that, and some additional, not too interesting bug-fixing, I’ve gotten a terminal open. Next, I had to implement disk read and write commands (I won’t bore you with the details). After that, I was finally greeted by the following display:
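The receiving half of such an interpreter might start out like the sketch below: crack the second word of a ‘Z’ packet and pull zp_cc characters out of the data words. As before, it assumes most-significant-bit-first packing and big-endian byte order within each word; zp_fields, decode_zp_word1 and zp_copy_text are hypothetical names, not anything from epackz.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The named fields of the word that follows the Epacket header in struct zp_t. */
struct zp_fields {
    unsigned typ, flags, req, cc, dev, seq, ack;
};

/* ASSUMPTION: fields are packed from the most significant bit down. */
struct zp_fields decode_zp_word1(uint64_t w)
{
    struct zp_fields z;

    z.typ   = (unsigned)(w >> 56) & 0xFFu;  /* packet type, 'Z' */
    z.flags = (unsigned)(w >> 48) & 0xFFu;
    z.req   = (unsigned)(w >> 40) & 0xFFu;  /* request code */
    /* bits 39..32: response code (unused) */
    z.cc    = (unsigned)(w >> 23) & 0x1FFu; /* 9-bit character count */
    z.dev   = (unsigned)(w >> 16) & 0x7Fu;  /* 7-bit device id */
    z.seq   = (unsigned)(w >> 8) & 0xFFu;
    z.ack   = (unsigned)w & 0xFFu;
    return z;
}

/* Copy up to `cc` characters from the data words into `out`, NUL-terminated.
 * Returns the number of characters copied. */
size_t zp_copy_text(const uint64_t *data, size_t cc, char *out, size_t out_size)
{
    size_t n = 0;

    assert(out_size > 0);
    for (size_t i = 0; i < cc && n + 1 < out_size; i++) {
        /* Character i lives in word i/8, byte i%8 counted from the top. */
        out[n++] = (char)((data[i / 8] >> (56u - 8u * (i % 8))) & 0xFFu);
    }
    out[n] = '\0';
    return n;
}
```

On the packet above, the decoded character count comes out as 66, which is exactly the eight full words of text plus the trailing carriage return and line feed. That agreement is a decent sanity check on the packing assumption.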
There are some bugs as you can see (what is all that garbage?!) but hey, I have UNICOS talking to me!
Let’s finish up this piece right here. Next, I’ll talk about how the file system works and why that is important.