{"id":1587,"date":"2023-07-10T20:57:33","date_gmt":"2023-07-11T01:57:33","guid":{"rendered":"https:\/\/www.computercollection.net\/?p=1587"},"modified":"2023-07-10T20:57:33","modified_gmt":"2023-07-11T01:57:33","slug":"ibm-1410-fpga-off-to-the-races","status":"publish","type":"post","link":"https:\/\/www.computercollection.net\/index.php\/2023\/07\/10\/ibm-1410-fpga-off-to-the-races\/","title":{"rendered":"IBM 1410 FPGA: Off to the Races??"},"content":{"rendered":"\n<p>Once the 1415 console emulation was up and running, I was able to run diagnostics.  The first set of diagnostic failures arose in the Assembly Channel because the Automated Logic Diagrams I had for parts of the Assembly Channel were not for the 1410 Accelerator feature, unlike the vast majority of ALD pages I had.<\/p>\n\n\n\n<p>Once I fixed that, diagnostic CU01 ran OK in non-overlapped non-priority (interrupt) mode.  However, once I enabled overlapped I\/O with the priority feature in the diagnostic settings, the diagnostic errored out with an Instruction Check.    The overlapped I\/O is that of the 1415 console.  The diagnostic then monitors that (along with a priority interrupt) to make sure that the channel status information and the interrupt operate as expected.<\/p>\n\n\n\n<p>The Assembly Channel issue had been reproducible using a single instruction, and I could set that instruction in the initialization of the first 10K memory module, so I was able to troubleshoot it using simulation.  But not this one &#8211; it happens after 10s of thousands of instructions.  I had been less than confident about using the built-in logic analyzer capability that Vivado affords for Xilinx chips, but this problem left me no choice.  Fortunately, after just one false start, I was able to figure out how to make a change in the signals the logic analyzer had available, set triggers, and so on &#8211; so, not so bad.<\/p>\n\n\n\n<p>Here is what the problem looked like that caused the Instruction Check.  
Note that the signal +S E CYCLE REQUIRED is going active (high) just <em>after<\/em> signal -S LOGIC GATE A.  Now, that should not be a problem, except that +S ERROR SAMPLE is also high at this point, and since Logic Gate A is active, as well as E Cycle Required, the logic in the CPU sees that as a possible problem &#8211; knowing that if E Cycle Required is active, it ought to be activating Logic Gate R rather than Logic Gate A.  (Note: At this point I had not included +S E CYCLE REQUIRED A, one of four different ways that +S E CYCLE REQUIRED can be asserted &#8211; and which turned out to be the &#8220;villain&#8221; in this case.)  Anyway, here is what the output of the logic analyzer looked like:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"720\" src=\"https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/202030706-E_CYCLE_REQUIRED-LGA-RACE-1024x720.png\" alt=\"E Cycle Required Request with a Race Condition\" class=\"wp-image-1590\" srcset=\"https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/202030706-E_CYCLE_REQUIRED-LGA-RACE-1024x720.png 1024w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/202030706-E_CYCLE_REQUIRED-LGA-RACE-300x211.png 300w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/202030706-E_CYCLE_REQUIRED-LGA-RACE-768x540.png 768w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/202030706-E_CYCLE_REQUIRED-LGA-RACE-1200x844.png 1200w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/202030706-E_CYCLE_REQUIRED-LGA-RACE.png 1251w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><figcaption class=\"wp-element-caption\">E Cycle Required Request with a Race Condition<\/figcaption><\/figure>\n\n\n\n<p>Now, this was not occurring on all or even anywhere near a majority of overlapped I\/O operations.   
Below is an example (using the very same FPGA configuration) of a successful overlap.  Note that in this example, +S E CYCLE REQUIRED is asserted much earlier &#8211; along with -S LOGIC GATE E, so there is no race &#8211; E Cycle Required is ready and present long before the time of +S LOGIC GATE Z, when the CPU makes the decision between Logic Gate A and Logic Gate R.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"700\" src=\"https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230707-E_CYCLE_REQUIRED-NoRace-Example-1024x700.png\" alt=\"IBM 1410 FPGA Overlapped I\/O Cycle With No Race Condition\" class=\"wp-image-1589\" srcset=\"https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230707-E_CYCLE_REQUIRED-NoRace-Example-1024x700.png 1024w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230707-E_CYCLE_REQUIRED-NoRace-Example-300x205.png 300w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230707-E_CYCLE_REQUIRED-NoRace-Example-768x525.png 768w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230707-E_CYCLE_REQUIRED-NoRace-Example.png 1189w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><figcaption class=\"wp-element-caption\">IBM 1410 FPGA Overlapped I\/O Cycle With No Race Condition<\/figcaption><\/figure>\n\n\n\n<p>So, I went looking for possibilities:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Had I made a mistake when I entered the data for the associated ALDs?<\/li>\n\n\n\n<li>Was this a race condition caused by the fact that I insert &#8220;D&#8221; flip flops to disconnect any timing loop identified by my generation code (within a single ALD) or by Vivado during synthesis (involving multiple ALDs)?<\/li>\n\n\n\n<li>Was this a race condition caused because the FPGA gates are much faster than the original RTL logic SMS 
cards, and\/or the fact that LUTs are used to collapse combinatorial logic into a decision table that may eliminate multiple levels of original gates?<\/li>\n\n\n\n<li>Was this a very tight timing window in the original CPU?<\/li>\n<\/ul>\n\n\n\n<p>During my investigation I came upon this tidbit in manual 226-2692, IBM Customer Engineering Instruction-Reference 1411 Input-Output Operations on page 46:<\/p>\n\n\n\n<p><strong>Service Note<\/strong><br>Because close timing conditions occur in the areas listed below, excessive delay, or accumulated delays in the logic circuits may cause machine failures:<br><strong>CHANNEL REGISTERS AND CONTROLS<\/strong><br><em>E-cycle required<\/em><br>F-cycle required<br>E-cycle control<br>F-cycle control<br>Address channel<\/p>\n\n\n\n<p>Well, do any of those look familiar?  Like maybe the italicized one?  While I have resolved the issue, I do not know for sure exactly why I ran into it.  The original CPU had the +S E CYCLE REQUIRED signal originating in (physical) Frame &#8220;D&#8221;, whereas the logic gate signals are created in Frame &#8220;C&#8221;, a couple of feet apart.   As a <em>guess<\/em>, I think it is most likely that the FPGA logic is faster than the original hardware in this area, such that +S E CYCLE REQUIRED could be asserted earlier than the original engineers thought probable.<\/p>\n\n\n\n<p>Regardless, I faced the issue of what to do about it.  
I had several choices:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>I could try to tweak my generated logic in terms of speed, by adding delays, to see if I could resolve it that way.<\/li>\n\n\n\n<li>I could inhibit +S E CYCLE REQUIRED A (the one that seemed problematic) in the presence of -S LOGIC GATE A so that in such a case, the E Cycle Required signal would not be asserted until near the end of the memory cycle of this near miss.<\/li>\n\n\n\n<li>I could inhibit +S E CYCLE REQUIRED (so, all four possibilities) in the presence of -S LOGIC GATE A.<\/li>\n<\/ul>\n\n\n\n<p>The first two choices might work, but would leave me in a situation where this problem could recur later on, in some other setting, whereas the third option would prevent it from happening <em>a priori<\/em>.  The only downside that I could see would be that it might prevent a device from transferring data to memory quite as fast as the original.<\/p>\n\n\n\n<p>So I investigated transfer speeds of various devices to see what they might be like:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Unit record devices have a core buffer in the IBM 1414 I\/O Synchronizer, and are slow enough anyway that it would not likely make any observable difference.<\/li>\n\n\n\n<li>IBM 729 tape drives.  The 729 IV transfers at 112.5 IPS at 556 CPI, the 729 VI at 112.6 IPS at 800 CPI.  So, roughly, 112.6*800 = 90,080 cps, or a bit over 11 microseconds per character.  So, every other cycle would be fine on a 1410 with the Accelerator feature with 4.0 microsecond cycles &#8211; allowing it to &#8220;steal&#8221; every other core cycle.<\/li>\n\n\n\n<li>IBM 7340 Hypertape Drives.  Some models are 112.5 IPS \/ 170,000 cps tape drives (so, roughly 1500 CPI), or 5.9 microseconds\/character.  That would require making transfers in non-overlapped mode (see the 1302 disk drive, below).  
However, ones attached to a 1410 more likely operated at 34,000 cps &#8211; plenty slow enough.<\/li>\n\n\n\n<li>A 1405 Disk Drive spins at 1200 RPM (so only 20 rps) and has a 1000 character track size (so, say 1200 to be conservative).  That gives us about (1\/20)\/1200, or 50ms\/1200, or 41 microseconds per character &#8211; lots of time.<\/li>\n\n\n\n<li>A 1301 transfers characters at 90,100 cps, or 11 microseconds per character &#8211; so plenty of time using every other core storage cycle.<\/li>\n\n\n\n<li>A 1302 transfers characters at 184,000 cps, or 5.4 microseconds per character.  But on a 1410, these devices transfer data <em>only<\/em> in non-overlapped mode (even if the I\/O instruction specifies overlapped mode) &#8211; because they <em>must<\/em> use consecutive storage cycles for their data.<\/li>\n\n\n\n<li>A 1311 (the 2311 is not supported) has 2980 characters\/track, and rotates at 40ms\/revolution.  This gives us roughly 13 microseconds per character.<\/li>\n<\/ul>\n\n\n\n<p>These point to things being OK so long as the peripheral can &#8220;steal&#8221; every other core cycle.  On top of that, the 1410 Channels are double buffered, so even if sometimes it takes 5 cycles to get two characters in or out, operation should not be affected.  So it does not seem that peripheral speed would prevent using the last option listed.<\/p>\n\n\n\n<p>So, I made the changes, labeling them with a fictitious ECO &#8220;JRJ001&#8221; in the database, and tested &#8211; diagnostic CU01 now passes without problems.  Below is what the signals look like, timing-wise.  
I believe (but cannot prove) that what happened is that +S E CYCLE REQUIRED ended up delaying until the next possible &#8220;last logic gate&#8221; (in a given memory cycle), which is typically logic gate E, as is the case in this capture.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"800\" src=\"https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230708-E_CYCLE_REQUIRED-AT-OPERATOR-AFTER-ECO-JRJ001-1024x800.png\" alt=\"E Cycle Required after Installing &quot;ECO&quot; JRJ001\" class=\"wp-image-1591\" srcset=\"https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230708-E_CYCLE_REQUIRED-AT-OPERATOR-AFTER-ECO-JRJ001-1024x800.png 1024w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230708-E_CYCLE_REQUIRED-AT-OPERATOR-AFTER-ECO-JRJ001-300x234.png 300w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230708-E_CYCLE_REQUIRED-AT-OPERATOR-AFTER-ECO-JRJ001-768x600.png 768w, https:\/\/www.computercollection.net\/wp-content\/uploads\/2023\/07\/20230708-E_CYCLE_REQUIRED-AT-OPERATOR-AFTER-ECO-JRJ001.png 1156w\" sizes=\"auto, (max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><figcaption class=\"wp-element-caption\">E Cycle Required after Installing &#8220;ECO&#8221; JRJ001<\/figcaption><\/figure>\n\n\n\n<p>So, what is next on the block?<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Several enhancements to the console program, including merging what is now the main root window with the 1415 console form &#8211; no need to have them separate.<\/li>\n\n\n\n<li>Run more diagnostics, by saving core images under my software simulator and transferring them to the FPGA memory to read them, as I did with CU01.<\/li>\n\n\n\n<li>Experiments with speeds: how fast can I run the 1411 CPU before it fails its diagnostics?<\/li>\n\n\n\n<li>Research into channel signals. 
I don&#8217;t have ALDs for the relevant IBM 1414 I\/O Synchronizers, though I do have ILDs, which pretty well define the logic. But rather than parroting exactly what the 1414s would have done, I will likely just use VHDL using some of that ILD logic as a wrapper around communication to and from the PC support program, at least at first.<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Once the 1415 console emulation was up and running, I was able to run diagnostics. The first set of diagnostic failures arose in the Assembly Channel because the Automated Logic Diagrams I had for parts of the Assembly Channel were not for the 1410 Accelerator feature, unlike the vast majority of ALD pages I had. &hellip; <a href=\"https:\/\/www.computercollection.net\/index.php\/2023\/07\/10\/ibm-1410-fpga-off-to-the-races\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;IBM 1410 FPGA: Off to the Races??&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"post_folder":[],"class_list":["post-1587","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/posts\/1587","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/comments?post=1587"}],"version-history":[{"count":5,"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/
posts\/1587\/revisions"}],"predecessor-version":[{"id":1595,"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/posts\/1587\/revisions\/1595"}],"wp:attachment":[{"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/media?parent=1587"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/categories?post=1587"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/tags?post=1587"},{"taxonomy":"post_folder","embeddable":true,"href":"https:\/\/www.computercollection.net\/index.php\/wp-json\/wp\/v2\/post_folder?post=1587"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}