DELUA technical manual, VAX diagnostic

20 Nov 2014

I have to say this about the LCM VAX: it successfully ran CMUIP for a
rather long time.  The pattern of failures, becoming more and more
frequent, seems more consistent with a hardware issue than a software
one.  The machine has never had a heavy load - when I watched it
regularly, there were rarely more than a handful of users at any one time.
The machine has been power-cycled in response to errors, which would of
course reinitialize all transient data structures - and I do not believe
that CMUIP uses persistent caches (i.e. cached to disk).

Yes, LCM could just load UCX and perhaps whistle a happy tune.  It might be
an interesting experiment to do so and observe behavior - the changes in
the startup script could easily be commented in/out.  I'd certainly like to
see them continue to use CMUIP for historical reasons.  Multinet would also
be interesting, if policies have changed at LCM to provide for licensing
costs of that software.  When I was restoring the machine back in 2008-9,
Process Software offered a somewhat amusing 'discount' for an educational
institution.  -- Ian

On Wed, Nov 19, 2014 at 8:03 AM, Peter Coghlan <cctech at beyondthepale.ie>
wrote:
...

  One thing, though.
 I don't think that the error code from the $QIO in the OPCOM log is a
 VMS exit code. But I might be wrong on that.
 But that could do with some more examining.

 There is a poorly phrased entry in the CMU/IP FAQ which could give the
 impression that CMU/IP uses its own error codes that are entirely different
 from VMS status codes.  What I think it is really trying to say is that,
 like many VMS applications, CMU/IP defines _additional_ status codes that
 VMS does not already have suitable messages defined for, and the text
 messages associated with these are not available unless the appropriate
 CMU/IP provided message files are loaded.

 Low numbered error codes such as 1C (and another favourite - 0C which is
 %SYSTEM-F-ACCVIO, access violation) come from system services and runtime
 library functions that are part of VMS and the message texts are made
 available automatically by VMS.  It is not the case that CMU/IP reporting
 an error code of 1C means something different to some part of VMS reporting
 it.  They both mean process quota exceeded.
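
As an illustration of why 1C and 0C read the same wherever they are
reported, here is a minimal Python sketch that splits a status value into
the fields of the standard VMS condition value layout (severity in bits
0-2, message number in bits 3-15, facility number in bits 16-27).  The
function name decode_status is made up for this example, and the sketch
ignores the control bits.  A facility field of 0 is the system facility,
which is why these codes carry the %SYSTEM prefix.

```python
# Sketch only: decode the fields of a 32-bit VMS condition value.
# decode_status is a hypothetical helper name, not part of VMS or CMU/IP.

# Severity codes 0-4 as used in message prefixes (5-7 are reserved).
SEVERITY = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}

def decode_status(value):
    """Return (facility_number, message_number, severity_letter)."""
    severity = value & 0x7             # bits 0-2
    message = (value >> 3) & 0x1FFF    # bits 3-15
    facility = (value >> 16) & 0xFFF   # bits 16-27
    return facility, message, SEVERITY.get(severity, "?")

if __name__ == "__main__":
    for code in (0x0C, 0x1C):
        print(hex(code), decode_status(code))
```

Both values decode to facility 0 with severity F, matching the -F- in
%SYSTEM-F-ACCVIO.
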
 Directly underneath that entry in the FAQ, I found the following:

  3.1.2 >>>> IPACP CRASH DUE TO QUOTA EXCEEDED
  [20-MAR-1995]

  For systems with a high IP load, IPACP may occasionally crash with a quota
  exceeded. This does not refer to disk quota, but to one of the process
  quota limits. Usually, the quota in question is BYTLM.

  The default BYTLM provided for IPACP (65536) is sufficient for only about
  20 connections. IPACP takes about 32000 for itself and each connection
  takes about 1872 bytes. This requirement is NOT currently documented.

  To increase the BYTLM for the IPACP, modify the IP_STARTUP.COM procedure
  and change the value of the /BUFFER_LIMIT qualifier on the RUN command
  that starts the IPACP process. Then shut down and restart IPACP.

  At the current time, there also appears to be a memory leak in IPACP
  which has the effect of gradually reducing the available BYTLM over time.
  When this gets close to zero, IPACP will hang (as it retries) and then
  crash soon afterwards.

  It is therefore desirable to give IPACP more BYTLM than the typical load
  might suggest. If this sort of crash is experienced, increase the BYTLM
  by 50% and restart it.
                                                       <A.Harper at kcl.ac.uk>
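
The arithmetic in the FAQ entry can be checked with a short Python sketch.
The three constants are the FAQ's own figures; the function names and the
idea of applying the FAQ's "increase by 50%" advice as up-front headroom
are assumptions made for illustration, not anything CMU/IP provides.

```python
# Sketch only: BYTLM sizing using the figures quoted in the CMU/IP FAQ.
IPACP_BASE = 32000       # bytes IPACP takes for itself (FAQ figure)
PER_CONNECTION = 1872    # bytes per connection (FAQ figure)
DEFAULT_BYTLM = 65536    # default /BUFFER_LIMIT for IPACP (FAQ figure)

def max_connections(bytlm):
    """Connections that fit in bytlm after IPACP's own share."""
    return max(0, (bytlm - IPACP_BASE) // PER_CONNECTION)

def bytlm_for(connections, headroom=0.5):
    """BYTLM for a connection count, padded by a hypothetical headroom
    fraction to leave room for the gradual leak the FAQ describes."""
    needed = IPACP_BASE + connections * PER_CONNECTION
    return int(needed * (1 + headroom))

if __name__ == "__main__":
    print(max_connections(DEFAULT_BYTLM))
    print(bytlm_for(50))
```

With the default of 65536 the quotient comes out at 17 connections, in
line with the FAQ's "about 20".  If the leak is real, any headroom only
delays the point at which the quota runs out.
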

 Looks like my pagefile quota guess was wrong and the culprit is BYTLM.
 However, I suspect the underlying cause of this problem has never been
 fully addressed and increasing the quota will not help, or worse, will help
 for about a week before the problem returns even more frequently.

 I cannot overemphasise how much relief will be experienced on the
 replacement of CMU/IP by something that works properly or even by something
 that doesn't mess up as badly.  Problems that you didn't even know you had
 will go away, even ones which seemed unrelated to networking.  On sunny
 days, the sun will seem brighter and the sky bluer :-)

 In my previous posting, I forgot to mention that you can also try:

 $ MCR NCP SHOW KNOWN LINE COUNTERS

 if running DECnet.  This will give DECnet's view on any network media
 problems including those relating to other protocols going through the same
 network adapter.  It probably won't have much to say about hardware
 failures in the network adapter though.  Remember that on a half duplex
 ethernet, collisions are normal and expected but late collisions indicate a
 problem.

 Regards,
 Peter Coghlan.

--
Ian S. King, MSIS, MSCS
Ph.D. Candidate
The Information School
University of Washington
An optimist sees a glass half full. A pessimist sees it half empty. An
engineer sees it twice as large as it needs to be.
