
= ROOT|Technical|LinuxGazette|issue101.txt =




   Where do I start, and where do I look for clues? 

   Are all the logs found in /var/log, or are there others? 

   In what order should I look at the logs, and what should I look for? 

     (!) [Thomas] It depends what you think went wrong. Essentially:

/var/log/messages

     is where syslogd will dump all its data and so is the best place to look.
     But there may well be application-specific data elsewhere in /var/log;
     XFree86.0.log is one such example.
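
     As a rough sketch of that first look (not something from the original
     thread): the helper names last_messages and recent_logs and the LOG_DIR
     variable are made up for illustration, and the main log is assumed to be
     called "messages", which varies between distributions.

```shell
#!/bin/sh
# LOG_DIR is an assumption; override it if your logs live elsewhere.
LOG_DIR="${LOG_DIR:-/var/log}"

# Print the last N lines (default 50) that syslogd wrote to the main log.
last_messages() {
    tail -n "${1:-50}" "$LOG_DIR/messages"
}

# List everything in the log directory, most recently modified first,
# so application-specific logs touched near the crash show up at the top.
recent_logs() {
    ls -t "$LOG_DIR"
}

# Example use (reading /var/log/messages usually needs root):
#   last_messages 100
#   recent_logs | head
```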

   (?) Any pro-active steps I should be taking to get more info, should it
   happen again? 

   The specifics of my case: my file server (a 750 MHz Athlon running SuSE 9)
   simply locked up, and I couldn't get anything to display (GUI or command
   line). I knew the machine was in trouble when it didn't respond to pings.
   I had to hit the reset button to get it back (and deal with fsck,
   naturally). Funny thing is, the system clock reset itself to 28 minutes
   after midnight (when it should have read the middle of the afternoon), but
   didn't lose the date. Odd, that. The machine's been running 24/7 for about
   three weeks now (I set it up around then), and no sign of problems until
   now. 

     (!) [Thomas] This might be framebuffer related. At the lilo/grub prompt,
     type:

linux video=vga16:off

     (!) [Thomas] to see if that has any effect.

     There have been snippets of these effects mentioned in the past. The one
     that springs to mind is:

     [120]http://linuxgazette.net/issue74/tag/9.html
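
     Once the machine is up, one way to confirm that a boot parameter such as
     the one above actually reached the kernel is to look at /proc/cmdline.
     A minimal sketch; the function name has_boot_param and the CMDLINE_FILE
     variable are hypothetical, chosen only for this example.

```shell
#!/bin/sh
# The kernel exposes the command line it was booted with in this file.
CMDLINE_FILE="${CMDLINE_FILE:-/proc/cmdline}"

# Succeed if the exact parameter given as $1 appears on the command line.
has_boot_param() {
    # Put each space-separated parameter on its own line, then look for
    # a literal, whole-line match.
    tr ' ' '\n' < "$CMDLINE_FILE" | grep -qxF -- "$1"
}

# Example use:
#   has_boot_param "video=vga16:off" && echo "parameter is active"
```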

     (!) [K.-H] There are ways of still getting kernel info (pro active steps):

     * Plug an old printer into the lpX port and declare it the system console
       (a kernel compile parameter, and I don't know how exactly you activate
       it -- maybe inittab).
     * While the system is running, switch to the system console (Ctrl-Alt-F10
       on SuSE) and leave it there. It might show a kernel oops or panic there
       at the next crash.
     * Search the SuSE config for the Magic SysRq keys -- the function should
       be compiled into the kernel but has to be activated. Then you can press
       weird key combinations like Alt-SysRq-P for a register dump, Alt-SysRq-S
       for a disk sync, ... see /usr/src/linux/Documentation for details.
     * File server? What hardware? I had SCSI disks locking my system for
       various reasons (tagged queuing incompatibilities of individual drives,
       cables that were too long, ...)
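
     For the Magic SysRq item, the kernel reports whether the feature is
     active through /proc/sys/kernel/sysrq when it has been compiled in. A
     hedged sketch for checking it; the function name sysrq_status and the
     SYSRQ_CTL variable are invented for this example, and how SuSE makes
     the setting permanent is not covered here.

```shell
#!/bin/sh
# Control file for Magic SysRq; absent if the kernel lacks the feature.
SYSRQ_CTL="${SYSRQ_CTL:-/proc/sys/kernel/sysrq}"

# Print "unavailable", "disabled" or "enabled" depending on the setting.
sysrq_status() {
    if [ ! -r "$SYSRQ_CTL" ]; then
        echo "unavailable"
        return
    fi
    case "$(cat "$SYSRQ_CTL")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;   # any non-zero value turns on some functions
    esac
}

# To switch it on until the next reboot (as root):
#   echo 1 > /proc/sys/kernel/sysrq
```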

   (?) I'm going to keep your response handy -- several things to try.
   Meantime, I realized I was booting the thing into runlevel 5 (rather
   stupid, actually), so I've since changed it to 3. If it is, as someone
   suggested, a framebuffer problem, maybe that will solve it for now. I'm
   using a real old Voodoo 3 card I scrounged from my parts bin. If it happens
   again, I'll have to tear the machine apart and start playing with the
   memory, as someone else here suggested. 

   (?) Installing and configuring Linux is one thing. Learning how to do an
   autopsy seems to be quite another! 

     (!) [Thomas] That's because generally one doesn't do it quite like that.
     Problem diagnosis is situation dependent. In any given situation there is
     often a small set of files and related information that you can analyse
     without having to worry about the rest of the system.

     Granted, this is related to how much information one is told at the time
     (if you've been on this list for as long as I have, you'll come to realise
     that usually we don't get any), and whether or not the person has tried to
     remedy it.

     In general though, poking around, taking an aspect of your system and
     looking at what it does and how, is all related and helpful to you when
     you come to diagnose anything.

   (?) Yes, well, I looked at the messages log, but saw only a gap time-wise
   between cron processing around 4 in the morning, and the time of the crash.
   I'm not sure which of the other logs are important in that case. Where do I
   find the register dump (although I suspect it won't make much sense to me,
   rather like those register dumps you get in Windows XP)? 

     (!) [Thomas] Syslogd might have logged it, if the problem was software
     related, and indeed if the said program produced any errors. If hardware
     then it might not have, depending on the severity of the hardware failure.
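
     A gap like the one described above can be located mechanically. The
     sketch below assumes the traditional syslog line format ("Mon DD
     HH:MM:SS host ...") and only compares day of month and time of day, so
     it will miss a gap that spans a month boundary. The function name
     find_gaps is made up for this example.

```shell
#!/bin/sh
# Report every silence longer than MIN seconds (default 3600) in a syslog
# file, printing the last line written before each gap.
# Usage: find_gaps FILE [MIN]
find_gaps() {
    awk -v min="${2:-3600}" '
    {
        # $2 is the day of month, $3 the HH:MM:SS timestamp.
        split($3, t, ":")
        now = $2 * 86400 + t[1] * 3600 + t[2] * 60 + t[3]
        if (NR > 1 && now - prev > min)
            printf "%d s of silence after: %s\n", now - prev, last
        prev = now
        last = $0
    }' "$1"
}

# Example use:
#   find_gaps /var/log/messages 7200
```

     The last line before the longest silence is usually the most interesting
     one, since it is the last thing syslogd managed to write before the
     machine stopped responding.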

   (?) I'm using a real old Voodoo 3 card I scrounged from my parts bin. If it
   happens again, I'll have to tear the machine apart and start playing with
   the memory, as someone else here suggested. 

     (!) [Thomas] It might be memory, but as the link I gave you last time
     around said, memory problems tend to be more 'visible' in the sense that
     you get a lot of applications dying with SIGSEGV or SIGABRT for no
     apparent reason. In such instances, installing and running 'memtest86' is
     usually of help.

     (!) [K.-H] Most of the time I had the great luck of oopses and kernel