Ubuntu Server Troubleshooting


CHAPTER 14
Ubuntu Server Troubleshooting
Fixing the Most Common Problems

Although Ubuntu Server is an extremely stable server operating system, you might encounter problems occasionally, ranging from a Linux-related issue to a simple hardware failure. In this chapter you’ll learn how to troubleshoot some of the most common problems.

Some say troubleshooting is difficult and requires years of experience. Experience indeed helps, but a good analytical mind is the most important troubleshooting tool. In day-to-day troubleshooting, you first have to determine where exactly a given problem has occurred. If, for example, you have a problem with a kernel module, it doesn’t make much sense to troubleshoot your web server. After determining the location and scope of the problem as well as you can, you can apply your skills to fix the problem. This requires that you have a good understanding of how the erratic system component is supposed to function and can choose the correct tool to repair it.

This chapter first explains how to determine where exactly a problem has occurred. Next, it introduces you to some of the best troubleshooting tools to use. Finally, this chapter identifies some of the most common problems and explains how to fix them.

Note: This chapter assumes that you are familiar with basic principles of Ubuntu system administration. If you want to refresh your knowledge, try Beginning Ubuntu LTS Server Administration, Second Edition, in which I explain essential concepts such as the boot procedure and kernel management.

Identifying the Problem

The most common first step when trying to identify a problem is to reboot your server and wait until the problem occurs. Most problems reveal themselves as your server boots, because most services are activated during the boot process. Therefore, knowing the different stages of the boot process is very important.
If you succeed in determining the stage in which a problem occurs, you have made a good start in troubleshooting the problem. The following list summarizes the different phases in the boot process:

1. Hardware initialization: Hardware initialization occurs during the Power On Self Test (POST). During this phase, your computer reads the BIOSes of the different hardware components and performs a check to see if all devices can initialize properly. If any of them can’t initialize properly, you will not see the Grub prompt appear on your server and your server will warn you with clear error messages or beeps. If that happens, consult the documentation of your server to find out how to fix the hardware issue.

2. Grub loading: After initializing the hardware, the server accesses the boot device and reads the boot loader in the master boot record (MBR), which is the first sector of 512 bytes at the beginning of the bootable hard drive. The MBR includes two important components. First is the Grub boot loader. This system component is installed in the first 446 bytes of the MBR and makes sure that the operating system on your server can load. To do this, Grub accesses its configuration in the directory /boot/grub. Second in the MBR is the partition table. This component is essential for accessing all files on your server. If in this stage there is an error, you typically get a Grub error and, most important, the kernel will not start to load. If there is no error, you can access the Grub menu, displayed in Figure 14-1. So if you see that the kernel has started to load (see Figure 14-2), you know that your server has passed stages 1 and 2 successfully.

Figure 14-1. If you see the Grub menu, the first 446 bytes of the MBR have been read.

Figure 14-2. The kernel has started to load, which indicates the first two stages of your server’s boot procedure have completed successfully.
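
The MBR layout described above can be explored safely on a copy of the sector. The following is a minimal sketch, not taken from the chapter: it builds a synthetic 512-byte image and splits it into the 446-byte boot loader area, the partition table, and the closing signature. The file names mbr.img, bootcode.bin, ptable.bin, and sig.bin are hypothetical. The 64-byte partition table and the 2-byte 0x55 0xAA signature are general facts about the MBR format rather than statements from the text above.

```shell
#!/bin/sh
# Sketch: split a 512-byte MBR image into its three regions.
# mbr.img is a synthetic stand-in (an assumption for illustration only).
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Build a dummy sector: 510 zero bytes followed by the 0x55 0xAA signature.
dd if=/dev/zero of=mbr.img bs=1 count=510 2>/dev/null
printf '\125\252' >> mbr.img    # octal escapes for 0x55 0xAA

# Bytes 0-445: boot loader code (the area Grub is installed in).
dd if=mbr.img of=bootcode.bin bs=1 count=446 2>/dev/null
# Bytes 446-509: partition table (four 16-byte entries).
dd if=mbr.img of=ptable.bin bs=1 skip=446 count=64 2>/dev/null
# Bytes 510-511: boot signature.
dd if=mbr.img of=sig.bin bs=1 skip=510 count=2 2>/dev/null

# Report the size of each piece and the signature bytes in hex.
for f in mbr.img bootcode.bin ptable.bin sig.bin; do
    echo "$f $(wc -c < "$f" | tr -d ' ')"
done
od -An -tx1 sig.bin | tr -d ' \n'; echo
```

On a real server you would replace the synthetic image with a copy of the first sector of the boot disk, for example with dd if=/dev/sda of=mbr.img bs=512 count=1, where /dev/sda is an assumption about the disk name. Working on a copy keeps the troubleshooting read-only.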
CHAPTER 14 N UBUNTU SERVER TROUBLESHOOTING 346 3. Kernel and initrd loading: If you see that the kernel starts loading, that doesn’t guarantee success with regard to the kernel and the ramfs image that contains some required drivers. It’s possible that either the kernel itself or some of the driv- ers associated with the kernel still may not load. If that is the case, you will see the message “kernel panic” in most cases, or sometimes the kernel just stops load- ing, as in the example shown in Figure 14-3. Either way, you know for sure that the error is related to the kernel. You might get a kernel panic if you have tried to recompile the kernel and failed, or if one of the parameters that you have passed to Grub is wrong. A kernel panic can also be caused by a failing kernel module, but this is rare. So, if you’ve just recompiled your kernel and then get a kernel panic when you attempt to reboot, you know what is wrong (you did keep a copy of your old kernel, didn’t you?). If you didn’t recently recompile your kernel, check whether something has changed recently with regard to Grub parameters. If not, you may have a failing driver, or initrd. N Tip Grub by default is configured not to show information about the kernel initialization. To make trouble- shooting easier, I recommend removing the line that reads mqeap from the +^kkp+cnq^+iajq*hop file. If you see a olh]od9 statement, remove that as well. Figure 14-3. If the kernel just stops loading, the problem is definitely in phase 3 of the boot procedure. CHAPTER 14 N UBUNTU SERVER TROUBLESHOOTING 347 4. Upstart: On Ubuntu Server, Upstart is responsible for starting the ejep process and associated essential services. To do this, Upstart executes all scripts it finds in the directory +ap_+arajp*` (see Listing 14-1). You will rarely see messages that are related to Upstart itself, because it is just the service that is responsible for loading other services. 
If, however, none of the services on your server can initialize, or you get an error related to init (such as you can see in Figure 14-2), something may be wrong with Upstart. Make sure that its configuration directory, /etc/event.d, is readable.

Listing 14-1. To Start Important System Services, Upstart Reads the Configuration Files in /etc/event.d

    root@MYL:/etc/event.d# ls
    control-alt-delete  rc1  rc4  rc-default   sulogin  tty3  tty6
    logd                rc2  rc5  rcS          tty1     tty4
    rc0                 rc3  rc6  rcS-sulogin  tty2     tty5

5. Essential services: Once Upstart has loaded, it starts executing the scripts it finds in /etc/event.d. Basically, these scripts don’t start anything themselves; they just redirect you to other scripts that are in the directory /etc/init.d and are executed from the directory that corresponds to the current runlevel. For example, if you are currently in runlevel 3, the services are started from the directory /etc/rc3.d (see Listing 14-2). There is such a directory for every runlevel between 0 and 6, inclusive, determining exactly what should be started when entering a runlevel. As you can see in Listing 14-2, the runlevel directories don’t contain real files, but instead contain symbolic links to files that are located in the directory /etc/init.d. Here the system finds the real services that it should start. If one of these scripts fails, you typically see an error. Because these are essential services, such as the service that loads file systems, your system will most likely stop, giving you a clear indication of what is wrong. If the problem is obvious, you can just fix it. In some cases, the problem might not be obvious, in which case you should look at the order in which the scripts are started and try to deduce from that order which script failed.
For instance, if you notice that the SSH process never gets loaded, the problem is most likely in one of the scripts executed just before it.

Listing 14-2. The Order of the Runlevel Scripts May Help You to Find Which Script Failed

    root@mel:/etc/rc3.d# ls -l
    total 4
    -rw-r--r-- 1 root root 556 2008-04-19 01:05 README
    lrwxrwxrwx 1 root root 18 2008-04-29 14:52 S10sysklogd -> ../init.d/sysklogd
    lrwxrwxrwx 1 root root 34 2008-05-01 06:15 S10xserver-xorg-input-wacom -> ../init.d/xserver-xorg-input-wacom
    lrwxrwxrwx 1 root root 15 2008-04-29 14:52 S11klogd -> ../init.d/klogd
    lrwxrwxrwx 1 root root 14 2008-06-[...] 14:48 S12dbus -> ../init.d/dbus
    lrwxrwxrwx 1 root root 17 2008-08-25 14:42 S16openvpn -> ../init.d/openvpn
    lrwxrwxrwx 1 root root 14 2008-04-30 14:55 S16ssh -> ../init.d/ssh
    lrwxrwxrwx 1 root root 23 2008-08-15 05:57 S17mysql-ndb-mgm -> ../init.d/mysql-ndb-mgm
    lrwxrwxrwx 1 root root 17 2008-05-17 11:34 S17portmap -> ../init.d/portmap
    lrwxrwxrwx 1 root root 19 2008-08-15 05:57 S18mysql-ndb -> ../init.d/mysql-ndb
    lrwxrwxrwx 1 root root 14 2008-05-17 11:34 S18nis -> ../init.d/nis
    lrwxrwxrwx 1 root root 15 2008-08-15 05:57 S19mysql -> ../init.d/mysql
    lrwxrwxrwx 1 root root 24 2008-05-01 14:52 S19postgresql-8.3 -> ../init.d/postgresql-8.3
    lrwxrwxrwx 1 root root 15 2008-08-11 02:53 S19slapd -> ../init.d/slapd
    lrwxrwxrwx 1 root root 21 2008-06-11 09:[...] S20dhcp3-relay -> ../init.d/dhcp3-relay
    lrwxrwxrwx 1 root root 14 2008-05-01 14:52 S20ebox -> ../init.d/ebox
    lrwxrwxrwx 1 root root 15 2008-07-09 03:05 S20exim4 -> ../init.d/exim4
    lrwxrwxrwx 1 root root 17 2008-05-27 15:35 S20ifplugd -> ../init.d/ifplugd
    lrwxrwxrwx 1 root root 21 2008-07-30 04:09 S20iscsitarget -> ../init.d/iscsitarget
    lrwxrwxrwx 1 root root 14 2008-06-20 15:02 S20kvm -> ../init.d/kvm
    lrwxrwxrwx 1 root root 21 2008-08-11 10:46 S20libnss-ldap -> ../init.d/libnss-ldap
hnstnstnst-nkkpnkkp.-.,,4),2).,-16,.O.,he^renp)^ej):**+ejep*`+he^renp)^ej hnstnstnst-nkkpnkkp.,.,,4),1)-3--6/0O.,jbo)_kiikj):**+ejep*`+jbo)_kiikj hnstnstnst-nkkpnkkp.3.,,4),1)-3--6/0O.,jbo)ganjah)oanran): ± **+ejep*`+jbo)ganjah)oanran hnstnstnst-nkkpnkkp./.,,4),1)-3--6/0O.,klaj^o`)ejap`): ± **+ejep*`+klaj^o`)ejap` hnstnstnst-nkkpnkkp-2.,,4),2)./,56-.O.,mq]cc]):**+ejep*`+mq]cc] hnstnstnst-nkkpnkkp-1.,,4),0).5-06,-O.,nouj_):**+ejep*`+nouj_ hnstnstnst-nkkpnkkp-1.,,4),0)/,-2614O.,o]i^]):**+ejep*`+o]i^] hnstnstnst-nkkpnkkp-3.,,4),1)-3-46.5O.,ouoop]p):**+ejep*`+ouoop]p hnstnstnst-nkkpnkkp-5.,,4),1)-3--6/0O.,pbpl`)dl]):**+ejep*`+pbpl`)dl] CHAPTER 14 N UBUNTU SERVER TROUBLESHOOTING 349 hnstnstnst-nkkpnkkp-3.,,4),4)-0,56/2O.,sej^ej`):**+ejep*`+sej^ej` hnstnstnst-nkkpnkkp-2.,,4),2)--,560,O.,tejap`):**+ejep*`+tejap` hnstnstnst-nkkpnkkp-4.,,4),4)-0,06,4O.-mqkp]nl_):**+ejep*`+mqkp]nl_ hnstnstnst-nkkpnkkp-0.,,4),1)-3--6/0O./jpl):**+ejep*`+jpl hnstnstnst-nkkpnkkp-1.,,4),0).5-061.O.1i`]`i):**+ejep*`+i`]`i hnstnstnst-nkkpnkkp-3.,,4),3),5,/6,1O/,j]ceko.):**+ejep*`+j]ceko. hnstnstnst-nkkpnkkp-1.,,4),2).1-06.4O/,omqe`):**+ejep*`+omqe` hnstnstnst-nkkpnkkp .,,4),1)-3--6/0O0,`d_l/)oanran): ± **+ejep*`+`d_l/)oanran hnstnstnst-nkkpnkkp.2.,,4),1)-3-261-O0,`n^h)_heajpo)j]p): ± **+ejep*`+`n^h)_heajpo)j]p hnstnstnst-nkkpnkkp-0.,,4),3)/,,06,.O3,`n^`):**+ejep*`+`n^` hnstnstnst-nkkpnkkp-5.,,4),3)/,,06,.O31da]np^a]p):**+ejep*`+da]np^a]p hnstnstnst-nkkpnkkp-0.,,4),0).5-06,,O45]p`):**+ejep*`+]p` hnstnstnst-nkkpnkkp-0.,,4),0).5-06,,O45_nkj):**+ejep*`+_nkj hnstnstnst-nkkpnkkp-3.,,4),1),--061-O5-]l]_da.):**+ejep*`+]l]_da. 
hnstnstnst-nkkpnkkp-4.,,4),0).5-061.O55n_*hk_]h):**+ejep*`+n_*hk_]h hnstnstnst-nkkpnkkp-5.,,4),0).5-061.O55nijkhkcej):**+ejep*`+nijkhkcej 6. Networking: Among the most important of the nonessential services is network- ing. If networking fails, many other services will fail as well. So if you see that many services that depend on networking fail, check your network configuration. The network is started from the script +ap_+ejep*`+japskngejc . This script reads in +ap_+ japskng+ejpanb]_ao which network configuration it should start (see Listing 14-3). If something is wrong with your network, the most likely problem is an error in this script. Test network connectivity after you think you have fixed a network problem; ping is still the best utility to perform such tests. Listing 14-3. The /etc/init.d/networking Script Learns from /etc/network/interfaces Which Configuration to Initialize nkkp<iah6+ap_+japskng_]pejpanb]_ao Pdeobeha`ao_ne^aopdajapskngejpanb]_ao]r]eh]^hakjukqnouopai ]j`dkspk]_per]papdai*Bkniknaejbkni]pekj(oaaejpanb]_ao$1%* Pdahkkl^]_gjapskngejpanb]_a ]qpkhk eb]_ahkejaphkkl^]_g CHAPTER 14 N UBUNTU SERVER TROUBLESHOOTING 350 Pdalnei]nujapskngejpanb]_a ]qpkapd, eb]_aapd,ejapop]pe_ ]``naoo-5.*-24*-*55 japi]og.11*.11*.11*, japskng-5.*-24*-*, ^nk]`_]op-5.*-24*-*.11 c]pas]u-5.*-24*-*.10 `jo)&klpekjo]naeilhaiajpa`^updanaokhr_kjbl]_g]ca(eb ± ejop]hha` `jo)j]iaoanrano-5/*35*./3*/5 `jo)oa]n_do]j`anr]jrqcp*jh ]qpk^n, eb]_a^n,ejapop]pe_ ]``naoo-5.*-24*-*55 japskng-5.*-24*-*, japi]og.11*.11*.11*, ^nk]`_]op-5.*-24*-*.11 c]pas]u-5.*-24*-*.10 ^ne`ca[lknpoapd, ^ne`ca[b`, ^ne`ca[dahhk. ^ne`ca[i]t]ca-. 
^ne`ca[opklkbb ]qpkapd- eb]_aapd-ejapop]pe_ ]``naoo-,*,*,*-, japi]og.11*.11*.11*, japskng-,*,*,*, ^nk]`_]op-,*,*,*.11 7. Nonessential services: If you have made it this far, basically, your server is opera- tional. You still might have a service fail, though. If one of your services fails, the most likely problem is a configuration error in the service script. Check the docu- mentation about your service and try to repair the script. Once you have arrived at this stage, at least you know for sure that the problem exists in a particular service, so you can start troubleshooting at the right location. CHAPTER 14 N UBUNTU SERVER TROUBLESHOOTING 351 Troubleshooting Tools There are some very useful tools that you must have available before you start a trouble- shooting session: s ejep9+^ej+^]od : This Grub option enables you to load a shell immediately after the kernel has loaded successfully. sRescue a Broken System option: This option on the Ubuntu Server installation CD takes you to an environment in which you can apply your troubleshooting techniques. sA Linux live CD: One of my personal favorites, and thus covered in this section, is Knoppix ( dppl6++sss*gjkllet*_ki ), a live CD that contains lots of useful utilities that help you to troubleshoot a failing server. If you choose a different live CD, find one that doesn’t have restrictions and takes you to an unlimited root shell as fast as possible. Working with init=/bin/bash The tool that is easiest to use is the option ejep9+^ej+^]od that you can pass to Grub when booting. It takes you to the end of the third stage of the boot procedure, right after the kernel and initrd have been loaded. This option is useful in cases where you have found that the kernel can load successfully, but there is an essential problem later in the boot procedure. Here is how you can activate it: 1. Reboot your server. 
During the three seconds that Ubuntu Server by default shows the Grub prompt, press Escape to access the options available in the Grub menu, an example of which is shown in Figure 14-4.

2. Select the line that has the kernel image you want to start (typically, this is the first line) and press e to edit the commands that are in this boot loader menu option. This shows you the lines in the /boot/grub/menu.lst file that are defined for this section (see Figure 14-5).

Figure 14-4. From the Grub menu, you can pass options to the boot loader.

Figure 14-5. By selecting the section you want to start, you see the different lines that make up that section.

[...] procedure describes how it works:

1. Put the Ubuntu Server installation CD in your server’s optical drive and reboot your server. Make sure it boots from the optical drive.

2. When you see the installation interface shown in Figure 14-7, select Rescue a Broken System and press Enter.

Figure 14-7. On the Ubuntu Server installation CD, you’ll find an option [...]

[...] your troubleshooting session (see Figure 14-6).

4. You are now in a bash shell, without anything being mounted or started for you. This offers you an excellent starting point for troubleshooting. Mount your file systems and execute all services that you want to test by hand.

Figure 14-6. Using the option init=/bin/bash is the quickest way to access a troubleshooting shell.

Rescue a Broken System

The Ubuntu Server [...]

[...] your server loads the correct code table and keyboard settings.

4. After loading the appropriate keyboard settings, if your server has multiple network cards, you have to specify which network card you want to use. Your server then tries to get an IP address from the DHCP server on your network. Next, enter a temporary hostname. It doesn’t really matter what you choose here, so the default hostname Ubuntu [...]

[...] Listing 14-5, and can start troubleshooting.

Listing 14-5. Use chroot for Troubleshooting

At this point you are ready to use your troubleshooting environment. In the next section you will read about some scenarios in which a rescue environment like the Knoppix Live CD is useful.

Note: There is no fundamental difference between using the Knoppix Live CD and using the Ubuntu Server Installation CD for your [...]

[...] is compiled. If it is enabled, it has the value [...]; if it’s not, it has the value [...]. On Ubuntu Server, it is enabled by default. If on your server it is not enabled for some reason, put the following line in [...] and reboot your server to make sure that [...] is enabled by default (see Chapter 4 for more information on [...]):

[...]

Now when the server hangs, press Alt+Print Screen+t to tell your system to dump a stack trace to [...]

[...] just hard reset your server. Both options restart your server.

Working with a Knoppix Rescue CD

If you choose to work from a generic rescue disk, Knoppix is a good choice that offers you complete flexibility in repairing your server. You can download Knoppix from [...]. In this section you’ll read how to boot from Knoppix and how to enter a [...] environment in which you can troubleshoot your Linux server.

[...] Knoppix CD offers many useful utilities. In the next section, use whichever solution you prefer.

Common Problems and How to Fix Them

Although Ubuntu Server is a fairly stable server platform, you may encounter some problems. This section gives you some hints for troubleshooting the following common problems: [...]

Grub Errors

The very first [...]

[...] can read the boot information that your server uses (see Figure 14-15 for an example).

Figure 14-15. Read the menu.lst file for an example of the options your server normally uses when booting.

5. Enter the [...], [...], and [...] lines from your default section in [...]. Then, type boot to start booting your server. Observe your server at the same time to make sure that [...]

[...] volumes on bootup of the host server. Therefore, the host server will find LVM volumes within the LVM devices and just activate them. The result is that the virtual server that is supposed to use these volumes finds that they are already in use and concludes that it can’t use them. The solution is to exclude the LVM devices from being scanned for LVM volumes when the host server boots. To exclude LVM devices, [...]

Troubleshooting goes much better from a [...] work with your Ubuntu Server file system from a mounted directory; instead, you actually change the root of the rescue disk to this directory. The advantage of this is that all utilities will work with their native paths. For instance, if a command like [...] expects its [...] file to be in [...], the utility is not going to work if, due to the fact that you have mounted your server [...]
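
The change-of-root approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter’s own listing: the device name /dev/sda1, the mount point /mnt/rescue, and the helper names step and enter_chroot are all hypothetical. By default the script only prints the commands it would run; it executes them only when RUN=1 is set, which requires root privileges on a rescue system.

```shell
#!/bin/sh
# Sketch: prepare and enter a chroot on the server's root file system
# from a rescue disk. ROOT_DEV and TARGET are assumptions; adjust them.
set -e
ROOT_DEV=${ROOT_DEV:-/dev/sda1}
TARGET=${TARGET:-/mnt/rescue}

# Print each command; run it only when RUN=1 is set (dry run otherwise).
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

enter_chroot() {
    step mkdir -p "$TARGET"
    step mount "$ROOT_DEV" "$TARGET"
    # Make device nodes and kernel interfaces visible inside the new root.
    step mount --bind /dev "$TARGET/dev"
    step mount -t proc proc "$TARGET/proc"
    step mount -t sysfs sysfs "$TARGET/sys"
    # After this command, / refers to the server's own file system,
    # so utilities find their files at their native paths.
    step chroot "$TARGET" /bin/bash
}

enter_chroot
```

Running the script without RUN=1 is a safe way to review the plan first. Inside the chroot, paths such as /etc and /boot refer to the server’s own directories rather than to those of the rescue disk.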

Date posted: 19/10/2013, 02:20
