Channel: ProLiant Servers (ML,DL,SL) topics

logical drive unavailable after a disk was replaced


Hi.

 

My server has a long history, and the problem only makes sense with that background. It originally had a P212 controller and twelve 1.5 TB drives. I used the first two drives as a system mirror, but since 1.5 TB was more than the system needed, I split the mirror into two slices. I then created two RAID 5 arrays of five and four drives, left one drive as a spare, and built a ZFS pool from the two RAID 5 logical drives plus one of the mirror slices mentioned above. Since the server was intended for storing slow static objects, this layout was acceptable.

 

It ran Solaris 10 this way for several years. Then a drive in the system failed, and I replaced it. Then the controller died. This is where the actual story starts: I replaced it with a P410. Everything was still fine. Some months later a drive died in the second RAID 5 array (which kept running fine!). Everything was fine until my technician replaced the drive. One detail: since we could not quickly find a 1.5 TB drive, we decided to replace the faulty one with a 2 TB drive. I had also had some issues with the memory, so we decided to clean the dust off the memory modules. The technician took the server offline, replaced the drive, and cleaned the memory chips at the same time. After booting up I had lost the pool, and it started looking like this:

 

pool: datatank
    id: 11340815205521362361
 state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        datatank    UNAVAIL  insufficient replicas
          c4t0d0p2  ONLINE
          c4t1d0    ONLINE
          c4t2d0    UNAVAIL  corrupted data <--- this is the R5 array where the disk was replaced

 

Also, when running format I started to see what looks like the same sector/cylinder configuration for logical drives 2 and 3 (the 1st and 2nd RAID 5 arrays), which is weird considering one is more than 1 TB bigger. It is as if the system were unable to see something. For some time the controller showed the 3rd LD as "Recovering", and I hoped that once the rebuild completed I would get my pool back. Unluckily, it remained in this exact state.
I tried powering the server down via IPMI and doing a reconfiguration reboot with reboot -- -rv, but I still don't see the pool. I have backups and my data is unaffected, but I want to understand what happened.

 

Any ideas ?

 

hpacucli says everything is fine:

 

hpacucli ctrl slot=1 ld all show

Smart Array P410 in Slot 1

   array A

      logicaldrive 1 (1.4 TB, RAID 1, OK)

   array B

      logicaldrive 2 (5.5 TB, RAID 5, OK)

   array C

      logicaldrive 3 (4.1 TB, RAID 5, OK)
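
A rough sanity check of the sizes reported above can be done with simple arithmetic. This is only a sketch: it assumes every array member is treated as a 1.5 TB (decimal) drive, that RAID 5 loses one drive's worth of space to parity, and that the controller reports sizes in binary units (TiB) while labelling them "TB". The function names are made up for illustration.

```python
# Sketch: expected usable capacity of the arrays, under the assumptions
# stated in the text (1.5 TB decimal drives, sizes reported in TiB).
DRIVE_BYTES = 1500 * 10**9   # one 1.5 TB drive, decimal units
TIB = 2**40                  # bytes per TiB


def raid5_usable_tib(drives: int, drive_bytes: int = DRIVE_BYTES) -> float:
    """RAID 5 stores one drive's worth of parity, so usable = (n - 1) drives."""
    return (drives - 1) * drive_bytes / TIB


def raid1_usable_tib(drive_bytes: int = DRIVE_BYTES) -> float:
    """A two-drive mirror exposes the capacity of a single drive."""
    return drive_bytes / TIB


if __name__ == "__main__":
    print(f"RAID 1, 2 drives: {raid1_usable_tib():.2f} TiB")
    print(f"RAID 5, 5 drives: {raid5_usable_tib(5):.2f} TiB")
    print(f"RAID 5, 4 drives: {raid5_usable_tib(4):.2f} TiB")
```

Under these assumptions the results line up with the 1.4, 5.5 and 4.1 "TB" figures from hpacucli, which suggests logical drive 3 still has its original size even though one member is now a 2 TB drive.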

 

format shows this (notice that the 1st and 2nd RAID 5 arrays seem to be identical in fdisk, which should be impossible):

 

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c4t0d0 <DEFAULT cyl 3480 alt 2 hd 255 sec 189>
          /pci@0,0/pci8086,340a@3/pci103c,3243@0/sd@0,0
       1. c4t1d0 <HP-LOGICAL VOLUME-6.40-5.46TB>
          /pci@0,0/pci8086,340a@3/pci103c,3243@0/sd@1,0
       2. c4t2d0 <HP-LOGICAL VOLUME-6.40-4.09TB>
          /pci@0,0/pci8086,340a@3/pci103c,3243@0/sd@2,0
Specify disk (enter its number): 1
selecting c4t1d0
[disk formatted]


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        fdisk      - run the fdisk program
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> fdisk
             Total disk size is 60799 cylinders
             Cylinder size is 192780 (512 byte) blocks

                                               Cylinders
      Partition   Status    Type          Start   End   Length    %
      =========   ======    ============  =====   ===   ======   ===
          1                 EFI               0  60798    60799    100





SELECT ONE OF THE FOLLOWING:
   1. Create a partition
   2. Specify the active partition
   3. Delete a partition
   4. Change between Solaris and Solaris2 Partition IDs
   5. Exit (update disk configuration and exit)
   6. Cancel (exit without updating disk configuration)
Enter Selection: 6


format> disk


AVAILABLE DISK SELECTIONS:
       0. c4t0d0 <DEFAULT cyl 3480 alt 2 hd 255 sec 189>
          /pci@0,0/pci8086,340a@3/pci103c,3243@0/sd@0,0
       1. c4t1d0 <HP-LOGICAL VOLUME-6.40-5.46TB>
          /pci@0,0/pci8086,340a@3/pci103c,3243@0/sd@1,0
       2. c4t2d0 <HP-LOGICAL VOLUME-6.40-4.09TB>
          /pci@0,0/pci8086,340a@3/pci103c,3243@0/sd@2,0
Specify disk (enter its number)[1]: 2
selecting c4t2d0
[disk formatted]
format> fdisk
             Total disk size is 60799 cylinders
             Cylinder size is 144585 (512 byte) blocks

                                               Cylinders
      Partition   Status    Type          Start   End   Length    %
      =========   ======    ============  =====   ===   ======   ===
          1                 EFI               0  60798    60799    100





SELECT ONE OF THE FOLLOWING:
   1. Create a partition
   2. Specify the active partition
   3. Delete a partition
   4. Change between Solaris and Solaris2 Partition IDs
   5. Exit (update disk configuration and exit)
   6. Cancel (exit without updating disk configuration)
Enter Selection: 6


format> quit
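
The two fdisk geometries can be compared by multiplying them out. This is a minimal sketch, assuming the total size is simply cylinders × blocks per cylinder × 512 bytes, using the numbers printed above; the helper name is hypothetical.

```python
# Sketch: total size implied by the fdisk geometry of each logical drive.
SECTOR_BYTES = 512  # fdisk reports cylinder size in 512-byte blocks
TIB = 2**40


def geometry_bytes(cylinders: int, blocks_per_cylinder: int) -> int:
    """Total bytes implied by a cylinder count and a cylinder size in blocks."""
    return cylinders * blocks_per_cylinder * SECTOR_BYTES


# Values copied from the fdisk output for each device.
disks = {
    "c4t1d0": (60799, 192780),
    "c4t2d0": (60799, 144585),
}

if __name__ == "__main__":
    for name, (cyl, blocks) in disks.items():
        size = geometry_bytes(cyl, blocks)
        print(f"{name}: {size} bytes, {size / TIB:.2f} TiB")
```

Both devices show 60799 cylinders, but the cylinder size differs (192780 versus 144585 blocks), so the implied totals are different and come out close to the 5.46TB and 4.09TB labels in the disk list.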

 

Thanks.

