Problem Description
The main control board of an OSN3500 (host version 5.21.13.47) failed and had to be replaced. As soon as the new main control board was inserted and powered on, services across the whole network were interrupted. Services returned to normal after the configuration data was downloaded to the network element.
Alarm information
None
Processing
Re-download the configuration data to the network element.
Root cause
Because all services on the device were interrupted, we first suspected that the host had automatically issued a verify after the new main control board started, causing the new board to send its empty configuration data to the other boards. However, the black box records from the time of the failure contain no verify, and the network element reported the ne_install alarm after the new main control board started working. This means that no verify was issued on the network element side.
The black box of the network element does contain many records of both the main and backup cross boards requesting to start work. When the new main control board started, the main and backup cross boards each received a response to their startup request.
The only explanation consistent with the black box records is that the main and backup cross boards had already been reset before the main control board was replaced. After the reset they requested startup from the main control board, but the request was never confirmed by the host, so the boards kept requesting startup without limit. Once the new main control board started, it responded to the cross boards' startup requests. Because the new main control board was in a safe state, it automatically issued the add-board operation for the cross boards, which interrupted services.
Recommendations and Summary
Before replacing the main control board, query the status of the cross boards. Use the command cfg-get-workstat: bid to confirm that both the main and backup cross boards are in the working state. If the main and backup cross boards are not in the working state (services are not interrupted while the boards remain in this state), replacing the main control board is bound to interrupt services.
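The pre-replacement check above can be sketched as a small decision helper. This is only an illustration under stated assumptions: the operator has already run the status query for each cross board and recorded the result by hand, the board identifiers ("9" and "10") are hypothetical, and the status label "working" is an assumed spelling, since the exact text returned by the network element is not given in this case.

```python
from typing import List, Mapping

# Assumed label for a healthy board. The real string returned by the
# network element may differ and should be adjusted accordingly.
WORKING = "working"


def boards_blocking_replacement(xc_states: Mapping[str, str]) -> List[str]:
    """Return the cross boards whose recorded status is not 'working'."""
    return sorted(
        board
        for board, state in xc_states.items()
        if state.strip().lower() != WORKING
    )


def safe_to_replace_control_board(xc_states: Mapping[str, str]) -> bool:
    """True only if at least one board was checked and all are working."""
    if not xc_states:
        # No status recorded: treat as unsafe rather than assume it is fine.
        return False
    return not boards_blocking_replacement(xc_states)


if __name__ == "__main__":
    # Hypothetical readings for the main and backup cross boards.
    readings = {"9": "working", "10": "not working"}
    if safe_to_replace_control_board(readings):
        print("Both cross boards are working; replacement can proceed.")
    else:
        blocked = ", ".join(boards_blocking_replacement(readings))
        print(f"Do not replace yet; check cross board(s): {blocked}")
```

The helper treats a missing reading as unsafe, which matches the recommendation that both the main and backup cross boards must be confirmed as working before the main control board is replaced.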







