Software Reliability Growth Model for FPGA-based Safety Critical Software System

Satrio Pradana, Jaecheon Jung*
Department of Nuclear Power Plant Engineering
KEPCO International Nuclear Graduate School, Ulsan, South Korea 45014
*Corresponding author: jcjung@kings.ac.kr

1. Introduction

In the NPP design, FPGA technology is mainly applied for safety critical I&C systems such as the Reactor Protection System (RPS). Recently, the FPGA technology is more and more extensively applied both for the new NPP I&C system design and for updating the obsolete systems of operating plants especially the safety system. The role of FPGA-based systems has become very significant; therefore, the reliability evaluation of the FPGA-based systems has drawn the attention of researchers [1].

As digital systems are continuously being introduced into nuclear power plants, the needs of reliability analysis for digital systems are increasing. Kang and Sung [2] identified (1) a piece of software’s reliability, (2) common-cause failures (CCFs), and (3) fault coverage as the three most critical factors during the reliability analysis of the digital system. For a reliability estimation of the safety-critical software (the software that is used in safety-critical digital systems), the FPGA-based need an approach to estimate the reliability and predicting the failure of software.

In this work, an attempt is made to analyze the reliability of FPGA-based system considering the software reliability growth model (SRGM) methodology – as the NRC Technical reference NUREG/CR-6101 reports and IEEE Std. 1633 that the SRGM is one of the possible methodologies to model the instrumentation and control system [3][4].

2. Methods

Several steps have to be taken in order to obtain failure data and analyze the reliability. Plant Protection System (PPS) are one of safety critical safety system in Nuclear Power Plant. It has a function to trip the reactor through bistable logic controller and generate trip signals based on the measurement channel value exceeding a setpoint then transmit the signals to the Coincidence Processor (CP) located in four redundant channels. PPS receives sixteen (16) signals indicating safety-related plant conditions: fourteen (14) analog signals and two (2) digital signals. Besides the sixteen signals, manual trip signals by operator are provided. In this work, we only focus on Low Pressurizer Pressure Trip (LPPT).

2.1 Test Case

Safety critical software applications often require proof that they have been thoroughly tested. Hence, programmers and testers are expected to write good test cases [5] which can verify the behavior of the entire system. However, in real-life applications, exhaustive testing is impractical as the input domain could be extremely large or infinite. Thus, the main challenge is to demonstrate the adequacy of testing effectively.

Figure 1. Vivado simulation for PPS LPPT
According to IEEE 1633-2016, Recommended Practice on Software Reliability, test cases from black box testing can be used as operational profiles to support test selection both for collect failure data and verification & validation activities.

Test cases were then combined into test benches for the entire system. The design simulation was performed using Vivado. Figure 1 illustrates the simulation environment. Based on the generated waveforms, the verification of the operation of the different system modules was performed. All the faults from this testing phase are gathered. It helps collected and making a dataset to be applied in software reliability growth model.

2.2 Test Bench

A well-established test benches are required to get high quality of testing.

2.3 White Box Testing

White-box testing is a method of testing the application at the level of the source code. It focuses on internal coding, flow of inputs and outputs through the application. White-box testing is developed with the test cases by executing methods and often used for verification phase by the programmer or independent test. Since this development of FPGA-based PPS has the VHDL code, the white-box testing can be used for verification process. In the other way, black box testing is the functional and behavioral testing, focuses on determining whether or not a program does what it is supposed to do based on its functional requirements [6].

Instead of black box testing, white box testing are used to collect failure data [7] as well as for V&V activities. In this research, white box testing was performed for PPS LPP bistable function using VIVIDO simulator and Aldec HDL Program. Figure 2 show a simulation of white box testing using Aldec HDL program.

This white box testing is only focus on internal behavior from input and output. Simulation on Vivado and Aldec are behavioral simulation. Synthesis, place & route for actual implementation of FPGA is not covered in this thesis.

Code coverage is a technique that allows engineers to collect the statistics on the execution of each line of HDL code, and evaluate the quality of their test. Code coverage can be roughly divided into statement coverage and branch coverage. Statement coverage provides information on which statements inside the VHDL or Verilog code were executed during simulation and how many times. Branch coverage examines the execution of conditional statements. It provides the data on which branches were executed during the simulation, how many times each branch was executed, and how many times the branch condition evaluated to true or false [8].

Code coverage and functional coverage are extensively used during validation to evaluate the effectiveness of testing. Ideally, designs must reach 100% statement, branch and functional coverage, however exceptions are made if it is known that a given coverage point is unreachable or not important.

![Figure 3 Coverage testing result](image1)

**Figure 3 Coverage testing result**

PPS LPPT VHDL test cases are generated to meet the 100% code coverage. ALDEC software tool is used to perform the code coverage test. Figure 3 shows coverage testing result.

3. Software Reliability Tools

The several software reliability tools are available to apply one or more of the software reliability model to a development effort and to determine the applicability of a particular model to a set of failure data. A major issue in modeling software reliability lies in the ease-of-use of currently available tools.

![Figure 2 White box testing of LPPT using Aldec HDL Program](image2)

**Figure 2 White box testing of LPPT using Aldec HDL Program**
Following tasks are handled by the SRE tools:
a) Collecting failure and test time information
b) Calculating estimates of model parameters using information available.
c) Testing to fit a model against the collected information.
d) Selecting a model to make predictions of remaining faults, time to test, etc.
e) Applying the model

As a recommendation from literature review as shown in the comparison [9] Table 1 and according to the availability of using free version of the tool, SMERFS tool has been selected.

<table>
<thead>
<tr>
<th>Table 1. Comparison table of tools</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Factors</strong></td>
</tr>
<tr>
<td>Language</td>
</tr>
<tr>
<td>Performance &amp; usability</td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td>Number of models supported</td>
</tr>
<tr>
<td>Available models for Estimation Prediction</td>
</tr>
<tr>
<td>Graphics</td>
</tr>
<tr>
<td>User assistance</td>
</tr>
</tbody>
</table>

The software reliability prediction tool is SMERFS (Statistical Modeling and Estimation of Reliability Functions for Software), a well-known and widely accepted software application for evaluation of test data for failure rate and defect discovery rate prediction. The version of SMERFS used in this study included a total of 15 different reliability growth models. The input to SMERFS is a set of values consisting either of the time between discoveries of defects or the number of defects discovered per time period.

SMERFS then uses maximum likelihood methods or least squares methods to estimate the parameters used for one or more of these models (depending on the type of input and user selected options). Its output includes the parameter estimates, predicted values and a measure of the goodness-of-fit using the chi-squared distribution.

SMERFS is used to evaluate the different models and validate the software reliability according to the following steps: Perform Assessment/Prediction Analysis and Forecast Additional Test duration. There are two types of models in SMERFS:

a) Interval data counts models
   Typical interval data count models, including Generalized Poisson model (GPO), Brooks and Motley Poisson model (BMP), and binomial model (BMB), GO (also called NHPP) model, and Yamada delayed S-shaped model (YAM).

b) Failure-count models
   Typical FC models, including LV, geometric model (GEO), Musa Basic (MB), Jelinski/Moranda (JM), and Musa Okumoto (MO) [10].

4. Predicting Software Reliability Growth Model

For this task, SMERFS program are used. SMERFS pull-down menus are concise and leave little room for misunderstanding. Firstly, specify whether the input is time-between-failure or interval data. For interval models, failure counts for every interval, even those without failures, must be provided in a text file with care not to have a blank line at the end of the file. The models can be selected and executed but better to let SMERFS do the selections with accuracy analyses for the models. If the data are grossly inappropriate for a model, the tool will informs.

The following models are abbreviated in SMERFS Program:

- BMB: Brook and Motley’s Binomial
- BMP: Brook and Motley’s Poisson
- GP-(1-3): Generalized Poisson (Treatment 1-3)
- NHP: Non-homogeneous Poisson
- S-T(1-3): Schneidewind (Treatment 1-3)
- YAM: Yamada S-shaped

The data collected during 19 times execution of testing. When considering the whole dataset, the Table 2 shows the data gathered in format count of failure by interval. Failure data obtained during development phase. From several testing, data was chosen for 19 execution times with 33 failures.

<table>
<thead>
<tr>
<th>Table 2 PPS LPPT Failure Dataset</th>
</tr>
</thead>
<tbody>
<tr>
<td>No.</td>
</tr>
<tr>
<td># of Faults</td>
</tr>
</tbody>
</table>
From dataset given above, SMERFS program performs prediction for software reliability growth with accuracy analyses. After execution, Figure 4 shows the graph from all the model executed. The green nodes indicates the observed faults, and the graphs with different color indicates prediction of the models.

Figure 4 Failure Predicted Data plot from SMERFS program

Figure 5 Summary for execution of dataset

Figure 5 shows the summary for execution of dataset. It gives statistics of the data, accuracy rank, total number of faults (TNOF), total number of faults remaining (TNOFR), and chi squares are also included. Based on execution result, almost all model are executed except Generalized Poisson-3 (GP-3). It marked by red color that shown in Figure 5, means the model are not executed by reason of model are not suit for the dataset. For Schneidewind Treatment 2 (S-T2) and 3 (S-T3) model are marked by grey color, means the dataset are not applicable for the model.

5. Conclusion

This paper presents an approach for assessing a reliability measurement of safety critical software FPGA-based for plant protection system. The quality of model of the software reliability model also presented based on several test in verification and validation activities of FPGA-based system. The approach require numerous testing and management engineering before beginning of testing. Evaluating the model of software reliability growth is important to select the best model are fit the observed fault trends. After model are selected, reliability estimation can be performed to reduce the number of iteration in fixing the failure during development and reduces the likelihood of design errors because it allows tests quality to be increased in respect to an objective measurement.

Acknowledgements

This work was supported by the 2018 Research fund of KEPCO International Nuclear Graduate School (KINGS), Republic of Korea.

REFERENCES

[8] ALDEC, Collecting Code Coverage in Active-HDL, August, 2018