ePPR enable script

September 15, 2020

by Ali Natiq

Intended Audience Level: Intermediate

Code Sample Type: Complete Script

Nutanix Technologies: General

Minimum Product Version: 5.10

Script/Code Language: Python

REST API Sample? No

REST API Version: N/A

ePPR (Enhanced PPR) is a method available in newer BIOS versions on G6/G7 platforms that proactively detects and repairs DIMM errors.

This script fetches the BIOS configuration file from a node, changes the ePPR parameter from Disable to Enable (or vice versa), and pushes the updated configuration file back to the node, so that the new setting takes effect on the node's next reboot.
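The core of that update is a text substitution on the line holding the Enhanced PPR setting. The following is a minimal sketch of that single step, assuming the setting line has the form matched by the script's regular expression. The function name `set_eppr` and the sample fragment are hypothetical and used only for illustration; a real configuration file contains many more settings.

```python
import re

# Pattern for the "Enhanced PPR" line, modelled on the regex used in eppr.py.
EPPR_PATTERN = re.compile(
    r'(<Setting name="Enhanced PPR" selectedOption=")([^"]*)(" type="Option">)'
)


def set_eppr(config_text, new_value):
    """Return (updated_text, old_value); old_value is None if the setting is absent."""
    match = EPPR_PATTERN.search(config_text)
    if match is None:
        # Leave the text untouched when the setting line is not present.
        return config_text, None
    updated = EPPR_PATTERN.sub(
        lambda m: m.group(1) + new_value + m.group(3), config_text, count=1
    )
    return updated, match.group(2)


# Hypothetical one-line fragment of a BIOS configuration file.
sample = '<Setting name="Enhanced PPR" selectedOption="Disable" type="Option">'
updated, old = set_eppr(sample, "Enable")
print(old)
print(updated)
```

Returning the previous value alongside the updated text lets a caller log what changed, or skip the push when the setting already has the requested value.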

Code Sample Details

#########################################################################
#   ePPR enable script
#   Filename: eppr.py
#   Script Version: 1.0
#########################################################################

#########################################################################
# Prerequisites

# 1. Successful execution of this script depends on smcsumtool.

# 2. If smcsumtool is not present on this CVM at
# /home/nutanix/foundation/lib/bin/smcsumtool/sum, run
# the script from a CVM that has the smcsumtool binary.

# 3. If smcsumtool is not present at that location on any CVM,
# copy the file foundation_bmc-1.0.xxxxxx-py2.7.egg from
# ~/foundation/lib/foundation-platforms/ into the ~/tmp directory,
# unzip it, and copy the smcsumtool directory into ~/foundation/lib/bin/.

# NOTE: The file name varies with each release.
# nutanix@cvm:~/tmp$ unzip foundation_bmc-1.0.1558398598-py2.7.egg
# nutanix@cvm:~/tmp$ cd ~/tmp/bmc_utils/data/external_lib/bin/
# nutanix@cvm:~/tmp$ cp -r smcsumtool ~/foundation/lib/bin/
#########################################################################


#########################################################################
# Synopsis

# This script fetches the BIOS configuration file from a node,
# changes the ePPR parameter from Disable to Enable (or vice versa),
# and pushes the updated configuration file back to the node, so that
# the new setting takes effect on the node's next reboot.
#########################################################################


#########################################################################
# Usage Instructions:

# 1. Copy this script to any one of the CVMs in the cluster at the path
# /home/nutanix/analysis/eppr.py (create the analysis directory under
# /home/nutanix if it does not already exist).

# 2. Make sure the script has execute permission:
# chmod +x eppr.py

# 3. Make sure the smcsumtool binary is present on the CVM
# at location: /home/nutanix/foundation/lib/bin/smcsumtool/sum
#########################################################################


#########################################################################
# Brief syntax usage:

# 1. For interactive mode:
# python eppr.py -i

# 2. To enable ePPR using a .csv or .txt file as input, with one node per
# line as comma-separated values IPMI_IP,USERNAME,PASSWORD:

# python eppr.py -e <.csv or .txt filename>

# 3. To disable ePPR using a .csv or .txt file as input, with one node per
# line as comma-separated values IPMI_IP,USERNAME,PASSWORD:

# python eppr.py -d <.csv or .txt filename>

# The input file must be placed in /home/nutanix/analysis/.
#########################################################################


#########################################################################
# Disclaimer                                                            #
# This code is intended as a standalone example. Subject to licensing   #
# restrictions defined on nutanix.dev, this can be downloaded, copied   #
# and/or modified in any way you see fit.                               #
# Please be aware that all public code samples provided by Nutanix are  #
# unofficial in nature, are provided as examples only, are unsupported  #
# and will need to be heavily scrutinized and potentially modified      #
# before they can be used in a production environment. All such code    #
# samples are provided on an as-is basis, and Nutanix expressly         #
# disclaims all warranties, express or implied.                         #
# All code samples are © Nutanix, Inc., and are provided as-is under    #
# the MIT license. (https://opensource.org/licenses/MIT)                #
#########################################################################

import argparse
import codecs
import csv
import getpass
import logging
import os
import re
import socket
import subprocess
import sys

#########################################################################
# Set Variables
#########################################################################
version = 1.0
smcsumtool_location = '/home/nutanix/foundation/lib/bin/smcsumtool/sum'
directory = '/home/nutanix/analysis/'

# List to store the IPMI IPs for which error was seen while getting
# the BIOS configuration file or while pushing the config file.
error_node = []

# Dictionary to store the supportability of ePPR before attempting
# to enable/disable ePPR and for storing ePPR enable/disable status after
# the attempt to toggle ePPR status.
status_eppr = {}
IPMI_PORT = 623
ipmi = []

# Minimum version of BIOS to support ePPR.
supported_bios = "42.300"

#########################################################################
####### NO NEED TO CHANGE ANYTHING BELOW HERE #######
#########################################################################
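# Function to run a shell command on the CVM and return its output,
# error text and return code.
def run_command(cmd):
    # Mask the value passed to smcsumtool's -p option so that IPMI
    # passwords are not written to the log.
    logging.info("Running Command: %s", re.sub(r'( -p )\S+', r'\1****', cmd))
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
        stderr=subprocess.PIPE, shell=True)
    (output, err) = p.communicate()
    output = str(output)
    err = str(err)
    rc = p.returncode
    return (output, err, rc)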



# Function to toggle ePPR by updating ePPR parameter in BIOS configuration file.
def toggle_eppr(ip, toggle, user='ADMIN', password='ADMIN'):
    cfg_file = directory + ip + '_bios.txt'
    setting = '<Setting name="Enhanced PPR" selectedOption="%s" type="Option">'

    with open(cfg_file, "r") as f:
        data = f.read()

    # Read the current value of the setting; report an error instead of
    # crashing if the setting is missing from the configuration file.
    match = re.search(setting % '([^"]*)', data)
    if match is None:
        logging.error("Enhanced PPR setting not found in %s", cfg_file)
        return 1
    param = match.group(1)

    logging.info("Old eppr setting is %s, and requested operation is %s",
        param, toggle)

    if toggle == 'Enable' and param == toggle:
        logging.info("ePPR is already " + toggle)
        return 0

    # Replace the current value of the setting with the requested one.
    with open(cfg_file, "w") as f:
        f.write(data.replace(setting % param, setting % toggle))

    logging.info("Attempting to " + toggle + " ePPR for %s", ip)
    (out, err, rc) = run_command(smcsumtool_location + ' -i ' + ip + ' -u ' +
        user + ' -p ' + password + ' -c ChangeBiosCfg --file ' + cfg_file)
    logging.info(out)
    return rc


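# Function to check that the IPMI port on the node is reachable.
def is_ipmi_open(ip):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(10)  # avoid hanging indefinitely on an unreachable IP
    try:
        s.connect((ip, IPMI_PORT))
        s.shutdown(socket.SHUT_RDWR)
        return True
    except socket.error:
        return False
    finally:
        s.close()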


# Function to fetch BIOS config file
def bios(ip, toggle, user='ADMIN', password='ADMIN'):
    # checking the network connectivity to IPMI IP
    if is_ipmi_open(ip):
        if not os.path.exists(directory + ip + '_bios.txt'):
            logging.info("Attempting to fetch BIOS config for %s.", ip)
            cmd = smcsumtool_location + ' -i ' + ip + ' -u ' + user + \
            ' -p ' + password + ' -c GetCurrentBiosCfg --file ' + directory + \
            ip + '_bios.txt'
            (output2,err2,rc) = run_command(cmd)
            logging.info("\n %s \n", output2)

            if rc != 0:
                logging.error("Error seen while fetching BIOS config for %s", ip)
                status_eppr[ip]['eppr'] = "Not " + toggle + "d, Error while Fetching config"
                error_node.append(ip)
                return 1

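        logging.info("Checking the bios version from %s", directory + ip + '_bios.txt')
        cmd = 'grep -i "bios version" ' + directory + ip + '_bios.txt'
        (out, err, rc) = run_command(cmd)
        version_match = re.search(r'.*.Version\(\w\w(.*.)\)', out)
        if version_match is None:
            logging.error("Could not read the BIOS version for %s", ip)
            status_eppr[ip]['eppr'] = "Not " + toggle + "d, BIOS version not found"
            if ip not in error_node:
                error_node.append(ip)
            return 0
        bios_version = version_match.group(1)

        # Compare the numeric fields of the versions; a plain string
        # comparison would, for example, rank "42.1000" below "42.300".
        if ([int(n) for n in re.findall(r'\d+', bios_version)] <
                [int(n) for n in re.findall(r'\d+', supported_bios)]):
            status_eppr[ip]["eppr"] = "Not " + toggle + "d, Unsupported BIOS-" + bios_version
            logging.info("BIOS version %s on %s is older than the minimum ePPR-supported version",
                         bios_version, ip)
            return 0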

        else:
            rc = toggle_eppr(ip, toggle, user, password)

            if rc == 0:
                logging.info("Successfully scheduled ePPR " + toggle + " on " + ip)
                status_eppr[ip]['eppr'] = "Successfully " + toggle + "d, BIOS-" + bios_version
                return 0
            else:
                logging.error("Error seen during " + toggle +" of ePPR for node %s", ip)
                status_eppr[ip]['eppr'] = "Not " + toggle + "d, Error During " + toggle
                if ip not in error_node:
                    error_node.append(ip)
                return 0

    else:
        logging.error("Problem in Network connectivity to %s", ip)
        status_eppr[ip]['eppr'] = "Not " + toggle + "d, Error in Network Connectivity"
        if ip not in error_node:
            error_node.append(ip)
        return 0



def file_choices(choices,fname):
    parser = argparse.ArgumentParser()
    if not os.path.exists(directory + fname):
        logging.error("File path does not exist")
        parser.error("File {} does not exist.".format(fname))
    ext = os.path.splitext(fname)[1][1:]
    if ext not in choices:
        parser.error("file doesn't end with one of {}".format(choices))
    return fname



def delete_old_config(sup_ipmi):
    for ip in sup_ipmi:
        if os.path.exists(directory + ip + '_bios.txt'):
            logging.info("Deleting the older existing file %s", \
                directory + ip + '_bios.txt')
            os.remove(directory + ip + '_bios.txt')


def main():
    if not os.path.exists(directory):
        # Logging is not configured yet, so report on stderr and exit non-zero.
        sys.exit("Directory %s not found, please create it before proceeding." % directory)

    logging.basicConfig(filename='eppr-enable.log', level=logging.INFO, \
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

    #redirect logs to console for troubleshooting
    root = logging.getLogger()
    root.setLevel(logging.INFO)
    ch = logging.StreamHandler(sys.stdout)
    ch.setLevel(logging.INFO)
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    ch.setFormatter(formatter)
    root.addHandler(ch)

    parser = argparse.ArgumentParser(version="%s" %version)
    group = parser.add_mutually_exclusive_group()

    group.add_argument('-e', metavar='<.csv/.txt filename>', type=str,
    help="For enabling ePPR, enter the .txt or .csv filename which has comma "\
    "separated values of IPMI_IP,USERNAME,PASSWORD")

    group.add_argument('-d', metavar='<.csv/.txt filename>', type=str,
    help="For Disabling ePPR, enter the .txt or .csv filename which has comma "\
    "separated values of IPMI_IP,USERNAME,PASSWORD")

    group.add_argument('-i', help = "Interactive Mode to enable or disable ePPR.",
                        action="store_true")

    args = parser.parse_args()

    if not (args.e or args.i or args.d):
        parser.error("No action requested, add -e or -i or -d, "\
        "use -h to see available options.")

    if args.e:
        file_choices(("csv","txt"),args.e)

    elif args.d:
        file_choices(("csv","txt"),args.d)


    logging.info("This script will Enable/Disable ePPR feature for G6, G7 nodes " \
        "which are on BIOS version 42.300 and above.\n")

    sup_ipmi = []       # The list of IPMI IPs for G6, G7 nodes.
    unknown_ip = []     # The list of unknown IPMI IPs entered.
    file_input = []     # 2D List that holds the list corresponding to each line in CSV file.
    required_ip = []    # The list that contains IPMI IPs which were queried \
                        # for BIOS cfg during execution.

    command = 'ncli host ls'
    (out, err, rc) = run_command(command)

    if rc == 0:
        ipmi = re.findall(r'IPMI Address.*.:.(.*.)\n.*.', out)
        cvm = re.findall(r'Controller VM Address.*.:.(.*.)\n.*.', out)
        host = re.findall(r'Hypervisor Address.*.:.(.*.)\n.*.', out)
        model = re.findall(r'Block Serial.*.\((.*.)\)\n', out)

        columns = ["IPMI IP", "CVM IP", "HOST IP", "Hardware Model", \
                   "Supported Hardware", "ePPR Status"]

        for i in range(len(ipmi)):
            if ('G6' in model[i]) or ('G7' in model[i]):
                status_eppr[ipmi[i]] = {'model': model[i], 'cvm': cvm[i], 'host':host[i], \
                'eppr': 'Supported Node Model', 'supported': 'Yes'}
            else:
                status_eppr[ipmi[i]] = {'model': model[i], 'cvm': cvm[i], 'host':host[i], \
                'eppr': 'Not Enabled, Unsupported Node Model', 'supported': 'No'}


        # Printing the initial Supported status for ePPR for each node.
        st = '\n' + '              '.join(columns) + '\n'
        for i in status_eppr:
            st = st + i + '          ' + status_eppr[i]['cvm'] + '         ' + \
            status_eppr[i]['host'] + '           ' + status_eppr[i]['model'] + \
            '                  ' + status_eppr[i]['supported'] + \
            '                      ' + status_eppr[i]['eppr'] + '\n'

            if status_eppr[i]['eppr'] == 'Supported Node Model':
                sup_ipmi.append(i)

        logging.info(st)

        if len(sup_ipmi) != 0:
            if os.path.exists(smcsumtool_location):
                logging.info("ePPR supported IPMI ips are %s", ",".join(sup_ipmi))

                if args.e or args.d:
                    logging.info("Correct file format found.")
                    delete_old_config(sup_ipmi)
                    input_file = args.d if args.d else args.e

                    with codecs.open(directory + input_file, 'r', encoding='utf-8-sig') as file:
                        reader = csv.reader(file, skipinitialspace = True, delimiter=',', \
                            quoting=csv.QUOTE_ALL)
                        for row in reader:
                            if not row:
                                continue
                            file_input.append(row)

                    if not file_input:
                        logging.error("Input file %s is empty.", input_file)
                        sys.exit(1)

                    operation = "Disable" if args.d else "Enable"

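                    for line in file_input:
                        if len(line) < 3:
                            logging.error("Skipping line for %s, expected "\
                                          "IPMI_IP,USERNAME,PASSWORD", line[0])
                            continue
                        IP = line[0].strip()
                        user = line[1].strip()
                        pwd = line[2].strip()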
                        logging.info("Passed parameters are %s, %s", IP, user)
                        if IP in ipmi:
                            bios(IP, operation, user, pwd)
                            if IP not in required_ip:
                                required_ip.append(IP)
                        else:
                            logging.error("Unknown IPMI IP %s found, continuing with next IP",\
                                         IP)
                            unknown_ip.append(IP)

                    if unknown_ip:
                        logging.error("In the given input file following "\
                                      "incorrect IPMI IPs were present %s",
                         ','.join(unknown_ip))

                elif args.i:

                    while True :
                        print "\nSelect one of the following options:\n", \
                        "1. Enable ePPR on all Nodes, IPMI IPs will be fetched from " \
                        "cluster and default username/password ADMIN/ADMIN will be used.\n", \
                        "2. Enable ePPR by manually giving IPMI IPs and their "\
                        "username/password for selective nodes.\n", \
                        "3. Configure System to skip ePPR in next boot by giving IPMI IP "\
                        "username/password for selective nodes.\n", \
                        "4. Quit\n"

                        choice = raw_input("\nPlease Enter your choice.\n")

                        if (choice == '1') or (choice == '2') or (choice == '3'):
                            delete_old_config(sup_ipmi)

                        if choice == '1':
                            logging.info("Executing choice %s", choice)

                            while True :
                                user_pwd = raw_input("Are the username and password "\
                                                     "the same for all the nodes? "\
                                                     "Enter Yes or No\n").lower()
                                if user_pwd == 'no':
                                    logging.info("Shifting to option 2")
                                    logging.info("Executing choice 2 instead")
                                    choice = '2'
                                    break

                                elif user_pwd == 'yes':
                                    for ip in sup_ipmi:
                                        bios(ip, "Enable")
                                        if ip not in required_ip:
                                            required_ip.append(ip)
                                    break

                                else:
                                    print "Enter the correct input\n"

                        if choice == '1':
                            break

                        if choice == '2' or choice == '3':
                            logging.info("Executing choice %s.", choice)
                            logging.info("Enter the individual IPMI IP "\
                                         "and their username/password.")
                            more = 'None'
                            retry = 'None'

                            while (more != 'no') or (retry != 'no'):
                                if retry != 'yes':
                                    while True:
                                        ip = raw_input("Enter the IPMI ip address\n")
                                        if ip in sup_ipmi:
                                            break
                                        else:
                                            logging.info("This IPMI ip %s is not "\
                                            "in ePPR supported node list, try again.", ip)

                                user = raw_input("Enter the IPMI username\n")
                                password = getpass.getpass(prompt = "Enter the IPMI password\n")
                                if choice == '2':
                                    ret = bios(ip, "Enable", user, password)
                                    if ip not in required_ip:
                                        required_ip.append(ip)

                                if choice == '3':
                                    ret = bios(ip, "Disable", user, password)
                                    if ip not in required_ip:
                                        required_ip.append(ip)

                                if ret:
                                    logging.error("If the credentials were wrong "\
                                                      "in previous attempt, try again.")
                                    while True:
                                        retry = raw_input("Do you want to try again for " \
                                                         + ip + "? Yes or No\n").lower()
                                        if retry == 'yes':
                                            if ip in error_node:
                                                error_node.remove(ip)
                                        if (retry == 'no') or (retry == 'yes'):
                                            break
                                        else:
                                            logging.error("Incorrect input, enter yes or no.")

                                if (ret == 0) or (retry == 'no'):
                                    if ret == 0:
                                        if ip in error_node:
                                            logging.info("Removing %s from "\
                                                         "error_node list", ip)
                                            error_node.remove(ip)
                                        retry = 'no'
                                    while True:
                                        logging.info("Try to proceed for other nodes.")
                                        more = raw_input("Would you like to proceed "\
                                                    "with this operation for another node? " \
                                            "Enter Yes or No\n").lower()
                                        if (more == 'no') or (more == 'yes'):
                                            break
                                        else:
                                            logging.info("Incorrect input, enter yes or no.")
                            break

                        if choice == '4':
                            break

                        else:
                            logging.info("Enter the choice from list.")

                logging.info(status_eppr)

                #printing final status of ePPR after attempting to toggle ePPR for each node.

                ast = "\n" + '              '.join(columns) + '\n'
                for i in required_ip:
                    ast = ast + i + '          ' + status_eppr[i]['cvm'] +  '         ' + \
                    status_eppr[i]['host'] + '           ' + status_eppr[i]['model'] + \
                    '                  ' + status_eppr[i]['supported'] + \
                    '                      ' + status_eppr[i]['eppr'] + '\n'

                logging.info(ast)

                if error_node:
                    logging.error("Error was seen while fetching or pushing the BIOS " \
                    "configuration for the following IPMI IPs %s, please check the "\
                    "error message in eppr-enable.log for " \
                    "them and try again.", ','.join(error_node))

                if required_ip:
                    logging.info("Nodes on which ePPR was enabled "\
                    "successfully require a reboot for the ePPR feature to fix DIMM errors." )

                else:
                    logging.info("None of the nodes had a change in ePPR setting.")

            else:
                logging.error("smcsumtool does not exist at location %s on this CVM, "\
                "try running the script from another CVM where the smcsumtool binary is "\
                "present at %s. The smcsumtool directory location depends on the "\
                "foundation version, so check KB-8889 to find the actual location of "\
                "smcsumtool on the CVM and copy the complete smcsumtool directory to: "\
                "/home/nutanix/foundation/lib/bin/.", \
                smcsumtool_location, smcsumtool_location)

        else:
            logging.error("The node models are not G6 or G7, " \
                "ePPR feature is not supported on this node model so this script " \
                "is not required.\n")
    else:
        logging.error("Error %s seen while executing command %s on this CVM, " \
            "terminating the script. Please resolve the issue and try again.", \
            out, command)



if __name__ == "__main__":
    main()
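
The -e and -d options expect one node per line in the form IPMI_IP,USERNAME,PASSWORD. As a hedged illustration of that format, the sketch below parses such text and skips blank or incomplete lines before any node is contacted. The function name `parse_nodes`, the addresses, and the credentials are all hypothetical and are not part of eppr.py.

```python
import csv


def parse_nodes(text):
    """Parse IPMI_IP,USERNAME,PASSWORD rows, skipping blank or short lines."""
    nodes = []
    for row in csv.reader(text.splitlines(), skipinitialspace=True):
        fields = [field.strip() for field in row]
        if len(fields) < 3 or not fields[0]:
            continue  # blank line, or a line missing one of the three columns
        nodes.append((fields[0], fields[1], fields[2]))
    return nodes


# Hypothetical addresses and credentials, for illustration only.
sample = "10.0.0.11, ADMIN, ADMIN\n\n10.0.0.12,ADMIN,secret\n10.0.0.13,ADMIN\n"
for ip, user, _password in parse_nodes(sample):
    # The password is deliberately not printed.
    print(ip, user)
```

Validating every row up front means a typo in the input file is reported once, rather than surfacing as a failure partway through a run across many nodes.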