Detailed View

Availability data comes from the SPLUNK System as this was deemed more accurate than obtaining network outages from the current Dashboard.

Login to the production SPLUNK instance (https://62.40.104.207:8000/ https://splunk-fra-sh.geant.net:8000) with your windows credentials.

** RS to check with Evangelos that team can access SPLUNK server 

In the new search edit box, please use the following:

`BGP_Total_Downtime_Percent_v3(*,*,31)`

** Pay attention to the value of the number of days in the month we are processing and the use of the ` character before the B of BGP and after the ) character

Click the Previous Month drop down and select Date Range. Ensure that the to and from date is correct then click the Apply button

Now click the Magnifying Glass icon to start the search. The results appear in the Statistics tab, shown in the image below.


Using your mouse, hightlight the rows of the results and copy them to the clipboard. You will only be able to do a page at a time (each page can by up to 100 rows).

Paste your results in an Excel Spreadsheet.

Use this Excel command to create some SQL statements:

=CONCATENATE("INSERT INTO SPLUNK_DATA (year,month,ip_address,no_of_failures,outage_time,availability,interface_description) VALUES (2017,7,'",A1,"',",B1,",'",TEXT(C1,"hh:mm:ss"),"',",D1/100,",'",E1,"');")

** Pay attention to the year and month values in the Excel concatenate command

** WARNING Excel cannot handle hour values that are greater than 23, so when you paste the results into Excel, in the column that contains the Total_Outage data, you must add a ` character in front of the first digit for all hour values greater than 23.

Copy the column that has the SQL statements and paste them into your favourite MySQL tool.

** WARNING Excel cannot handle hour values that are greater than 23, so when you paste the results into your favourite MySQL tool, you will have to get rid of the ` character in front of the first digit for all hour values greater than 23.

Run the amended queries.

in Eclipse run the "Application.java" in the msrgenerator application and then go to the browser "http://localhost:8080/splunk/2018/5/" note the year and month values (these are changeable) . The browser should display "Number of records : XYZNumber"

Once the above step is completed, do the steps below.

** There is a table within the reports database called SPLUNK_LINK, this links the IP Address of the results to the Circuit Name within OpsDB. This table could get out of date as Operations add more circuits.

Run the following SQL command in the reports database (Heidi-SQL) on test-msr.geant.net:

select concat('UPDATE report_service_availability SET outage="',SPLUNK_DATA.outage_time,'", number_of_failures=',SPLUNK_DATA.no_of_failures,', avail=',SPLUNK_DATA.availability,' WHERE base_absid=',opsdb_circuit.circuit_absid,' AND year=',SPLUNK_DATA.year,' AND month=',SPLUNK_DATA.month,';')
FROM SPLUNK_DATA LEFT JOIN SPLUNK_LINK ON (SPLUNK_LINK.ip_address = SPLUNK_DATA.ip_address)
LEFT JOIN opsdb_circuit ON (opsdb_circuit.name = SPLUNK_LINK.circuit_name)
where SPLUNK_DATA.year = 2017
and SPLUNK_DATA.month = 7
and opsdb_circuit.circuit_usage = 'Access'
and opsdb_circuit.status = 'Operational';

** Pay attention to the year and month values in the SQL statement

Export the data using Heidi-SQL using the Ouput format as Delimited Text. Paste the results back into Heidi-SQL and run the queries.

Check your results by using the link: http://test-msr.geant.net:8888/msr/ms_ip_avail_new.jsp

=========================================================================================================================================

Between these 2 lines is optional, depending on Emma

This is for manual updates if Emma asks to change certain values

** WARNING - Sometimes it might be necessary to manually update the data. Run the following two SQL statements to correct any of the data.

UPDATE reports.report_service_availability SET
outage = '13:15:53',
number_of_failures = 27
WHERE year = 2017
AND month = 7
AND base_absid = 31467;

** Pay attention to the circuit absidyear and month values in the SQL statement

** The base_absid is the absid of the circuit we want to change.  You can verify the circuit information by checking the OpsDB Application using the following link: http://opsdb.dante.net/circuit-display-basic.phtml?absid=31467  alternatively you could use the following query:

SELECT * FROM reports.opsdb_circuit

WHERE cirtcuit_absid = 31467;

=====================================================================================================================================

After you have run the previous update SQL statement, then you will need to run the following SQL statement to calculate the availability, MTTR, MTBF of the service.

** WARNING in order to obtain accurate availability, availability during maintenance, mean-time-to-repair (MTTR), mean-time-between-failures (MTBF), we need to use the correct number of seconds for the particular month.

28 Days = 2419200 seconds

29 Days = 2505600 seconds

30 Days = 2592000 seconds

31 Days = 2678400 seconds

UPDATE reports.report_service_availability SET

MTTR = TIME_FORMAT(SEC_TO_TIME(time_to_sec(outage)/number_of_failures),'%H:%i:%s'),
MTBF = TIME_FORMAT(SEC_TO_TIME((2678400 - time_to_sec(outage))/number_of_failures),'%H:%i:%s'),
avail = (2678400 - time_to_sec(outage))/2678400 ,
avail_maint = (2678400 - time_to_sec(out_maint))/2678400 
WHERE year = 2017
AND month = 7
AND number_of_failures > 0;

** Pay attention to the year and month and seconds values in the SQL statement

Overall View

You will notice that some NRENS have one connection to GEANT such as: 

AMRES, ASNET-NM, AzScienceNet,BASNET,MREN,RASH,SANET and UoM 

We now manually insert the data into the reports.report_composed_service_availability table.

An example of such a query would be: (Heidi-SQL)

SELECT CONCAT('UPDATE reports.report_composed_service_availability SET outage = \'',a.outage,'\', number_of_failures = ',a.number_of_failures,' WHERE year = 2017 AND month = 7 AND name = \'',SUBSTRING_INDEX(b.name,' ',1),'\';') as 'SQL Commands'
FROM report_service_availability a
LEFT JOIN reports.opsdb_circuit b ON (b.circuit_absid = a.base_absid)
WHERE SUBSTRING_INDEX(b.name,' ',1) in ('AMRES', 'ASNET-AM', 'AzScienceNet', 'BASNET', 'CAREN', 'CERN', 'MREN', 'RASH', 'ROEDUNET', 'SANET', 'UoM')
AND b.name NOT LIKE '%IAS'
AND a.year = 2017
AND a.month = 7
AND a.number_of_failures > 0;

** Pay attention to the year and month values in the SQL statement

Export the data using Heidi-SQL using the Ouput format as Delimited Text. Paste the results back into Heidi-SQL and run the queries

We can ascertain the single homed NRENs by ignoring IAS circuits and seeing if there are any with only one AP circuit - we should be able to do this in the SQL

===============================================================================================================================================

Between these 2 lines is optional, depending on Emma

NRENS that have dual connection that have had failures in both connections at the same time (determined by reading the issues page: http://test-msr.geant.net:8888/msr/mo_issues_new.jsp , look at the impact column for statements referring to total service failure or AP1 and AP2 down) , will require the manual insertion of data into the reports.report_composed_service_availability table.

An example of such a query would be:

UPDATE reports.report_composed_service_availability SET
outage = '13:15:53',
number_of_failures = 27
WHERE year = 2017
AND month = 7
AND name = 'CYNET';

** Pay attention to the year and month values in the SQL statement

NOW: RUN THE FOLLOWING ONE SQL QUERY AGAIN.

=====================================================================================================================================================

** WARNING in order to obtain accurate availability, availability during maintenance, mean-time-to-repair (MTTR), mean-time-between-failures (MTBF), we need to use the correct number of seconds for the particular month.

28 Days = 2419200 seconds

29 Days = 2505600 seconds

30 Days = 2592000 seconds

31 Days = 2678400 seconds


** IMPROVEMENT - You could use the following (day(last_day(CURDATE() - INTERVAL 1 MONTH))*86400) within the SQL statement then you would not need to change the number of seconds in the month as it would be calculated for you in the query.

Now run the following example query after you have run the above query:

UPDATE reports.report_composed_service_availability SET

MTTR = TIME_FORMAT(SEC_TO_TIME(time_to_sec(outage)/number_of_failures),'%H:%i:%s'),
MTBF = TIME_FORMAT(SEC_TO_TIME((2678400 - time_to_sec(outage))/number_of_failures),'%H:%i:%s'),
avail = (2678400 - time_to_sec(outage))/2678400 ,
avail_maint = (2678400 - time_to_sec(out_maint))/2678400
WHERE year = 2017
AND month = 7
AND number_of_failures > 0;

** Pay attention to the year and month and seconds values in the SQL statement


Now run the following query after you have run the above query:

update report_composed_service_availability set MTBF = '00:00:00', MTTR = '00:00:00'

where MTBF is NULL and MTTR is null
and year = 2017 and month = 7;

** Pay attention to the year and month values in the SQL statement


Check your results by using the link: http://test-msr.geant.net:8888/msr/ms_ip_avail_over_new.jsp

===========================================================================================================================

This is a fix, only needed twice this far

** WARNING sometimes this page does not display properly, and may only show one or two lines of data. The reason for this is the query behind the scenes uses a left join to a table called reports.msr_sla. If a new circuit has been created, a row will be missing from the reports.msr_sla table. To fix this, run the query below and note the name of the circuit that has a corresponding null value in the sla column. Then, insert a corresponding row into the reports.msr_sla and check the results.

SELECT report_composed_service_availability.name, report_composed_service_availability.avail, report_composed_service_availability.MTTR, msr_sla.sla
FROM reports.report_composed_service_availability LEFT JOIN reports.msr_sla ON (msr_sla.name = report_composed_service_availability.name)
WHERE (report_composed_service_availability.project='GEANT2')
AND (report_composed_service_availability.circuit_usage='Access')
AND (report_composed_service_availability.year=2017)
AND (report_composed_service_availability.month=7)
AND (report_composed_service_availability.custom_type='CCT')
ORDER BY name;

** Pay attention to the year and month values in the SQL statement

============================================================================================================================

Summary View

The Java Server Page ms_ip_summ_avail_new.jsp (found at /var/lib/tomcat6/webapps/msr on the server) uses the following query to generate the data. The following is for information only:

SELECT opsdb_circuit.status, opsdb_circuit.date_production, opsdb_circuit.date_terminated, opsdb_circuit.name, excel_service_availability.avail
FROM reports.excel_service_availability, reports.opsdb_circuit
WHERE excel_service_availability.base_absid = opsdb_circuit.circuit_absid
AND (excel_service_availability.base_type='CCT')
AND (opsdb_circuit.project='GEANT2')
AND (opsdb_circuit.circuit_usage='Access')
AND (excel_service_availability.year=2017)
AND (excel_service_availability.month=7)
AND ((opsdb_circuit.status='Operational')
OR (date_format(str_to_date(concat(year(opsdb_circuit.date_terminated),',',month(opsdb_circuit.date_terminated),',1'),'%Y,%m,%d'),'%Y%m')<='201707'))
ORDER BY excel_service_availability.avail desc;

** Pay attention to the year and month values in the SQL statement

Check your results by using the link: http://test-msr.geant.net:8888/msr/ms_ip_summ_avail_new.jsp